Unpacking Malware Requires Searching for Zero Padding

Recently we experimented with our generic unpacking heuristics. Our goal was to unpack a potentially malicious binary and dump the executable from memory to a file. During our experiments we came across a few unknown packers whose binaries we unpacked successfully, yet the file we dumped from memory was missing some code that had been present at unpacking time. We investigated the problem and found that some packers unpack the code into zero padded space (appended in memory by the loader according to section alignment). Before moving on, let’s first discuss how the Windows loader maps an executable in memory.

Below are the simplified steps the loader follows to map a binary in memory:

  • Read first page of the file, which includes DOS header, PE header, section headers, etc.
  • Fetch the Image Base address from the PE header and determine whether that address is available; otherwise allocate another area (relocation)
  • Allocate the space equal to SizeOfImage
  • Map the sections into the allocated area
  • Read information from the import table and load the DLLs
  • Resolve the function addresses and create Import Address Table (IAT)
  • Create the initial heap and stack using values from the PE header
  • Create the main thread and start the process

The loader starts by reading the headers and allocating a block of memory equal in size to SizeOfImage. This allocation is linear. Then, based on the virtual address and virtual size of each section, the loader maps the sections into the allocated memory. The virtual size is the actual size of the section, but because each section must start on a section alignment boundary, a section can be followed in memory by additional zero padding. Strictly speaking, the loader does not add this zero padding; it comes for free with the SizeOfImage allocation. The loader just reads the headers and lays out the sections accordingly in the allocated area. For simplicity, however, let’s just say that the zero padding is added by the loader.
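The padding arithmetic can be sketched in a few lines of Python. This is only an illustration: the helper names `align_up` and `zero_padding` and the sample values are assumptions made for this example, and it assumes sections are laid out back to back on alignment boundaries.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def zero_padding(virtual_size: int, section_alignment: int) -> int:
    """Bytes of zero padding that follow a section's data in memory.

    Assumes the next section starts at the next alignment boundary.
    """
    return align_up(virtual_size, section_alignment) - virtual_size


# Illustrative values only (not taken from a real sample):
# a section with 0x1234 bytes of data and a 0x1000 section alignment.
print(hex(zero_padding(0x1234, 0x1000)))  # 0xdcc
```

A section whose virtual size is already a multiple of the alignment has no padding at all in this model.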

The size of the zero padding depends on the virtual address and virtual size of each section relative to the section alignment. Under normal circumstances an application executes code from the code section, but a packed application usually executes code from several sections: the unpacking stub may be in one section and the unpacked code in another. In most cases the executed code lies within the virtual-size boundary of a section. In some cases, however, code is executed from the zero padded space, as we saw in our experiments. Because the zero padded space is not part of any section, it is not recorded in the headers. Normally, when we dump a process to a file, we read the headers and dump the process accordingly (as ollydump does, for example). In this case, though, we miss the code that was unpacked into the zero padded space.
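A small simulation shows why a header-driven dump loses this code. It is a sketch under stated assumptions: the section table, `SIZE_OF_IMAGE`, the byte values, and the `dump_by_headers` helper are all hypothetical, and offsets are relative to the image base.

```python
# Hypothetical section table: (name, virtual_address, virtual_size).
SECTIONS = [
    (".text", 0x1000, 0x0800),
    (".data", 0x3000, 0x0400),
]
SIZE_OF_IMAGE = 0x4000


def dump_by_headers(memory: bytes) -> dict:
    """Copy only the bytes that each section header describes."""
    return {name: memory[va:va + size] for name, va, size in SECTIONS}


# Simulated process image, zero-filled like a fresh allocation.
memory = bytearray(SIZE_OF_IMAGE)
memory[0x1000:0x1004] = b"\x55\x8b\xec\x90"  # bytes inside .text
memory[0x1900:0x1904] = b"\xde\xad\xbe\xef"  # bytes written into the padding after .text

dumped = dump_by_headers(bytes(memory))
captured = any(b"\xde\xad\xbe\xef" in data for data in dumped.values())
print(captured)  # False
```

The marker bytes at offset 0x1900 sit past the end of `.text` (0x1000 + 0x0800) but before `.data`, so no header covers them and they never reach the dump.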

In some cases the code generated in the zero padded space was highly sensitive (for example, it injected code into another process and created a new thread), so it is well worth capturing this code in the dumped file. We solved the problem with the following steps.

  • We check whether the code currently executing lies in zero padded space
  • If it does, we look in the section table for the virtual address of the previous section (the one just before the current address) and the virtual address of the next section

Example:

Virtual address of first section      Virtual address of second section
0x00401000                            0x00403000

  • After that we identify which section’s zero padding is currently being executed
  • Then we calculate the new size of that section by subtracting the virtual address of that section from the virtual address of the next section

Example: 0x00403000 - 0x00401000 = 0x2000

  • After that we update the section header of the section with the new size and dump the process into a file
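The steps above can be sketched as follows. This is a minimal illustration rather than our actual implementation: the `Section` class, both function names, the section names, and the first section's original virtual size are assumptions, while the two virtual addresses come from the example. The sketch expects the sections sorted by virtual address and does not handle padding after the last section.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Section:
    name: str
    virtual_address: int
    virtual_size: int


def find_padding_owner(sections: List[Section], address: int) -> Optional[int]:
    """Return the index of the section whose zero padding holds address."""
    for i in range(len(sections) - 1):
        end_of_data = sections[i].virtual_address + sections[i].virtual_size
        next_start = sections[i + 1].virtual_address
        if end_of_data <= address < next_start:
            return i
    return None


def extend_over_padding(sections: List[Section], address: int) -> bool:
    """Grow the owning section so that it also covers its zero padding."""
    idx = find_padding_owner(sections, address)
    if idx is None:
        return False  # address is inside section data or outside the table
    current, following = sections[idx], sections[idx + 1]
    current.virtual_size = following.virtual_address - current.virtual_address
    return True


# Virtual addresses from the example; names and sizes are hypothetical.
sections = [
    Section("first", 0x00401000, 0x0800),
    Section("second", 0x00403000, 0x0400),
]
changed = extend_over_padding(sections, 0x00401900)
print(changed, hex(sections[0].virtual_size))  # True 0x2000
```

Once the in-memory header carries the enlarged size, a header-driven dump picks up the code in the padding along with the rest of the section.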

Using the preceding steps, we extracted all of the hidden code from the binary. In some cases this is very useful for the detection and static analysis of an executable. In some samples we also noticed significant variations in section alignment, such as 2,000 or 4,000, with very little data in the sections. This was probably done just to leave enough room for the unpacked code.