Mar 30 2017

Free Nintendo Switch emulators are fake

Fake emulators for newly released Nintendo console used as bait to get users to fill out survey scams and download potentially unwanted applications.

Mar 30 2017

Black box discovery of memory corruption RCE on


Robust evidence existed for the presence of a memory corruption based RCE (remote code execution) on servers. The most likely explanation for the evidence presented is the usage of an old ImageMagick which has known vulnerabilities, combined with lack of configuration lockdown. It's hard to be sure, though: see the section on the Box response below.

This blog post explores a different angle to vulnerability research from my normal work. We'll look at how to try and reliably determine the presence or absence of a known server-side memory corruption vulnerability using strictly black box techniques. Black box testing can be fun, because it's very scientific: come up with a hypothesis, devise an experiment, see if the results make sense.

Side notes: security and response

I had a sufficiently unusual experience talking to Box that it's worth a few dedicated notes. I found the experience good in place and awful in others.

The good: the three issues I reported all appeared to get fixed reasonably quickly, including one that was not a simple fix.

The awful: communications were painful, as if they were filtered through a gaggle of PR representatives and an encumbrance of lawyers. The current status is that I believe the issues are fixed -- again via black box testing, but no-one has really confirmed the issues existed in the first place. All I have is some language that says everything is fine and that the security posture was improved, without saying if my reports were accurate or rejected. Being slippery in researcher communications is not the way to build trust in your security program. I also note that Box is behind vs. its competitors due to the lack of a bug bounty program. To avoid a train wreck, open and honest researcher communications need to be addressed prior to the launch of such a program.

It's worth noting that I had an interaction with DropBox for similar reasons, at about the same time (separate blog post pending). The experience could not have been more different! DropBox were friendly, competent and forthcoming. Maybe that's where you want to store your files instead of Box, if security is a priority.

An inaccurate fingerprint

The target of this investigation is image thumbnailing. Online file storage services like Box, DropBox, Google Drive etc. typically support displaying of image thumbnails in the file list and also a preview window. In order to fingerprint the software used, we need to upload a diverse set of files and see if we find any behavioral clues.

In order to "trick" the thumbnailing process into revealing its full capabilities, we use a simple trick of taking unusual file types and renaming them to .png. Often, the thumbnailing process only cares that it thinks it sees a known file extension and then it will stuff the bytes into some process that ignores the file extension and uses header sniffing to work out what to really do.

Very quickly, I found that Box will thumbnail tons of weird and wacky formats. Including the following, and bonus points if you've heard of any: CIN, RLE, MAT, PICT. The list would have probably gone on but when you see a list like that it usually means one of two things.... ImageMagick or GraphicsMagick.

I noticed that a particular CIN file from the GraphicsMagick test set (input_rgb.cin) rendered very differently in my local installs of GraphicsMagick and ImageMagick. In this image below are renderings of the same CIN input file by GraphicsMagick-1.3.23 and ImageMagick-6.8.9. As you can see, GraphicsMagick renders with more extreme contrast.

Unfortunately, I inaccurately decided Box was using GraphicsMagick because the Box thumbnail looked more like the image on the left. This was a very lax determination because there's a lot of color variation depending on the software versions and also the tool and options used to produce or view the thumbnail. We'll correct the mistake later on in this post.

The vulnerability

Under the false belief that GraphicsMagick was in use, I had a look at recently fixed vulnerabilities in GraphicsMagick. The v1.3.24 release notes looked promising (May 2016), because they reference a lot of memory corruption fixes. In addition, my Ubuntu 16.04 LTS has v1.3.23, making testing easy. After a bit of consideration, this GraphicsMagick patch represented a good candidate for exploration. It's a fairly straightforward buffer overflow in the RLE decoder, with good control.

The overflow occurs because of a missed validation on the number of planes (where RGB would be 3 planes, for example) in the canvas allocation vs. the plane number requested to be written in the RLE decode protocol:

    rle_pixels=MagickAllocateArray(unsigned char *,number_pixels,
        case SetColorOp:
          // plane is attacker controlled.
          plane=(unsigned char) operand;
        case RunDataOp:
          // x, y and plane are attacker controlled. plane is not validated.
          for (i=0; i < (unsigned int) operand; i++)
            if ((y < image->rows) && ((x+i) < image->columns))

Side note: this is a really unusual image format, the Utah Raster Toolkit RLE (run length encoded) format. Here is a link to the home page, complete with 1990's look. And yes, I did download the original toolkit, apply both patches, and quickly read it for the presence of the same vulnerabilities. It does appear to have them. This software was written in the _1980_'s, so not really reasonable to expect otherwise. Since I'm not sure I ever found a security bug still live from the 80's, might as well assign: CESA-2017-0001.

The oracle

In security, an oracle is simply a measurable result that gives the attacker some useful information. For our oracle, we're simply going to use this: "did the uploaded image thumbnail successfully or not".

There are of course multiple reasons why an image thumbnail might fail. The image might simply not pass a validity check, or the image might cause a SEGV in some backend. Ideally, a backend SEGV would be indicated back to us via a different oracle (perhaps a 50x error code from the HTTP request for the thumbnail). Unfortunately, testing eventually showed that a failed thumbnail, for any reason, manifests to us as:

  • Multiple requests to over a period of time, leading to a long pause in the UX (throbbing icon) even though this is a "fail fast" case.
  • HTTP response codes of 200 OK.

Starting to construct a proof of vulnerability

To show we're on the right track, we're going to try and build an input file that will thumbnail successfully with the vulnerable code, and fail if the fix is in place. Ideally, the failure will be because of a deterministic check and not a memory corruption failure (which is not particularly deterministic from our vantage point). Let's look at the fixed code:

          for (i=0; i < (unsigned int) operand; i++)
            if ((p >= rle_pixels) && (p < rle_pixels+rle_bytes))

As it turns out, there is a subtle difference here that we can use. The vulnerable code lets the p pointer go out of bounds, but stops writing if x or y go out of bounds. The fixed code bails immediately if p goes out of bounds. So we can simply make an image that has a small number of pixels (say, 16x1), and then request a large RLE pixel run (say, 0xff). The vulnerable code will accept this, and clamp the out of bounds values. This is not a case where the vulnerable code will corrupt memory, but it is a case where the vulnerable code will exhibit different behavior. The fixed code will reject this case.

File: gm_intra_oflow_1_3_23.rle
Notes: as a bonus, this file also uses an out-of-bounds plane value. The value is chosen such that any validation would reject it, and also that the out-of-bounds behavior it causes will still remain within the bounds of the pixel buffer.
Result: thumbnailed successfully on Box. Thumbnailed successfully on Ubuntu 16.04's v1.3.23. Fails on v1.3.25. Fails on Ubuntu's ImageMagick v6.8.9.

This is fairly compelling evidence already. For our next test, we could proceed to try and cause a crash via any old heap overflow... but this is not going to be reliable. A crash vs. non crash is going to depend on heap state, which in turn will depend on versions of GraphicsMagick, versions of libc, versions of everything.

A deterministic heap overflow crash

To proceed, we're going to assume that Linux x64 and glibc malloc() might be in use on the backend, and use a quirk of this combination: allocations larger than a certain size (often 128kB or so) are allocated using mmap(). This will provide 4096 byte alignment and size. We can then calculate the size of any mapping created by our input file, and play tricks like writing a single byte past the end. This will either hit the adjacent mapping (may or may not crash depending on writability) or no mapping (crash). Our first attempt is: gm_oflow_mmap_chunk.rle, which tries to write off the end of a large allocation. It's only 24 bytes so let's look at it in its entirety:

This parses as follows:

52 CC:       header
00 00 00 00: top, left at 0x0.
fc 00 04 01: image dimensions 252 x 260
02:          flags 0x02
04:          4 planes (e.g. RGBA)
08:          8 bits per sample
00 00 00 00: no color maps, 0 colormap length, padding, padding
02 fe:       set the plane value to 0xfe
06 ff 41 00: write 0xff pixels of value 0x41
07:          end of image

Result: no crash on my local GraphicsMagick v1.3.23.

We were expecting a crash, so let's look at what happened. valgrind certainly sees the problem, it reports "Invalid write of size 1... 2 bytes after a block of size 262,080". In the debugger, we can confirm the out of bounds write, which happens to cross over into the adjacent mmap chunk which of course happens to be writable (otherwise we would have received a crash). Here are the mappings in question:

7ffff7dd7000-7ffff7dfd000 r-xp 00000000 fc:01 3674897                    /lib/x86_64-linux-gnu/
7ffff7edf000-7ffff7fcf000 rw-p 00000000 00:00 0 
7ffff7ff6000-7ffff7ff8000 rw-p 00000000 00:00 0 
7ffff7ff8000-7ffff7ffa000 r--p 00000000 00:00 0                          [vvar]

The mapping highlighted in red is actually a concatenation of the mapping we smashed off the end of, and some other mapping. What is that other mapping?

(gdb) x/8xa 0x7ffff7fbf000
0x7ffff7fbf000: 0x4100002b4187d7 0x417ffff141cb21
0x7ffff7fbf010: 0x417ffff14162a8 0x417ffff7415000
0x7ffff7fbf020: 0x4100004a41a988 0x417ffff141bf3b
0x7ffff7fbf030: 0x417ffff141de78 0x417ffff7415000

Well, it's chock full of pointers and you can see we've partially corrupted them by spraying some 0x41 values around. It appears that no used code path dereferences these particular pointers, otherwise we'd have seen a crash. Seems like a great avenue for exploitation, as we can expect this mapping layout to be reasonably repeatable. However, exploitation is not our current goal. Our current goal is to deterministically crash when we go out of bounds. Linux typically uses a top down and first fit algorithm for placing mmap chunks, so if we simply make the allocation bigger, it won't fit in its current place and will go elsewhere.

Our solution is to tweak our image size to be 255 x 16448, still with 4 planes. This causes an allocation of 16776960 bytes. Taking into account the glibc 16 byte header, and rounding the mapping size will be 0x1000000, with the result looking a bit like this:

7ffff0a60000-7ffff1a60000 rw-p 00000000 00:00 0 
7ffff1a60000-7ffff1d38000 r--p 00000000 fc:01 655698                     /usr/lib/locale/locale-archive

As can be seen, our allocation which we smash off the end of now backs up against a read only mapping so any such attempt will crash cleanly. Here's our resulting off-by-one file:

Result: Fails to thumbnail on Box and crashes with SEGV in v1.3.23 locally.

To go into detail on the calculations resulting in an off-by-one condition, let's look at the RLE protocol bytes:

02 f4:       set plane to 0xf4
03 fe:       set X to 254 (default Y is 0, which means bottom)
06 00 41 00: write 1 byte (0x00 + 1) of value 0x41

The calculation of the write offset is (16447 * 255 * 4) + (254 * 4) + 0xf4, which is 0xfffff0. Add in the 16-byte glibc header and the write offset, relative to the start of the mapping, is exactly 0x1000000, which is exactly off-by-one. Here's the crash locally:

=> 0x00007ffff7a7e0af : mov    %r13b,(%rcx)
r13            0x41 65
rcx            0x7ffff1a60000 140737247576064

There are other reasons this file might not thumbnail on Box. A failed thumbnail could be because we hit a backend server that was taking a shower, so we run a few tests across a few days to confirm it consistently fails to thumbnail. Another reason could be the large allocation, if the server has some limits on allocation sizes. So as a final very interesting test, we can change the 0xf4 value in the test file to 0xf3, leading to:

File: gm_oflow_mmap_chunk_oob0.rle
Notes: An off-by-zero file, i.e. writes perfectly at the very end of the mmap() allocation. This write is out-of-bounds regarding the actual size passed to malloc(), but in-bounds regarding the mmap() mapping.
Result: Thumbnails ok on Box and also locally with v1.3.23.

We now have some pretty compelling evidence. By changing a single byte in an input file, which has no effect other than to change an out-of-bounds write offset, we go from consistent success to consistent failure using our thumbnail oracle.

A more accurate fingerprint

In communications with Box, they were quick to deny using GraphicsMagick but very cagey regarding the actual software used. I don't think this was a particularly clever move: image thumbnailing software is fairly easy to fingerprint because each image decoder has its own set of observable decode capability, quirks, bugs and also vulnerabilities. And the inability to have an open conversation likely lowered the amount of benefit obtained.

That said, my hasty and incorrect earlier fingerprint attempt was a minor personal embarrassment, so I set out to more accurately fingerprint the software in use. Given the set of image formats supported, and the denial of usage of GraphicsMagick, the only other reasonable suggestion is ImageMagick. In response to my report, Box disabled all of the crazy / fringe decoders. Good move. One remaining decoder supported is PSD (Adobe Photoshop). In an attempt to fingerprint positively for ImageMagick, here's a PSD file with the following properties:

  • A v2 PSD file: supported by ImageMagick but not GraphicsMagick, gimp, etc.
  • Contains a ZIP (deflate, really) compressed channel. Supported by ImageMagick but not GraphicsMagick.
  • Declares greyscale (1 channel) images but tries to render a magenta pixel by referring to the presumably "out of bounds" green channel. ImageMagick gives an RGB image with magenta pixel, not greyscale!
  • Has deliberately incorrect length / size fields in places where ImageMagick doesn't seem to care.
Result: does not render in GraphicsMagick. Renders a single magenta pixel at offset 2x2 in ImageMagick.

Box: walks like an ImageMagick, quacks like an ImageMagick...


By careful construction of an input file, we've produced strong black box evidence of a memory corruption vulnerability in Box's image thumbnailing process. But we've also determined that Box are not using GraphicsMagick, but likely using ImageMagick instead. Are these two facts compatible? Yes. GraphicsMagick forked from ImageMagick back in 2002. At that time, the codebase was extremely buggy and therefore since 2002, both GraphicsMagick and ImageMagick have suffered from a stream of the same vulnerabilities, with little co-ordination, and patches arriving at very different times.

ImageMagick had the same vulnerability we've been discussing, but it was fixed a couple of years ago. This suggests that Box were running a 2+ years old version of ImageMagick, loaded with vulnerabilities, without any particular attack surface reduction efforts, and without binary ASLR enabled (to be discussed in a future post).

We can also conclude that the split between GraphicsMagick and ImageMagick has led to a bit of a mess. If you look at all the vulnerabilities fixed in GraphicsMagick, not all of them are fixed in ImageMagick and visa versa. This is a potentially great source of bugs; I'm not even sure if you'd call them 0days or 1days.
Mar 28 2017

Ransomware Families Use NSIS Installers to Avoid Detection, Analysis

Malware families are constantly seeking new ways to hide their code, thwart replication, and avoid detection. A recent trend for the delivery of ransomware is the use of the Nullsoft Scriptable Install System (NSIS) with an encrypted payload. The list of the most common families using this technique is diverse and includes Cerber, Locky, Teerac, Crysis, CryptoWall, and CTB-Locker.

Rarely do we see multiple families consistently using the same methods for packing. In this case the payload is dependent on the installer for execution, and the decrypted malware payload never touches the disk. Using the NSIS packaging method makes the malware harder to collect and find using bulk collection techniques. Incoming samples may contain only the DLL responsible for unpacking without the encrypted payload or the NSIS installer. In this post, we will look at how the delivery mechanism works, why it is used, and the challenges it poses to researchers attempting to investigate the malware.

Why is this delivery method popular?

This attack vector begins its life as the payload of a spam downloader. An unsuspecting user opens an email attachment containing a malicious JavaScript or Word document. The malicious installer (detected as NSIS/ObfusRansom.*) downloads, launches, and drops a DLL file and an encrypted data file in %TEMP%. The installer then loads the DLL responsible for decrypting and executing the encrypted payload. The unpacker DLL steals five APIs from the NSIS installer’s import address table. The DLL then reads the encrypted file into memory, goes to a random hardcoded offset within the file, and decrypts the additional APIs it needs to decompress, write to memory, and execute the encrypted malware payload. This level of dependency makes static analysis, emulation, and replication more difficult and creates a delivery system in which the decrypted ransomware never touches the disk. During our analysis of these packers we did not find an example of the decrypted malware executable submitted to any of the well-known sample processing sites, such as VirusTotal.

Packer execution


The preceding flowchart summarizes the basic execution flow of this packer, which has been found with various levels of obfuscation, though all functionally equivalent. We have chosen a less obfuscated sample (MD5: F9AE740F62811D2FB638952A71EF6F72) to make this technical explanation simpler.

Most versions also attempt some code flow obfuscation to delay static analysis. The two common methods of code flow obfuscation we have seen are a QueueUserAPC combined with an alertable Sleep call:



Or structured exception handling combined with a divide by 0.



Neither of these methods are unique to this delivery method nor particularly hard to see when performing a static analysis. Once inside the main function, the malware first works on deobfuscating  three scrambled strings. These in some cases can be seen just by running “strings” on the sample. as shown in the following screenshot:



These strings in most cases contain “Kernel32,” a Microsoft API call, and the name of the encrypted file dropped by the installer. The following is a sample of Kernel32 being decrypted. All three of these strings are deobfuscated in a similar way.

Deobfuscating algorithm:


Memory with obfuscated string:


Memory with deobfuscated string:


After the strings have been deobfuscated, the malware next crafts a pointer to the installer’s memory space and saves the offsets for FirstThunk and OriginalFirstThunk. (A “thunk” is an automatically generated piece of code to assist in calling another subroutine.) Essentially, the OriginalFirstThunk is the import name table and FirstThunk is the import address table.



The unpacker DLL then walks through the OriginalFirstThunk looking for the names of the five APIs it needs to steal and saves their addresses directly from the corresponding FirstThunk entries. This loop uses some basic logic based on string size and the position of letters to accurately grab the APIs it needs.












These five stolen APIs are used to read the encrypted file into memory, where it will later decrypt the second layer of APIs it needs.

The packer next prepares the payload. When the parent NSIS installer runs, one of the files it drops is an encrypted and compressed file. This file is the main payload of the malware, which the packer is preparing to launch. As we mentioned, our research shows this payload can be one of a wide variety of malware, including several ransomware variants.

The malware first opens a file handle to the payload using the CreateFile API. The payload name and extension is one of the strings already deobfuscated.



Value of ECX (filename):


The malware obtains the size of the file needed for decrypting and reading the file. With the file size, a new chunk of memory is allocated for the file to be read into memory:


With the encrypted file now stored in memory, the malware begins processing this file by decrypting the APIs’ names. Each sample we studied had the location of both the API names and decryption key hardcoded in the sample. We also found that both of these could typically be found in the first 0x1FFF bytes of the file. The decrypting of the API strings is done in a loop using simple arithmetic.


Encrypted APIs:


Depending on the sample, this code can be extremely obfuscated. We have decompiled this sample and simplified the decryption algorithm used in this loop to only the relevant lines shown below:

api = *(api_base + counter);
key = ~*(counter + randomoffset);
*(api_base + counter) = api & key | ~key & ~api;
}while ( counter < 330 );

We can see the “encryption” here is extremely basic. Some samples we found had slightly different decryption algorithms; however, they were always very basic arithmetic operations against a stored key. This function does a bitwise AND once with the key and the encrypted value and then again with those values NOTed. The results are then bitwise ORed. In our group of samples, the strings being decrypted were always the same; thus the number of iterations remained constant, at 0x14A (330).

Decrypted APIs:


The next major task is the decryption of the payload itself. The following shows the memory location of the encrypted payload:


The entire file does not go through the decryption process, only the executable itself. The malware uses the size gathered above from GetFileSize and a hardcoded value to determine the amount of bytes to decrypt.


The decryption algorithm for the payload in our samples was identical to that of the APIs decryption algorithm.


There is a small noticeable difference from the API decryption process based on when the loop is finished. As shown above, ebx holds the amount of bytes to decrypt, which is now acting as a counter.

Decrypted payload:


With the file and APIs now decrypted, the malware decompresses its payload. The malware authors used standard Windows APIs to perform their compression and use them again after decryption through a call to RtlDecompressBuffer. In this API the “2” pushed onto the stack represents the type of compression used. According to Microsoft documentation, 2 stands for LZ decompression.


With the payload fully decrypted and decompressed in memory, we can now dump the fully functional standalone payload using Windbg’s “.writemem” function. This allows us to study the payloads and determine whether they are known ransomware variants; however, theses specific payloads had not yet been seen by common malware research sites such as VirusTotal.

Now setup is needed to execute this payload in memory. The decrypted payload never touches the disk, helping reduce the probability of detection. The first step is to call CreateProcess in a suspended state. The malware executes in this process:


Following the CreateProcess API, the malware uses a standard process-hollowing technique. VirtualAlloc, GetThreadContext, ReadProcessMemory, and NtUnmapViewofSection are used in preparation to write its payload into the new thread. WriteProcessMemory copies the unencrypted payload into the new thread. Next a Sleep and ResumeThread call start the thread. Once the thread has started, the malware immediately terminates the parent thread.



The core functionality of these samples is very simple and does not exhibit any new behavior; yet the delivery method presents a new and interesting challenge. The delivery of ransomware in NSIS installers with an encrypted payload has proven to be a unique and effective method for delivering a wide range of malware. Currently all the samples explored have contained only variants of ransomware; however, we can easily imagine other families of malware using this technique. We have seen a wide range of anti-emulation methods, strong code-obfuscation techniques, and variance in hardcoded values. The encryption used for the fields and APIs is generally very weak and not designed to be the main challenge for reversing or detection. It is likely that either one threat actor is distributing multiple forms of ransomware, or multiple threat actors are using the same group to distribute their ransomware.

The post Ransomware Families Use NSIS Installers to Avoid Detection, Analysis appeared first on McAfee Blogs.

Mar 28 2017

Necurs: Mass mailing botnet returns with new wave of spam campaigns

Unexplained three-month absence resulted in a seven-fold decrease in rate of emails containing malware.