How a Vulnerable Picture Upload Can Be Exploited Using Manipulated Picture Files

23. February 2017

Abstract
There are numerous websites allowing users to upload picture files. When an application lets users choose among different file extensions, an attacker can sometimes find weaknesses in the image upload. This article describes an attack that circumvents weak file name restrictions and injects PHP code which survives a resizing and metadata stripping process.

Introduction
Picture uploads are used on many sites found on the internet. There are multiple reasons why a website would allow a user to upload a picture to it. For example, there are some sites allowing the user to set some kind of avatar, while others allow the user to present photos.
With every feature added to a website, there is a need to harden that functionality. This holds true for simple things such as allowing users to post content, but it is just as important when users may upload files. One may think a picture upload is less dangerous than an arbitrary file upload, but it comes with all the dangers of a general file upload, such as Path Traversal or an unrestricted file upload that lets users escalate their privileges on the system.
The technique presented below allows an attacker to take over a machine running a webserver with PHP enabled and a vulnerable image file upload.

The example used is a (self-written) web application which is prone to such an attack. Although this example has a somewhat academic character, this vulnerability and similar ones have been seen in the wild.

Scenario
In the following we assume that a website contains a vulnerable image upload with the following design flaws:
• An attacker may choose the file extension, subject only to a weak restriction ( [a-z]* )
• The webserver runs PHP on all files with a specific extension ( .php )
Apart from these flaws the image board is hardened. In particular:
• An attacker may not upload arbitrary files. Only jpg and png are allowed, and each file is checked with identify for being a valid image file
• An attacker may not choose an arbitrary directory
• An attacker may not choose arbitrary file names
• All non-image content of the image (i.e. metadata) is stripped
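To illustrate the first flaw listed above, the following minimal sketch compares the weak restriction [a-z]* with an explicit whitelist. The function names and the set of allowed extensions are assumptions made for illustration; they are not taken from the vulnerable application.

```python
import re

# Assumption: the application validates the user-chosen extension like this.
WEAK_EXTENSION = re.compile(r"[a-z]*")

# Hypothetical stricter alternative: an explicit whitelist of image extensions.
ALLOWED_EXTENSIONS = {"jpg", "jpeg", "png"}


def weak_check(ext: str) -> bool:
    """Accept any extension consisting only of lowercase letters."""
    return WEAK_EXTENSION.fullmatch(ext) is not None


def whitelist_check(ext: str) -> bool:
    """Accept only extensions that are explicitly allowed."""
    return ext.lower() in ALLOWED_EXTENSIONS


if __name__ == "__main__":
    for ext in ("jpg", "png", "php", "../x"):
        print(ext, weak_check(ext), whitelist_check(ext))
```

The weak pattern blocks path tricks such as "../x" but accepts "php", which is exactly the gap the attack relies on.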


Attacker Model

In the following we will assume that we have an external attacker without prior knowledge of the system. We are going to describe all steps necessary to discover the vulnerabilities and how to circumvent several aspects of the “security” features.

Exploit Idea
The idea of the exploit is to execute PHP code on the target system. This can be done by uploading a file with a .php extension. Since the web server allows us to choose the file extension, we may simply choose php. Because the program “identify” of ImageMagick is used to check whether the uploaded file is a valid image, it is not possible to supply plain PHP code. In addition, the uploaded image file is resized and stripped. Therefore, the PHP code must be embedded in a valid picture and has to survive the resizing and stripping of the image.
So in short form:
1. Analyze how the image resize operation works
2. Discover places in the image file which can be used to inject code
3. Upload modified picture with injected code
4. Profit

Exploit
The general idea is to inject code at wisely chosen places. It is widely known that PHP code can be supplied using EXIF metadata, but one of the security features of the web application mitigates this problem by stripping all non-image data from the pictures.

We are going to inject the PHP code into jpg files. This file format was chosen because it is widely accepted and used on the internet, so any website maintainer who wants their users to upload pictures will probably allow jpg. JPG uses the JPEG File Interchange Format (JFIF). JFIF divides the file into several segments, and every segment contains specific image information. For example, there could be segments with EXIF metadata and so on. The following graphic shows the same JPG file twice: once saved with full metadata and once converted with ImageMagick using the -strip argument, which removes the majority of the unneeded data.


In the graphic the x and y position encode the byte position in the file. So from left to right and top to bottom you look at every byte of the image file. Every byte is colored according to the JFIF segment it is in. As you may see, the image on the left side is the unstripped one. It contains many more types of segments than the stripped one. The black color can be ignored because it is just padding for the heatmap.
The interested reader may refer to a mapping between the numbers in the colormap and the JFIF segments in the appendix.
The large red part seen in the picture is the Start of Scan (SoS) segment. In this segment the compressed image data is stored.
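The segment layout described above can be recovered with a short parser. The following is a minimal sketch and not the parser used to produce the heat map in this article; the function names and the small marker table are illustrative assumptions. It walks the marker segments of a JPEG byte string and treats everything after the Start of Scan header up to the next real marker as part of the SoS segment.

```python
import struct

# A few well-known JPEG markers (second byte after 0xFF).
MARKER_NAMES = {
    0xD8: "SOI", 0xE0: "APP0", 0xE1: "APP1", 0xDB: "DQT", 0xC0: "SOF0",
    0xC4: "DHT", 0xDA: "SOS", 0xFE: "COM", 0xD9: "EOI",
}
# Markers without a length field: SOI, EOI, TEM and RST0..RST7.
STANDALONE = {0xD8, 0xD9, 0x01} | set(range(0xD0, 0xD8))


def _skip_scan_data(data: bytes, pos: int) -> int:
    """Return the offset of the first real marker after entropy-coded data."""
    while pos + 1 < len(data):
        if data[pos] == 0xFF:
            nxt = data[pos + 1]
            # 0xFF00 is a stuffed byte, 0xFFD0..0xFFD7 are restart markers.
            if nxt != 0x00 and not 0xD0 <= nxt <= 0xD7:
                return pos
        pos += 1
    return len(data)


def parse_segments(data: bytes):
    """Return a list of (name, offset, length) tuples covering the file."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    segments, pos = [], 0
    while pos + 1 < len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected marker at offset {pos}")
        start = pos
        # Skip optional 0xFF fill bytes in front of the marker code.
        while pos + 1 < len(data) and data[pos + 1] == 0xFF:
            pos += 1
        if pos + 1 >= len(data):
            raise ValueError("truncated marker")
        marker = data[pos + 1]
        pos += 2
        if marker in STANDALONE:
            end = pos
        else:
            if pos + 2 > len(data):
                raise ValueError("truncated segment header")
            (length,) = struct.unpack(">H", data[pos:pos + 2])
            end = pos + length  # the length field counts its own two bytes
            if marker == 0xDA:
                end = _skip_scan_data(data, end)
        if end > len(data):
            raise ValueError("segment runs past end of file")
        name = MARKER_NAMES.get(marker, f"0x{marker:02X}")
        segments.append((name, start, end - start))
        pos = end
        if marker == 0xD9:
            break
    return segments
```

Coloring every byte offset by the segment that contains it yields the kind of map shown in the graphic; comparing the segment lists of an original and a stripped file shows which segment types were removed.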
From here on, there are multiple ways to proceed with the exploit. One way would be to analyze the code used to strip the segments of the image and try to find a part of the file which is not altered when the image is resized and stripped.
For ImageMagick, for example, it is the following code:

As you can see, there are multiple sections deleted from the image. But it seems to be a blacklisting of “bad” segments instead of a whitelisting of good segments. Therefore, an attacker may find a segment which survives this stripping and craft a special jpg to deliver the php code.

This article describes a different approach to bypassing such image resizing and stripping techniques using a pure blackbox approach when it comes to the image format. A python script is used to bruteforce all positions in the image which do not change when the image gets stripped and converted.

We start with a simple image with all metadata and of any size and upload it to the imageboard. The image board will strip all metadata from the image and resize it to the resolution it accepts. Therefore, the image will not change in size when it is uploaded again. But the coefficients of the JPG will change and vary from upload to upload. Therefore, it is not possible to just inject some code into randomly chosen coefficients – or is it?

For our exploit we will use an ordinary picture taken on a cloudy summer day. As we can see in the following image, it is uploaded successfully to the webserver to a directory called /images/. The filename has been replaced with something looking like an MD5 hash.

We will now start to find all valid injection points inside this file. First we check if we can upload files with the php extension:

We can see multiple things from this answer. First, we can see from the URL that we indeed successfully uploaded a file which the webserver saved with a name ending in .php. Apart from that, we can see that the browser has not displayed the image. This indicates that the webserver did not send an image Content-Type header with the response to the GET request, most likely because the PHP processor ran through this file and set the Content-Type to “text/html”.

We will now try to create an image with php code which survives the resize process.
In the following listing you can see a snippet from the python code used to create the exploit pictures. The for loop basically goes through every byte in the file, injects the php code there and later on resizes and checks if the picture is valid and still contains the code.

Using this tool you can see that it quickly finds some injection points:

Looking at the distribution of the injection points across the different JFIF segments, we can see that the injections are inside of the SoS segment, i.e. in the compressed image data. Please note that due to the chosen exploit category of 50, the color scale has shifted, as can be seen in the colormap on the left side. So the 22, which is the SoS category, is now green instead of red. In simple words: the place where the code can be successfully injected is marked white in the following picture.

By uploading the modified picture we can see the following result:

At first glance, the upload does not look very promising apart from the artifacts on the right image and the bottom right side.
Opening the picture directly gives a result similar to the earlier .php upload. On closer inspection, one can see that the title of the page is set to phpinfo().

Scrolling down the page a bit, the successful execution of the PHP code can be seen.

An attacker can easily modify the payload to compromise the whole system with this exploit. For security reasons, the complete exploit is not shown here.

Mitigation
File uploads are prone to security vulnerabilities in general, so special care should be taken when implementing them. The file name, including its extension, should be restricted by a whitelist. It is also important to check the access rights of the created file, the location where the file is saved, and the whole process the user-supplied data goes through. MIME type checking of the supplied data is a good idea as well, but as this example shows, it is possible to create a valid image file containing PHP code. A hardened system therefore combines many such security features, and only the combination of all of them makes the system secure. When one feature is missing, for instance because the enforcement of a specific file name was dropped for compatibility reasons, the whole security model can fail and allow an attacker to compromise the system.
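One way to enforce a whitelisted file name including the extension is to let the server derive both from the uploaded content and ignore every name part supplied by the user. The following is a minimal sketch under that assumption; the function name choose_stored_name and the magic-byte table are hypothetical and not part of the application discussed here.

```python
import secrets

# Magic-byte prefixes for the two formats the image board accepts.
MAGIC_TO_EXTENSION = {
    b"\xff\xd8\xff": "jpg",
    b"\x89PNG\r\n\x1a\n": "png",
}


def choose_stored_name(content: bytes) -> str:
    """Return a server-generated file name for an uploaded image.

    The extension is derived from the file's leading bytes and the base
    name is random, so no user-supplied name part reaches the file system.
    """
    for magic, ext in MAGIC_TO_EXTENSION.items():
        if content.startswith(magic):
            return f"{secrets.token_hex(16)}.{ext}"
    raise ValueError("unsupported file type")
```

This check does not detect code embedded in the image data, and it does not replace a full validity check such as identify. It only ensures that the stored file never carries an extension the webserver hands to the PHP processor.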

Conclusion
This example shows how a malicious jpg file can be crafted using simple techniques and without deep knowledge of the file format, leading to a remote code execution. A bruteforce of all positions of a stripped picture with a resolution of 512×384 yields 363 possible injection points for the phpinfo() payload.
The exploit needs a misconfiguration of the target system allowing an attacker to choose a file extension which gets evaluated by the webserver.

Appendix
JFIF Parser color map
The number is the number used in the heat map.
("2", "APP0", "JFIF Tag")
("3", "SOF0", "Baseline DCT")
("3", "SOF2", "Extended sequential DCT")
("4", "SOF3", "Progressive DCT")
("5", "SOF4", "Lossless (sequential)")
("6", "SOF5", "Differential sequential DCT")
("7", "SOF6", "Differential progressive DCT")
("8", "SOF7", "Differential lossless (sequential)")
("9", "JPG", "Reserved for JPEG extensions")
("10", "SOF9", "Extended sequential DCT")
("11", "SOF10", "Progressive DCT")
("12", "SOF11", "Lossless (sequential)")
("13", "SOF13", "Differential sequential DCT")
("14", "SOF14", "Differential progressive DCT")
("15", "SOF15", "Differential lossless (sequential)")
("16", "DHT", "Definition of Huffman tables")
("17", "DAC", "Definition of arithmetic coding")
("18", "DQT", "Definition of quantisation table")
("19", "APP1", "EXIF data")
("20", "APP14", "Copyright?")
("21", "COM", "Comments")
("22", "SoS", "Start of Scan")
("23", "EoI", "End of Image")
("50", "RCE", "RCE Injection Point")

1 https://www.owasp.org/index.php/Path_Traversal
2 https://www.owasp.org/index.php/Unrestricted_File_Upload

About usd Security Publications

In order to protect businesses against hackers and criminals, we always have to keep our skills and knowledge up to date. Thus, security research is just as important for our work as is building up a security community to promote the exchange of knowledge. After all, more security can only be achieved if many individuals take on the task.

Our usd Akademie and HeroLab are essential parts of our security mission. We share the knowledge we gain in our practical work and our research through training courses and publications.

In this context, the usd HeroLab publishes a series of papers on new vulnerabilities and current security issues.

Always for the sake of our mission: “more security.”
