HITB2012AMS Day 2 – Attacking XML Processing
Attacking XML Processing
Dressed in a classy Corelan Team T-Shirt, Nicolas Grégoire kicks off his presentation by introducing himself. A customer asked him to audit some XML-DSig applications 18 months ago, and he found a number of bugs. This prompted him to do more research on the topic.
This technology is present in a LOT of applications today.
XML 101
Nicolas continues by explaining some basics about the eXtensible Markup Language: how XML documents are structured and what the purpose of using multiple namespaces is. Namespaces are there to avoid ambiguities, but you can also use namespaces to trigger some specific features. You can, for example, call PHP code from XSLT.
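To make the namespace point concrete, here is a minimal sketch (not taken from the talk) using Python's standard ElementTree. It parses an Atom-style feed in which two elements share the local name "title" but live in different namespaces. The second namespace URI is a made-up placeholder.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
# Hypothetical second vocabulary, used only for this illustration.
EXT_NS = "http://example.org/ext"

FEED = f"""<feed xmlns="{ATOM_NS}" xmlns:ext="{EXT_NS}">
  <title>Atom title</title>
  <ext:title>Extension title</ext:title>
</feed>"""

root = ET.fromstring(FEED)

# ElementTree expands prefixes to "{namespace-uri}localname", so the two
# <title> elements never collide.
atom_title = root.find(f"{{{ATOM_NS}}}title").text
ext_title = root.find(f"{{{EXT_NS}}}title").text

# Collect every namespace URI that appears on an element in the document.
namespaces_seen = sorted({el.tag.split("}")[0].lstrip("{") for el in root.iter()})
```

A parser (or an auditor) that only looks at local names would treat both elements as the same thing. Listing the namespaces in a document is also a quick way to spot vocabularies that may trigger extra processing.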
In a valid XML document, you can find more than just data. You can find XSLT code, define some grammar, and include processing instructions. Parsers should be aware of this to avoid issues.
XML is used in a lot of technologies (SVG, SOAP, XML-RPC, XSLT, XKMS, SAML, WSDL, REST, and so on). The Microsoft Lync online service uses XML to enable dial-in conferencing. W3C uses it in its online XSLT 2.0 service, which allows you to upload files, potentially leading to Java code execution on the server side (not tested). The following simple Google dork might get you more information about other services: inurl:"xslurl=http"
When auditing an app that parses xml, you should ask yourself the following questions, Nicolas continues:
- What are the vectors used for XML data?
- Is the data being processed? If so, by whom and where (client, server, gateway)?
- What is the attack surface?
- What are the processing points? If you can submit data, does it get executed or not? If you can submit grammar, will it resolve external entities? If you can provide XSLT code, check which extensions are available (to access databases or run Java code).
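For the last question, one possible approach (my own sketch, not something shown in the talk) is to submit a small probing stylesheet. XSLT 1.0 defines system-property() to identify the processor and function-available() to test whether an extension function can be called. The two extension functions probed below (php:function and exsl:node-set) are examples I picked. The Python code only builds the stylesheet and checks that it is well-formed, it does not run a transformation.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# Extension functions to probe for. These two are examples chosen for this
# sketch, not a list taken from the talk.
PROBES = {
    "php": ("http://php.net/xsl", "php:function"),
    "exsl": ("http://exslt.org/common", "exsl:node-set"),
}

def build_probe_stylesheet(probes=PROBES):
    """Return an XSLT 1.0 stylesheet that reports engine and extension info."""
    ns_decls = " ".join(f'xmlns:{p}="{uri}"' for p, (uri, _) in probes.items())
    lines = [
        f'<xsl:stylesheet version="1.0" xmlns:xsl="{XSL_NS}" {ns_decls}>',
        '  <xsl:output method="text"/>',
        '  <xsl:template match="/">',
    ]
    # system-property() is part of XSLT 1.0 and identifies the processor.
    for prop in ("xsl:version", "xsl:vendor", "xsl:vendor-url"):
        lines.append(f'    <xsl:value-of select="system-property(\'{prop}\')"/>')
        lines.append('    <xsl:text>&#10;</xsl:text>')
    # function-available() reports whether an extension function can be called.
    for _, (_, func) in probes.items():
        lines.append(f'    <xsl:value-of select="function-available(\'{func}\')"/>')
        lines.append('    <xsl:text>&#10;</xsl:text>')
    lines += ['  </xsl:template>', '</xsl:stylesheet>']
    return "\n".join(lines)

stylesheet = build_probe_stylesheet()

# Sanity check only: confirm the generated stylesheet is well-formed XML.
parsed = ET.fromstring(stylesheet)
selects = [el.get("select") for el in parsed.iter(f"{{{XSL_NS}}}value-of")]
```

If the target returns the transformation output, the vendor string tells you which engine you are talking to, and the true/false lines tell you which extensions are worth a closer look.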
An interesting example: Wikipedia allows you to upload an SVG file and transforms it into a PNG. In other words, it parses the SVG and converts it into an image.
Nicolas explains that all demos in the presentation are based on Atom feeds. He wrote a couple of feed readers: one in Perl, another in PHP, and a third in JSP / Java.
Encapsulation
XDP is a container for PDF/XFA documents. Nicolas used a three-year-old vulnerability (CoolType) and attempted to avoid AV detection by using encapsulation. By modifying the Metasploit module for this particular exploit, he managed to evade all 43 AV engines.
Temporary DoS
By creating a temporary DoS condition, you may be able to detect that something is being processed in a black-box audit scenario (similar to the benchmark trick for SQL injection). He demonstrated a couple of techniques that lead to the allocation of multiple gigabytes of RAM.
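This write-up does not record which techniques were demonstrated. One widely documented way to get that kind of memory growth is nested entity expansion, where each entity references the previous one several times. The sketch below (my assumption, with made-up entity names e0, e1, ...) only builds a deliberately small document and computes the expanded size. It never parses the document.

```python
def expanded_size(base_len, fanout, levels):
    """Bytes produced when the top-level entity is fully expanded."""
    return base_len * fanout ** levels

def build_nested_entities(base="lol", fanout=10, levels=3):
    """Build a DOCTYPE whose entities each reference the previous one `fanout` times."""
    decls = [f'<!ENTITY e0 "{base}">']
    for i in range(1, levels + 1):
        refs = f"&e{i - 1};" * fanout
        decls.append(f'<!ENTITY e{i} "{refs}">')
    body = "\n  ".join(decls)
    return f"<!DOCTYPE feed [\n  {body}\n]>\n<feed>&e{levels};</feed>"

doc = build_nested_entities(levels=3)       # deliberately small document
small = expanded_size(len("lol"), 10, 3)    # 3 * 10**3 bytes
large = expanded_size(len("lol"), 10, 9)    # 3 * 10**9 bytes
```

The document grows linearly with the number of levels, while the expanded text grows exponentially. That asymmetry is what makes a measurable delay or memory spike a usable signal that a parser is expanding entities.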
XXE Exploitation
XML External Entities (XXE) is probably the most common XML vulnerability, Nicolas explains. You can basically specify a filename (/etc/passwd) in an entity, and every time the entity is referenced, the reference gets replaced by the contents of the file. RESTlet, Yandex, OpenOffice, SharePoint, DotNetNuke, and IceWarp are just a few of the applications that speak REST and might be vulnerable to XXE attacks.
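To show what such a declaration looks like, here is a small sketch (not from the talk) that uses Python's standard expat bindings to list external entity declarations in a document without resolving them. The function name and the sample document are my own, and the parser is left in its default configuration, with no external entity handler installed.

```python
import xml.parsers.expat

SAMPLE = """<!DOCTYPE feed [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
  <!ENTITY safe "internal text">
]>
<feed><title>&xxe;</title></feed>"""

def find_external_entities(xml_text):
    """Return (name, system_id) for every external entity declared in the DTD."""
    found = []

    def on_entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
        # Internal entities carry a value; external ones carry a system identifier.
        if system_id is not None:
            found.append((name, system_id))

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    # No ExternalEntityRefHandler is set, so nothing is fetched from disk or network.
    parser.Parse(xml_text, True)
    return found

external = find_external_entities(SAMPLE)
```

A check like this can be used on the defensive side to reject documents before they reach a parser that would resolve the entity, or during an audit to confirm what a given payload actually declares.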
XXE attacks can be very powerful. You can easily hit the internal network, do banner grabbing (by using something like ssh://ip:22, for example) and perform blind attacks. In certain scenarios, you can use the file handler to steal NTLM hashes or “pass the hash”, or list directories to get more information. Depending on the available extensions, there’s a lot more you can do. Successful exploitation depends on:
- the XML parser
- the OS
- the programming language used
- application-specific features (for example, use the fact that PHP can return base64-encoded data to read a file that contains null bytes)
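The base64 point deserves a quick illustration. Null bytes are not allowed in XML, so a file that contains them cannot be pulled in as-is. If the target can be made to return the content base64-encoded (in PHP this is possible, for instance, with the php://filter stream wrapper and its convert.base64-encode filter, although this write-up does not say which mechanism was used in the talk), the attacker simply decodes it afterwards. A minimal sketch with made-up file contents:

```python
import base64

# Hypothetical file contents: binary data that includes NUL bytes.
raw = b"\x7fELF\x00\x00\x01binary\x00data"

# What the server side would hand back: plain ASCII, safe to embed in XML.
encoded = base64.b64encode(raw).decode("ascii")

# What the attacker does with the response: recover the exact original bytes.
decoded = base64.b64decode(encoded)
```

The encoded form only uses letters, digits, "+", "/" and "=", so it survives the trip through the XML parser untouched.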
XSLT Exploitation
The purpose of XSLT is to transform XML into something else. XSLT is Turing complete. The main use of XSLT is to display XML to humans, but it can also be used to extract data or convert between formats. You can find XSLT parsers in web apps, browsers, database servers (Oracle, for example), word processors, and XML-DSig. Nicolas played with a bunch of apps and discovered bugs in Xalan-J, Sablotron, libxslt, TransforMiiX, XT, Adobe, Oracle-C, 4Suite, and Altova.
First, he performed some basic mutation-based fuzzing. He took a bunch of XSLT engines, gathered a set of input files (downloaded from the internet), ran them through a diversifier (Radamsa), set up monitoring… and let it run. He explains that he typically takes 5,000 files and lets Radamsa create 10,000,000 files. To monitor, he used valgrind and AddressSanitizer. Using this approach, he found a couple of bugs (most of them are not patched yet) in Mozilla Firefox, WebKit, Opera, and Oracle (ORA-07445). To trigger the XSLT parser in Oracle, you can use the code in the screenshot below. At the bottom of the screenshot, you can see it can lead to control of FP:
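The workflow above (seed files, a mutator, lots of generated cases) can be sketched in a few lines. This is only a stand-in: it flips random bits, whereas Radamsa applies far richer mutations, and the monitoring side (valgrind, AddressSanitizer) is not shown. The seed inputs and function names are made up for the illustration.

```python
import random

def mutate(data: bytes, rng: random.Random, flips: int = 4) -> bytes:
    """Return a copy of `data` with a few randomly chosen bits flipped."""
    if not data:
        return data
    buf = bytearray(data)
    for _ in range(flips):
        pos = rng.randrange(len(buf))
        buf[pos] ^= 1 << rng.randrange(8)
    return bytes(buf)

def generate_cases(seeds, count, seed=0):
    """Yield `count` mutated test cases drawn from the seed corpus."""
    rng = random.Random(seed)   # fixed seed so a crashing case can be regenerated
    for _ in range(count):
        yield mutate(rng.choice(seeds), rng)

# Hypothetical seed corpus; a real run would load thousands of files from disk.
SEEDS = [
    b'<xsl:stylesheet version="1.0"/>',
    b"<feed><title>t</title></feed>",
]

cases = list(generate_cases(SEEDS, 20))
```

Each generated case would then be written to disk and fed to the engine under test while the monitoring tools watch for crashes or memory errors.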
The problem with XSLT is that it is a functional language. There is no loop functionality (while, for, …) and every variable is read-only. This complicates brute forcing and reading STDOUT. You can, however, use an XSL for-each location wrapper to read something from another file. This simulates a “for” loop. If you combine this with SQL extensions, you would be able to attack internal databases and dump tables.
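When variables are read-only, the usual way to iterate is recursion: instead of updating a counter, you call yourself again with new values. In XSLT this is typically a named template that invokes itself with xsl:call-template and xsl:with-param. The sketch below shows the same pattern in Python, reading a stream chunk by chunk until nothing is left. It illustrates the general idea only; the reader function and its data are made up.

```python
def read_all(read_chunk, offset=0, acc=""):
    """Emulate `while chunk: ...` using recursion and no reassignment."""
    chunk = read_chunk(offset)
    if not chunk:
        # Loop condition fails: return what has been accumulated so far.
        return acc
    # "Next iteration": call again with new bindings instead of mutating state.
    return read_all(read_chunk, offset + len(chunk), acc + chunk)

# Hypothetical data source standing in for the output being read.
OUTPUT = "line one\nline two\n"

def fake_reader(offset, size=8):
    """Return up to `size` characters starting at `offset` ('' when exhausted)."""
    return OUTPUT[offset:offset + size]

result = read_all(fake_reader)
```

Every "iteration" sees its own offset and accumulator, which is exactly the constraint a language with read-only variables imposes.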
The “while” loop, needed to read stdout, is a bit more complex to implement. The idea is to use an XSLT Loop Compiler (by @obqo). It exposes
By combining a source that contains the commands you want to run with the code that implements the loops (to execute the commands and return the output), you can run arbitrary commands on a remote system and retrieve the output.
On December 30, 2011, at BerlinSides, Nicolas claimed that he couldn’t find a way to execute Java in XSLT, but he was contacted by @mihi42, who shared some details on how to do it. Nicolas ends the presentation by demonstrating how to get Meterpreter shells by embedding PHP/Java code in XSLT files. The Metasploit module, which creates XSLT files for PHP and Java, should be released shortly.
PHP Meterpreter :
Java :
Conclusion
XML is everywhere. You should understand that it’s more than just data. DTD and XXE attacks have been known for more than 10 years, and the offensive side is progressing quickly (because XML is increasingly popular).
You can get more info about his work, the results of his research and some source code on http://xhe.xwiki.org (which, ironically, is based on a wiki that uses REST, among others, and IS vulnerable to certain types of attacks).
Brilliant work!
© 2012, Peter Van Eeckhoutte (corelanc0d3r). All rights reserved.