HITB2012AMS Day 2 – Taint Analysis

Automatically Searching for Vulnerabilities: How to use Taint Analysis to find Security Flaws

(by Alex Bazhanyuk (not present) and Nikita Tarakanov, Reverse Engineers, CISS)

Nikita explains they have been working on reversing binaries and auditing source code for a long time. Alex currently works on the BitBlaze project, and moved to the US to be able to focus on security research. The presentation is based on work done by Alex and Nikita a while ago, before Alex moved from Ukraine to the US.

Nikita, an independent researcher, enjoys reversing kernels.

The agenda for the talk contains the following topics:

  • Taint Analysis
  • BitBlaze theory
  • SASV implementation
  • Lulz Time
  • Pitfalls
  • Conclusion

Taint Analysis

Nikita explains that they mainly focused on IDA Pro plugins and BitBlaze (Vine + utils, TEMU + plugins), and that BitBlaze needed customization to work properly.

Most people look for vulnerabilities by fuzzing: generating mutated test cases and so on. Nikita explains that fuzzing might not be very easy when the protocol uses crypto, CRC checks or unknown formats, because mutated input tends to be rejected before it reaches the interesting code. A better way is to use taint analysis. As taint sources, you can use network input/output, keyboard input, memory, disk, function output, etc. The idea is to follow the tainted data and trace how the application behaves while processing it.
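To make the idea of "following tainted data" a bit more concrete, here is a minimal Python sketch. It is not how TEMU or BitBlaze do it (they work at the instruction level inside an emulator); the `Tainted` class, the `source`, `const`, `add` and `check_sink` helpers and the `memcpy` scenario are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tainted:
    """A value plus the set of taint-source labels that influenced it."""
    value: int
    labels: frozenset = frozenset()


def source(value, label):
    """Mark a value as coming from an untrusted source (e.g. 'network')."""
    return Tainted(value, frozenset({label}))


def const(value):
    """A value that no untrusted input has influenced."""
    return Tainted(value)


def add(a, b):
    """Propagation rule: the result carries the labels of both operands."""
    return Tainted(a.value + b.value, a.labels | b.labels)


def check_sink(name, arg):
    """Report when tainted data reaches a sensitive operation (a 'sink')."""
    if arg.labels:
        return f"{name}: argument influenced by {sorted(arg.labels)}"
    return None


# Hypothetical scenario: a length field taken from a packet ends up in a copy size.
packet_len = source(0x400, "network")
header_size = const(16)
copy_size = add(packet_len, header_size)

print(check_sink("memcpy", copy_size))    # tainted: the network controls the size
print(check_sink("memcpy", header_size))  # clean: nothing to report
```

A real engine needs a propagation rule like `add` for every instruction it can encounter, which is part of the reason this kind of tracing tends to be slow.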

There are a couple of ways to perform taint analysis:

Static taint analysis: analysis performed over multiple paths of a program (mostly performed within IDA Pro). It’s typically performed on a control flow graph, where statements are nodes and there is an edge between two nodes if control can transfer from one to the other.
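As a rough illustration of static taint analysis over a control flow graph, the Python sketch below runs a simple forward "may be tainted" analysis with a worklist. The graph, the statement table and the variable names are invented for this example; a real analysis would work on the disassembly exposed by IDA Pro rather than on hand-written dictionaries.

```python
from collections import deque


def may_taint(cfg, stmts, sources):
    """For every node, compute the variables that may be tainted on entry.

    cfg:     node -> list of successor nodes
    stmts:   node -> (assigned variable or None, list of variables read)
    sources: nodes whose assigned variable is always tainted
    """
    taint_in = {node: set() for node in cfg}
    worklist = deque(cfg)  # visit every node at least once
    while worklist:
        node = worklist.popleft()
        target, used = stmts[node]
        out = set(taint_in[node])
        if target is not None:
            if node in sources or out.intersection(used):
                out.add(target)      # assigned from tainted data
            else:
                out.discard(target)  # overwritten with clean data
        for succ in cfg[node]:
            if not out <= taint_in[succ]:
                taint_in[succ] |= out  # merge the facts of all incoming paths
                worklist.append(succ)
    return taint_in


# Hypothetical function: read input, parse a length, branch, then copy.
cfg = {
    "read": ["parse"],
    "parse": ["small", "large"],
    "small": ["copy"],
    "large": ["copy"],
    "copy": [],
}
stmts = {
    "read": ("buf", []),       # buf = recv(...)   (taint source)
    "parse": ("n", ["buf"]),   # n = length field taken from buf
    "small": ("n", []),        # n = constant      (cleans n on this path)
    "large": (None, []),       # no assignment     (n stays tainted)
    "copy": (None, ["n"]),     # memcpy(..., n)    (sink)
}

result = may_taint(cfg, stmts, sources={"read"})
print(sorted(result["copy"] & set(stmts["copy"][1])))
```

The variable `n` is clean on one path and tainted on the other, so the analysis reports that it may be tainted when it reaches the copy. That is the "multiple paths" aspect: the result covers every path through the graph, not just the one a particular input happens to take.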

Dynamic taint analysis: to perform dynamic taint analysis, the researchers used BitBlaze, which allows you to automatically extract security-related properties from binary code. It was built as a unified binary analysis platform for security, leverages recent advances in program analysis, formal methods and binary instrumentation, and can greatly decrease the amount of time needed to find exploitable conditions.


BitBlaze consists of a couple of components. It has an emulator, a taint analysis engine and a semantics extractor, made available to plugins via the TEMU API. TEMU is based on an older version of QEMU, which makes it slow and buggy. TEMU is only used to perform tracing.

VINE is an intermediate language (IL) that sits in between the tracing (TEMU) and the output (graphs, logs, etc.). Nikita dives into some details about the IL and the STP constraint solver.

SASV Components

To set up the SASV environment, they used

  • Temu
  • Vine
  • STP
  • IDA plugins (Dangerousfunctions, IndirectCalls, ida2sql (zynamics)) to find calls to dangerous functions, find indirect jumps and calls, and load the IDB into MySQL
  • iterators – wrapper for temu, vine, stp
  • various publishers (for DeviceIOControl etc)

To optimize the SASV experience, Nikita explains, the minimum goal is to get maximum coverage of the dangerous code. The maximum goal is to get maximum coverage of all of the code.

The basic SASV algorithm contains the following steps:

  • First, using IDA plugins, the dangerous places in the app are identified.
  • Using publishers, they invoke the targeted code and start using TEMU to trace.
  • Trace -> appreplay -> IL
  • IL -> change path algo – IL’  (change symbolic execution)
  • wputil -> stp’ code
  • stp
  • repeat
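The loop above (trace, negate a branch condition, ask the solver for a new input, repeat) can be sketched in a few lines of Python. This is only a toy under heavy assumptions: the `target` function, its two branches and the "dangerous" result are made up, and a brute-force search over two-byte inputs stands in for the IL rewriting and the STP solver, which is nothing like how the real tool chain works internally.

```python
from itertools import product


def target(data, trace):
    """Hypothetical program under test; records every branch decision it takes."""
    def branch(branch_id, condition):
        trace.append((branch_id, condition))
        return condition

    if branch("magic", data[0] == 0x7F):
        if branch("len", data[1] > 0xF0):
            return "dangerous"
    return "safe"


def run(data):
    """'Trace' step: execute the target and return its result and branch trace."""
    trace = []
    result = target(data, trace)
    return result, tuple(trace)


def solve(wanted_prefix):
    """Solver stand-in: brute-force an input whose trace starts with wanted_prefix."""
    for candidate in product(range(256), repeat=2):
        _, trace = run(bytes(candidate))
        if trace[:len(wanted_prefix)] == wanted_prefix:
            return bytes(candidate)
    return None


def explore(seed):
    """Flip one branch at a time until no new paths show up."""
    seen_paths, findings, worklist = set(), [], [seed]
    while worklist:
        data = worklist.pop()
        result, trace = run(data)
        if trace in seen_paths:
            continue
        seen_paths.add(trace)
        if result == "dangerous":
            findings.append(data)
        for i, (branch_id, outcome) in enumerate(trace):
            # Keep the path up to branch i, then require the opposite outcome.
            flipped = trace[:i] + ((branch_id, not outcome),)
            new_input = solve(flipped)
            if new_input is not None:
                worklist.append(new_input)
    return findings, seen_paths


findings, paths = explore(b"\x00\x00")
print(findings, len(paths))
```

Starting from an all-zero seed, the loop first flips the "magic" check and then the "len" check, which is how it reaches the code behind both conditions. A random fuzzer would have to guess the magic byte to get that far.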

There are some disadvantages. Defining what counts as a vulnerability is difficult, and things can be very slow, depending on the required functionality and the overhead introduced by hooking functions. On top of that, if you’re tracing big applications, the trace log file might be huge, and appreplay may not even be able to use it.

To enhance the performance of the process, Nikita says, it would probably be a good idea to get rid of the QEMU layer altogether… but it would be a huge task to do so.

Nikita continues by explaining that automated exploit generation would require you to build primitives (within the correct exploitation state) and deal with a lot of exploit mitigations, and that EIP control does not mean you can build a weaponized exploit nowadays. It would require automating the search for memory disclosures as well. :)

Unfortunately, the flow of this talk was a bit slow. With lots of time spent on the BitBlaze components and the intermediate language, the speaker had to rush a bit at the end, which was a pity, because I had the impression the last part had more interesting content than the first part of the presentation.


© 2012, Peter Van Eeckhoutte (corelanc0d3r). All rights reserved.
