All file formats follow the same evolution.
- They start by grouping together some static content, with some nifty features for presenting and editing that data. Think text files, bitmaps, RTF documents… The file format is reasonably easy to understand, and the reader/writer is so simple that it would take a bad programmer to create vulnerabilities.
- Then, they start including plugins that let them handle more and more types of content. This lets you include an image inside an HTML page or an Excel spreadsheet in a Word document, but it relies on many plugins getting things right. Sometimes a plugin contains a security flaw that can then be exploited. For instance, Internet Explorer had an issue with images in PNG format: the user would visit a page, that page would display an image, and the computer would be infected.
- Finally, they need to become interactive, so they include a scripting language of some sort. Excel has a macro system that uses Visual Basic, HTML includes JavaScript…
The PDF format followed the same process to end up where it is now. In addition to static document data (text, vector and raster images) and extended content (Flash animations, videos, reader extension signatures), a PDF also contains short JavaScript scripts that let authors create interactive documents. This means a PDF document on your desktop can:
- Accept user input (such as checkboxes or text fields). The input can be saved to the disk if the reader supports it and allows it (Acrobat Reader, used by the vast majority of computer users, only allows saving a file if its author purchased a reader extensions license and signed the PDF file with it).
- Change its layout at will, for instance displaying a “spouse” page only if the “spouse” checkbox was ticked.
- Be cryptographically signed, and display information about who signed it. This kind of signature is actually accepted as valid legal proof in many countries.
- Compute a scannable bar code from user input, so that it can be printed, then scanned on the other side with reduced error rates.
- Send data over the internet. It can even send itself as an attachment to an email.
Needless to say, with all these features, there are inevitably going to be some exploitable security issues in the mix. A program as popular as Acrobat Reader only attracts more black hat hackers looking for vulnerabilities. One of these vulnerabilities is the recent CVE-2009-4324 from December 2009. There are many types of vulnerabilities, their common feature being that they end up executing arbitrary operations on the computer (as opposed to the safe operations Acrobat Reader normally allows). These operations usually download or install trojans, so that the attacker can gain complete control over the computer.
CVE-2009-4324 is of the use-after-free kind. In short:
- it creates a resource (which uses some memory),
- it frees (destroys) the resource to recycle its memory,
- it writes something to that memory,
- it attempts to use the resource.
Normally, the program should stop at step four and say “you can’t use the resource, it’s been destroyed”. A bug can cause it to believe that the resource is still there. The programmer probably assumed that the memory still contained a valid resource and did not defend against the memory containing something else… and accessing that as if it were a valid resource executes some code that the attacker wanted to execute. Bingo.
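The four steps above can be sketched with a toy model. This is only an illustration in Python, not the actual Acrobat Reader code: the ToyHeap class, its slot-reuse policy and the two functions are all hypothetical names made up for the example. The "memory" is a list of slots, a resource is identified by its slot number, and the deliberate bug is that use() never checks whether the slot was freed in the meantime.

```python
class ToyHeap:
    """Hypothetical memory pool: a freed slot is handed out again right away."""

    def __init__(self):
        self.slots = []       # the simulated memory
        self.free_list = []   # slot numbers available for reuse

    def alloc(self, value):
        # Reuse a freed slot if there is one, otherwise grow the pool.
        if self.free_list:
            index = self.free_list.pop()
            self.slots[index] = value
        else:
            index = len(self.slots)
            self.slots.append(value)
        return index

    def free(self, index):
        # The slot is marked as reusable; nothing invalidates old references.
        self.free_list.append(index)

    def use(self, index):
        # The bug: no check that `index` still refers to a live resource.
        return self.slots[index]()


def legit_player():
    return "playing media"


def attacker_code():
    return "attacker code runs"


heap = ToyHeap()
player = heap.alloc(legit_player)  # step 1: create the resource
heap.free(player)                  # step 2: free it
heap.alloc(attacker_code)          # step 3: write something to that memory
result = heap.use(player)          # step 4: use the stale resource
print(result)                      # prints "attacker code runs"
```

A defensive version of use() would keep track of which slots are live and refuse the stale reference at step four.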
In the case of CVE-2009-4324, this happens as part of the Doc.media.newPlayer method which, for performance reasons, was not completely implemented in JavaScript. A bug in JavaScript code can cause the document to misbehave, but it cannot do anything that the JavaScript couldn’t do on its own. The exploited bug was in the parts written in a lower-level language, which has direct access to the computer.
The bug causes the processor to start executing code at a different memory location. In an ideal hacker world, that location would be precisely where some nasty code is present. Buffer overflows, when used to rewrite pieces of the stack, do allow such deterministic jumps. However, CVE-2009-4324 only allows a jump to an undetermined location.
The hacker solution is to use a heap spray. The basic idea is that you have a short piece of code you want to execute (the payload). You create a block from that payload by adding no-ops (machine instructions that say “skip me”) before the payload. Then, you create lots of these blocks in memory, and trigger the exploit.
The exploit causes the computer to jump to an undetermined memory location. If it falls within the no-op section of any of the blocks you’ve created, you win: the computer skips over the no-ops, reaches the payload and executes it. If not, the program will crash. Too bad…
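To see why creating lots of blocks matters, here is a rough simulation of the idea in Python. It is a sketch under simplifying assumptions, not exploit code: memory is a plain list of labelled cells, the blocks are laid out back to back from address 0, and all names (build_memory, jump_succeeds, winning_addresses) and sizes are made up for the example. A jump wins if it slides over no-ops and reaches the first cell of a payload; anything else counts as a crash.

```python
NOP, PAYLOAD_START, PAYLOAD_BODY, JUNK = "nop", "start", "body", "junk"


def build_memory(size, block_count, sled_length, payload_length):
    """Fill the start of memory with identical blocks: no-ops, then payload."""
    block = ([NOP] * sled_length
             + [PAYLOAD_START]
             + [PAYLOAD_BODY] * (payload_length - 1))
    if block_count * len(block) > size:
        raise ValueError("blocks do not fit in memory")
    memory = [JUNK] * size
    for i in range(block_count):
        start = i * len(block)
        memory[start:start + len(block)] = block
    return memory


def jump_succeeds(memory, address):
    """Simulate the processor jumping to `address`."""
    pc = address
    # No-ops are skipped one after the other.
    while pc < len(memory) and memory[pc] == NOP:
        pc += 1
    # Success only if execution arrives at the beginning of a payload.
    return pc < len(memory) and memory[pc] == PAYLOAD_START


def winning_addresses(memory):
    """Count the addresses from which a jump ends up running the payload."""
    return sum(jump_succeeds(memory, a) for a in range(len(memory)))


few = build_memory(size=1000, block_count=2, sled_length=96, payload_length=4)
many = build_memory(size=1000, block_count=8, sled_length=96, payload_length=4)

print(winning_addresses(few))   # prints 194
print(winning_addresses(many))  # prints 776
```

With two blocks, 194 of the 1000 possible landing spots lead to the payload; with eight blocks, 776 do. The longer the no-op section is compared to the payload, and the more blocks there are, the better the odds that an undetermined jump lands somewhere useful.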
Hi. I'm Victor Nicollet,