main is usually a function: January 2012

Saturday, January 28, 2012

Writing kernel exploits

Yesterday I gave a talk about writing kernel exploits. I've posted the slides [PDF]. Here is the original description:

Did you know that a NULL pointer can compromise your entire system? Do you know how UNIX pipes, multithreading, and an obscure network protocol from 1981 are combined to take over Linux machines today? OS kernels are full of strange and interesting vulnerabilities, thanks to the subtle nature of systems code. And the kernel's ultimate authority is the ultimate prize for an attacker.

In this talk you will learn how kernel exploits work, with detailed code examples. Compared to userspace, exploiting the kernel requires a whole different bag of tricks, and we'll cover some of the most important ones. We will focus on Linux systems and x86 hardware, though most ideas will generalize. We'll start with a few toy examples, then look at some real, high-profile Linux exploits from the past two years.

You will also see how to protect your own Linux machines against kernel exploits. We'll talk about the continual cat-and-mouse game between system administrators and those who would attack even hardened kernels.

Thanks again to SIPB for giving me a venue to talk about whatever I find interesting.

Thursday, January 19, 2012

Embedding GDB breakpoints in C source code

Have you ever wanted to embed GDB breakpoints in C source code?

int main() {
    printf("Hello,\n");
    EMBED_BREAKPOINT;
    printf("world!\n");
    EMBED_BREAKPOINT;
    return 0;
}

One way is to directly insert your CPU's breakpoint instruction. On x86:

#define EMBED_BREAKPOINT  asm volatile ("int3;")

There are at least two problems with this approach:

They aren't real GDB breakpoints. You can't disable them, count how many times they've been hit, etc.
If you run the program outside GDB, the breakpoint instruction will crash your process.

Here is a small hack which solves both problems:

#define EMBED_BREAKPOINT \
    asm("0:"                              \
        ".pushsection embed-breakpoints;" \
        ".quad 0b;"                       \
        ".popsection;")

We place a local label into the instruction stream, and then save its address in the embed-breakpoints linker section.

Then we need to convert these addresses into GDB breakpoint commands. I wrote a tool that does this, as a wrapper for the gdb command. Here's how it works, on our initial example:

$ gcc -g -o example example.c

$ ./gdb-with-breakpoints ./example
Reading symbols from example...done.
Breakpoint 1 at 0x4004f2: file example.c, line 8.
Breakpoint 2 at 0x4004fc: file example.c, line 10.
(gdb) run
Starting program: example 
Hello,

Breakpoint 1, main () at example.c:8
8           printf("world!\n");
(gdb) info breakpoints
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x00000000004004f2 in main at example.c:8
        breakpoint already hit 1 time
2       breakpoint     keep y   0x00000000004004fc in main at example.c:10
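
The conversion step can be sketched in a few lines of C. This is only an illustration, not the wrapper's actual code (which uses BFD): it assumes the raw contents of the embed-breakpoints section have already been read into a buffer, and that each entry is a little-endian 64-bit address as written by .quad on x86-64. The names read_le64 and emit_break_commands are hypothetical.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: decode one little-endian 64-bit value, the
 * layout ".quad" produces on x86. */
uint64_t read_le64(const unsigned char *p) {
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Hypothetical helper: write one GDB "break *ADDR" command per 8-byte
 * entry in the section contents. Returns the number of commands
 * written, or -1 if the length is not a multiple of 8 or a write fails. */
int emit_break_commands(const unsigned char *sec, size_t len, FILE *out) {
    if (len % 8 != 0)
        return -1;
    int count = 0;
    for (size_t off = 0; off < len; off += 8) {
        if (fprintf(out, "break *0x%" PRIx64 "\n", read_le64(sec + off)) < 0)
            return -1;
        count++;
    }
    return count;
}
```

Fed the two addresses from the session above, this would write "break *0x4004f2" and "break *0x4004fc", which GDB resolves back to source lines using the debug info.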

If we run the program normally, or in GDB without the wrapper, the EMBED_BREAKPOINT statements do nothing. The breakpoint addresses aren't even loaded into memory, because the embed-breakpoints section is not marked as allocatable.
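
One way to check the "not allocatable" claim is to look for the SHF_ALLOC flag in the section header. The sketch below is an assumption-laden illustration, not part of the tool: it handles only ELF64 files in the host's byte order, uses the Linux <elf.h> header, and the name section_is_allocated is hypothetical.

```c
#include <elf.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if section `name` in the ELF64 file at
 * `path` has SHF_ALLOC set, 0 if it does not, and -1 if the file cannot
 * be parsed or has no such section. */
int section_is_allocated(const char *path, const char *name) {
    int result = -1;
    Elf64_Ehdr eh;
    Elf64_Shdr *sh = NULL;
    char *names = NULL;
    size_t nlen = 0;

    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    /* Read and sanity-check the ELF header. */
    if (fread(&eh, sizeof eh, 1, f) != 1) goto done;
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0) goto done;
    if (eh.e_ident[EI_CLASS] != ELFCLASS64) goto done;
    if (eh.e_shentsize != sizeof(Elf64_Shdr)) goto done;
    if (eh.e_shnum == 0 || eh.e_shstrndx >= eh.e_shnum) goto done;

    /* Load the whole section header table. */
    sh = malloc((size_t)eh.e_shnum * sizeof *sh);
    if (!sh) goto done;
    if (fseek(f, (long)eh.e_shoff, SEEK_SET) != 0) goto done;
    if (fread(sh, sizeof *sh, eh.e_shnum, f) != eh.e_shnum) goto done;

    /* Load the section name string table. */
    nlen = (size_t)sh[eh.e_shstrndx].sh_size;
    names = malloc(nlen + 1);
    if (!names) goto done;
    if (fseek(f, (long)sh[eh.e_shstrndx].sh_offset, SEEK_SET) != 0) goto done;
    if (fread(names, 1, nlen, f) != nlen) goto done;
    names[nlen] = '\0';

    /* Find the section by name and report its SHF_ALLOC flag. */
    for (size_t i = 0; i < eh.e_shnum; i++) {
        if (sh[i].sh_name < nlen &&
            strcmp(names + sh[i].sh_name, name) == 0) {
            result = (sh[i].sh_flags & SHF_ALLOC) ? 1 : 0;
            break;
        }
    }

done:
    free(names);
    free(sh);
    fclose(f);
    return result;
}
```

If the description above holds, calling section_is_allocated("./example", "embed-breakpoints") on a 64-bit build should return 0, while a section such as .text returns 1.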

You can find all of the code on GitHub under a BSD license. I've done only minimal testing, but I hope it will be a useful debugging tool for someone. Let me know if you find any bugs or improvements. You can comment here, or find my email address on GitHub.

I'm not sure about the decision to write the GDB wrapper in C using BFD. I also considered Haskell and elf, or Python and the new pyelftools. One can probably do something nicer using the GDB Python API, which was added a few years ago.

This code depends on a GNU toolchain: it uses GNU C extensions, GNU assembler syntax, and BFD. The GDB wrapper uses the Linux proc filesystem, so that it can pass to GDB a temporary file which has already been unlinked. You could port it to other UNIX systems by changing the tempfile handling. It should work on a variety of CPU architectures, but I've only tested it on 32- and 64-bit x86.
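
The unlinked-tempfile trick might look roughly like the following. This is a minimal sketch of the general technique on Linux, not the wrapper's real code; the function name make_unlinked_script and the /tmp template are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write `script` to a temp file that is unlinked
 * right away, then store in `path` a /proc name that still reaches it.
 * Returns the open descriptor, or -1 on failure. Linux-only, since it
 * relies on /proc/self/fd. */
int make_unlinked_script(const char *script, char *path, size_t path_len) {
    char tmpl[] = "/tmp/embed-bp-XXXXXX";
    int fd = mkstemp(tmpl);
    if (fd < 0)
        return -1;

    /* Remove the name now; the data survives while fd stays open, and
     * nothing is left behind if the process dies. */
    unlink(tmpl);

    size_t len = strlen(script);
    if (write(fd, script, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }

    int n = snprintf(path, path_len, "/proc/self/fd/%d", fd);
    if (n < 0 || (size_t)n >= path_len) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A wrapper could then exec gdb with "-x" followed by that path. The descriptor is inherited across exec, so /proc/self/fd/N still names the same file inside the gdb process. Porting to a system without /proc would mean keeping a named file around and deleting it afterwards.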

Monday, January 9, 2012

Zombie 6.001 starts tomorrow!

The student-run revival of MIT's famous intro CS class starts tomorrow! 6.001 and its text, SICP, had a singular influence on how introductory computer science is taught (not to be confused with intro to programming, worthwhile though that subject may be). After the unfortunate demise of 6.001 at MIT, some former TAs reanimated the class as an intense four-week experience. As their description says:

Zombie-like, 6.001 rises from the dead to threaten students again. Unlike a zombie, though, it's moving quite a bit faster than it did the first time. Like the original, don't walk into the class expecting that it will teach you Scheme; instead, it attempts to teach thought patterns for computer science, and the structure and interpretation of computer programs. Three projects will be assigned and graded. Prereq: some programming experience; high confusion threshold.

I'm helping teach it this year, and it should be a lot of fun. You can follow along online or, if you're in the area, come to lectures on Tuesdays and Thursdays, 19:00 to 21:00 in 32-044 (that's MIT building 32, room 044).
