THE ART OF COMPUTER VIRUS RESEARCH AND DEFENSE [Electronic resources] نسخه متنی

5.2. Memory-Resident Viruses

A much more efficient class of computer viruses remains in memory after the initialization of virus code. Such viruses typically follow these steps:

The virus gets control of the system.

It allocates a block of memory for its own code.

It relocates its code to the allocated block of memory.

It activates itself in the allocated memory block.

It hooks the execution of the code flow to itself.

It infects new files and/or system areas.

This is the most typical pattern, but several other methods exist that do not require all of the preceding steps. On single-tasking operating systems such as DOS, only a single-user application can run at any one time; any other program code needs to make itself TSR (Terminate-and-Stay-Resident). DOS offers a variety of services in the form of interrupts to develop TSR code.

A typical example of a TSR program on DOS is a clock application to display the time on the screen during the execution of any single program. Because all applications share a single "thread" of execution, any program can easily interfere with any other in more than one way. Indeed, even the code of DOS, some system data structures, device drivers, or interfaces can accidentally be changed by a buggy user application, which can lead to catastrophic system crashes and corruptions.

Here is an anecdote about that. The first version of the Borland Quattro spreadsheet program for DOS was developed in 100% Assembly in Hungary. An interesting situation occurred during the development of the project. Sometimes during the execution of a loop, the control flow took the opposite direction than expected. The code did not explain why that would happen. It turned out that a loaded clock program on the system occasionally flipped the control flow because it modified the direction flag but in some cases forgot to set it back afterward. As a result, the not-intentionally-malicious clock program could easily do harm to the contents of spreadsheets and other programs. Of course, the bogus clock program was a TSR (Terminate-and-Stay-Resident).

The point is that DOS applications are not separated or walled up from each other in any way. Malicious code can take advantage of this kind of system very easily. On standard DOS, the processor is used in a single mode, and therefore any program has the privilege to modify any other program's code in the physical memory, which is addressable up to 1MB long (with some computers capable of accessing an extra high memory area of 64KB above that).

5.2.1. Interrupt Handling and Hooking

DOS programs use DOS and BIOS interrupts for system services. In the past, on microcomputers, programmers typically transferred control to a BIOS-based entry point, so programmers needed to keep such entry points in mind. The interrupt vector table (IVT) simplifies the programmer's task on the IBM PC for several reasons. Using the IVT, programs can refer to functions by their interrupt number and service number. As a result, hard-coded addresses to services need not be compiled into the program code. Instead, the INT x instruction can be used to transfer control to a service via the IVT.

Figure 5.1 illustrates how a typical boot virus such as Brain installs itself into the execution flow by hooking the BIOS disk handler.

Figure 5.1. A typical boot virus hooks INT 13h.

Boot viruses typically hook the INT 13h BIOS disk interrupt handler and start to monitor its functions, wait for diskette access for read and write, and during such operations write their code (or part of it) into the boot sector of the diskettes.

On DOS the IVT is placed at the beginning of physical memory at 0:0. The table holds the segment and offset values of each interrupt, so each entry in the table occupies four bytes. Thus INT 21h's vector can be found at 0:84h in memory. Table 5.1 shows common interrupts and their typical use by computer viruses.

Table 5.1. Typical Interrupts Used by Computer Viruses
INT ID	Function Category	Offset in IVT	Intercepted/Used by Virus Code
INT 00	Divide Error CPU Generated	0:[0]	Anti-Debugging, Anti-Emulation
INT 01	Single Step CPU Generated	0:[4]	Anti-Debugging, Tunneling, EPO
INT 03	Breakpoint CPU Generated	0:[0Ch]	Anti-Debugging, Tracing
INT 04	Overflow CPU Generated	0:[10h]	Anti-Debugging, Anti-Emulation (caused by an INTO instruction)
INT 05	Print Screen BIOS	0:[14h]	Activation routine, Anti-Debugging
INT 06	Invalid Opcode CPU Generated	0:[18h]	Anti-Debugging, Anti-Emulation
INT 08	System Timer CPU Generated	0:[20h]	Activation routine, Anti-Debugging
INT 09	Keyboard BIOS	0:[24h]	Anti-Debugging, Password stealing, Ctrl+Alt+Del handling
INT 0Dh	IRQ 5 HD Disk (XT) Hardware	0:[34h]	Hardware level Stealth on XT
INT 10h	Video BIOS	0:[40h]	Activation routine
INT 12h	Get Memory Size BIOS	0:[48h]	RAM size check
INT 13h	Disk BIOS	0:[4Ch]	Infection, Activation routine, Stealth
INT 19h	Bootstrap Loader BIOS	0:[64h]	Fake rebooting
INT 1Ah	Time BIOS	0:[68h]	Activation routine
INT 1Ch	System Timer Tick BIOS	0:[70h]	Activation routine
INT 20h	Terminate Program DOS Kernel	0:[80h]	Infect on Exit, Terminate Parent
INT 21h	DOS Service DOS Kernel	0:[84h]	Infection, Stealth, Activation routine
INT 23h	Control-Break Handler DOS Kernel	0:[8Ch]	Anti-Debug, Non-Interrupted Infection
INT 24h	Critical Error Handler DOS Kernel	0:[90h]	Avoid DOS errors during Infections (usually hooked temporarily)
INT 25h	DOS Absolute Disk Read (DOS Kernel)	0:[94h]	Disk Infection, Stealth (Gets to INT 13 however)
INT 26h	DOS Absolute Disk Write (DOS Kernel)	0:[98h]	Disk Infection, Stealth (Gets to INT 13 however)
INT 27h	Terminate-and-Stay Resident (DOS Kernel)	0:[9Ch]	Remain in memory
INT 28h	DOS IDLE Interrupt DOS Kernel	0:[A0h]	To perform TSR action while DOS program waits for user input
INT 2Ah	Network Redirector DOS Kernel	0:[A8h]	To infect files without hooking INT 21
INT 2Fh	Multiplex Interrupt Multiple use	0:[BCh]	Infect HMA memory, Access Disk Structures
INT 40h	Diskette Handler BIOS	0:[100h]	Anti-Behavior Blocker
INT 76h	IRQ 14 HD Operation Hardware	0:[1D8h]	Hardware Level Stealth on AT and above

The x86 family of processors has the capability to store 256 different interrupts in the IVT.

Information about the preceding interrupts (and many others) is available in the Ralf Brown Interrupt List, which offers 3,000 pages of further details. Initially, the available information about interrupts was minimal, but The Interrupt List became an essential guide for DOS virus researchers over the years and has increased understanding of undocumented interrupts.

The resident virus technique clearly has a major advantage over direct-action viruses. Resident viruses can easily infect new objects "on the fly" whenever you access them on your system. Furthermore, such viruses also can hide themselves easily using stealth techniques.

In eastern European countries, such as Bulgaria, such information was very hard to get in the pre-Internet days. In fact, programmers in such countries typically reverse-engineered DOS to figure out such details. Not surprisingly, many advanced Bulgarian viruses used such functions as DOS 2+ internal service calls such as "Get List of Lists" (INT 21hAH=52h), for tunneling, leaving wet-behind-the-ears virus researchers to wonder what the virus actually did with them.

5.2.2. Hook Routines on INT 13h (Boot Viruses)

An interrupt is typically used with a set of registers that define subfunction identifiers and pointers to data structures. For instance, the INT 13h takes its subfunctions in AH. To read the disk, someone must set the following registers:

AH = 2
AL = Number of sectors to read
CH = Cylinder
CL = Sector
DH = Head number
DL = Drive number
ES:BX = Pointer to allocated data buffer

The memory must be allocated first, and the disk needs to be reset beforehand. In fact, the diskettes are usually slow and do not spin quickly enough, requiring a couple of extra reads, with disk resets in between. It is nice to know that the hard disk numbers start at 80h (with bit 7 set).

When the interrupt is executed, the return address is pushed to the stack. When the called interrupt handler (or chained handlers) returns with an IRET instruction, it will use the return address from the stack. The interrupt handler also can return with an RETF instruction.

Boot viruses are naturally curious about the AH values passed into INT 13h. By presenting a new interrupt handler in the IVT, the virus code can easily monitor this value with a set of compare (CMP) instructions and take action according to the value.

Typically, boot viruses first save the original INT 13h handler upon execution:


MOV     AX,[004C] ; Offset of INT 13h
MOV     [7C09],AX ; Save it for later use
MOV     AX,[004E] ; Segment of INT 13h
MOV     [7C0B],AX ; Save it for later use

And boot viruses generally allocate memory just below the top 640KB boundary by manipulating the BIOS DATA area at segment 40h and by changing the 40h:13h (0:[413h]) word value that holds the top available memory. When this value is changed, no memory allocation will be possible for any programs above a newly set limit, usually one or a couple of KBs less than the previous value.

Next, the virus copies its code to the "allocated" block and hooks the INT 13h handler. It is interesting to note that viruses such as Stoned hook INT 13h before they relocate their code to the memory, which is set as the new handler. Obviously, the virus expects that no other disk reads can take place during boot time so that the code will not crash.

Hooking a handler is therefore as simple as setting new values in the IVT.


MOV     [004C],AX      ; Set new INT 13h Offset in IVT
MOV     [004E],ES      ; Set new INT 13h Segment in IVT

The new handler of Stoned is shown in Listing 5.1.

Listing 5.1. A New Handler Installed by Stoned


PUSH    DS        ; Save DS to stack
PUSH    AX        ; Save AX to stack
CMP     AH,02     ; Disk Read?
JB      Exit      ; Jump to Exit if Below
CMP     AH,04     ; Disk Verify?
JNB     Exit      ; Jump to Exit if Not Read/Write
OR      DL,DL     ; Diskette A: ?
JNZ     Exit      ; Jump to Exit if Not
XOR     AX,AX     ; Set AX=0
MOV     DS,AX     ; Set DS=0
MOV     AL,[043F] ; Read Diskette Motor Status
TEST    AL,01     ; Is motor on in Drive A:?
JNZ     Exit      ; Jump to Exit if Not
CALL    Infect    ; Attempt infection
Exit:
POP     AX        ; Restore AX from top of stack
POP     DS        ; Restore DS from top of stack
CS:
JMP     FAR [0009]; Jump to Previously Saved Handler

Obviously, it would be unethical to illustrate the virus with more code, but the previous code should give you a good idea of hooking in general. It also shows computer virus research from the perspective of code analysis. In the past, we typically commented code on printed paper, line by line. Eventually, the prints became far too long, and analysis of code appeared to be a 100-meter tournament, so to speak. Fortunately, great tools such as IDA (the Interactive Disassembler) came to the rescue (which will be discussed in Chapter 15, "Malicious Code Analysis Techniques").

5.2.3. Hook Routines on INT 21h (File Viruses)

File viruses typically hook INT 21h on DOS and it is commonly done using the INT 21h sub-functions 35h and 25h, Get and Set Interrupt vectors, respectively. Not all viruses, however, need to change INT 21h's vector in the IVT itself. An example of a virus that does not change the INT 21h vector is Frodo, written in Israel in 1989. Frodo does not hook the INT 21h vector using normal methods. Instead, the virus modifies the real entry point of INT 21h by placing a jump instruction to the entry point of the handler to its own handler.¹, used full file stealth techniques a few months earlier than Frodo, but Frodo made the technique famous.)

By intercepting INT 21h subfunctions, Frodo can hide file changes from DOS programs, even when they read from the file. The virus is sophisticated enough to show the original file content instead.

Let's look at the INT 21h vector on a Frodo-infected DOS system, using DEBUG.


C:\>DEBUG (We enter to DEBUG.)

We dump the INT 21h vector, which holds the value 19:40EB (segment:offset in memory). Even to a trained eye, this value is not suspicious at all and looks normal. This is because memory is typically filled from the lower segments toward the higher ones, and so "segment 19" might be in DOS itself, or even before it, pointing to a low memory segment.


-d 0:84 l4
0000:0080               EB 40 19 00
.@..

Next, we take a look at the handler with the unassembly command from the address we found in the IVT (see Listing 5.2).

Listing 5.2. The Jump (JMP) Instruction to Frodo's Hook Routine


-u19:40eb
0019:40EB EAD502209E    JMP    9E20:02D5  ; Jump to VIRSEG:02d5
0019:40F0 D280FC33      ROL    BYTE PTR [BX+SI+33FC],CL
0019:40F4 7218          JB    410E
0019:40F6 74A2          JZ    409A
0019:40F8 80FC64        CMP    AH,64
0019:40FB 7711          JA    410E
0019:40FD 74B5          JZ    40B4
0019:40FF 80FC51        CMP    AH,51
0019:4102 74A4          JZ    40A8

The preceding code seems strange. Although we can see usual CMP (compare) instructions, the jump instruction at the entry point takes the control flow to 9E20:02D5. This code is patched there by the virus itself to take control in a sophisticated manner.

Finally, we can take a look at a fraction of the entry-point code of Frodo. Another unassembly command reveals the virus code in memory, as shown in Listing 5.3.

Listing 5.3. The Hook Routine Entry of Frodo


-u9e20:02d5
9E20:02D5 55         PUSH    BP      ; Save BP
9E20:02D6 8BEC       MOV     BP,SP   ; Set BP to SP
:
:
9E20:02F3 53         PUSH    BX      ; Save BX
9E20:02F4 BB9002     MOV     BX,0290 ; Set BX to Function Table
9E20:02F7 2E         CS:
9E20:02F8 3A27       CMP     AH,[BX]    ; Is this one hooked?
9E20:02FA 7509       JNZ     0305       ; Check all entries
9E20:02FC 2E         CS:                  ; Found a match
9E20:02FD 8B5F01     MOV     BX,[BX+01] ; offset of hook
9E20:0300 875EEC     XCHG     BX,[BP-14] ; Set the address to
;"return" to
9E20:0303 FC         CLD
9E20:0304 C3         RET              ; Run the hook routine
9E20:0305 83C303     ADD      BX,+03     ; Get Next Entry
9E20:0308 81FBCC02   CMP      BX,02CC    ; Are we at the end?
9E20:030C 72E9       JB       02F7       ; If not, compare

Frodo is very tricky, however. Instead of using a simple switch statement using compare (CMP) instructions, the virus uses a table at offset 290 of the virus segment. The virus transfers control to the subfunctions of INT 21h according to the table. The bold characters in the next table are DOS subfunctions, followed by the offset where the subhandlers are located. Let's dump the memory from virus segment:290.


-d9e20:290
9E20:0290  30 7C 07 23 4E 04 37 8B-0E 4B 8B 05 3C D5 04 3D   0|.#N.7..K..<..=
9E20:02A0  11 05 3E 55 05 0F 9B 03-14 CD 03 21 C1 03 27 BF   ..>U.......!..'.
9E20:02B0  03 11 59 03 12 59 03 4E-9F 04 4F 9F 04 3F A5 0A   ..Y..Y.N..O..?..
9E20:02C0  40 8A 0B 42 90 0A 57 41-0A 48 34 0E 3D 00 4B 75   @..B..WA.H4.=.Ku

².

Table 5.2 shows some of the early, in-the-wild viruses with their common interrupt hook distributions and infection characteristics.

Table 5.2. Common Interrupt Hook Distributions in Early Computer Viruses
Virus Name	Infection Characteristics	Hooked Interrupts
Brain	DBR, Stealth	INT 13h
Stoned	DBR, MBR	INT 13h
Cascade	COM, Encrypted	INT 1Ch, INT 21h
Frodo	COM, EXE, Stealth	INT 1, INT 23h, INT 21h
Tequila	Multipartite: EXE, MBR, Oligomorphic, Stealth	INT 13h, INT 1Ch, INT 21h
Yankee_Doodle	COM, EXE	INT 1, INT 1Ch, INT 21h

5.2.4. Common Memory Installation Techniques Under DOS

In this section you will learn about the most common techniques computer viruses use to install themselves in memory of a DOS machine. Because DOS does not have any memory protection, viruses can easily manipulate any area of memory. Fortunately on a DOS system, viruses cannot effectively hide themselves in memory. This is because physical memory is continuous, and short pieces of virus code can easily be found. This is why memory stealth viruses are unknown on DOS, although a number of techniques exist that attempt to install virus code in unusual places of memory to confuse antivirus products that look for virus patterns only on the code path of certain interrupt handlers or certain areas of memory (up to 640KB, but not beyond that limit).³.

5.2.4.1 Self-Detection Techniques in Memory

A common technique of self-recognition in memory is based on the use of "Are you there?" calls. Boot viruses typically do not use this technique because they only load once during the booting of the system. However, other viruses that infect files need to hook the system only once, so the virus hooks an interrupt or file system and returns specific output for special input registers. The newly executed copies of the virus can check if a previous copy is installed by calling this routine. The memory resident copy answers the call, "Yes, I am here. Do not bother to install again." Table 5.3 contains some examples of this from DOS systems.

Table 5.3. "Are You There?" Call Examples in Early Computer Viruses
Virus Name	"Are you there?" Call	Return Values
Jerusalem	INT 21h AH=E0h	AX=0300h
Flip	INT 21h AX=FE01h	AX=01Feh
Sunday	INT 21h AH=FFh	AX=0400h
Invader	INT 21h AX=4243h	AX=5678h
Nomenklatura	INT 21h AX=4BAAh	Carry Flag is Cleared

On other systems such as Windows, viruses often use ram semaphores, such as a global mutex, that they set during the first time the virus is loaded. This way, the newly loaded copies can simply quit when they are executed.

Windows 95 viruses that hook the file system in kernel mode often have similar installation checks to DOS viruses. In some cases, viruses hook I/O port access and return values on these virtual I/O ports. The W95/SK virus got its name from such an I/O port routine. The virus hooks access to I/O port 0x534B (SK) and returns 0x21 (!) when this I/O port is read. Other viruses might examine the content of the memory at a specific location, check for the existence of a filename that is created as a flag, and so on.

Early antivirus products used these calls to detect viruses in memory. Specific monitor programs were also written to simulate "are you there calls" of viruses. Such solutions tricked viruses into believing that their malicious code was already installed in memory; thus the viruses never loaded again actively on the system. Such methods, however, are not general enough to be useful; only virus variantspecific antivirus tools might use them.

5.2.5. Stealth Viruses

Stealth viruses always intercept a single function or set of functions in such a way that the caller of the function will receive manipulated data from the hook routine of the virus. Therefore, computer virus researchers only call a virus "stealth" if the virus is active in memory and manipulates the data being returned.

Virus writers always attempt to challenge users, virus researchers, and virus scanners. Some techniques, such as antiheuristics and antiemulation, were only invented by virus writers when scanners started to get stronger; however, stealth viruses appeared very quickly.

In fact, one of the first-known viruses on the PC, Brain (a boot virus), was already stealth. Brain showed the original boot sector whenever an infected sector was accessed and the virus was active in memory, hooking the disk interrupt handling. This was in the golden days when Alan Solomon (author of one of the most widely used virus-scanning engines) was challenged to figure out what exactly was going on in Brain-infected systems.

The stealth technique also quickly appeared in DOS file infector viruses. This method was a sure way for a virus to go unnoticed for a relatively long period of time. In fact, in the DOS days, users would remember sizes of system files in an attempt to apply their own integrity checking. By knowing the original size of a file such as COMMAND.COM, the command interpreter was halfway to success in finding an on-going infection.

According to how difficult it was to find a virus in files and what kind of method was used, virus researchers started to describe the techniques differently. The following sections depict the most common stealth techniques: semistealth, read stealth, full stealth, cluster and sector -level stealth, and hardware-level stealth.

5.2.5.1 Semistealth (Directory Stealth)

We call a virus semistealth if it hides the change of file size but the changed content of the infected objects remains visible via regular file access. The first known semistealth virus, called Eddie-2, was written in Bulgarian virus factories⁴.

The semistealth technique requires the following basic attack strategy:

Virus code is installed somewhere in memory.
The virus intercepts file functions such as FindFirstFile or FindNextFile using FCB.
It infects files of a constant size (usually).
It marks infected files with a flag.
When an already infected file is intercepted, the virus reduces the file size in the returned data.

Because such viruses need to determine quickly if a file is already infected, the easiest approach is to set a special marker on the file date/time stamp. One of the most popular methods was first seen in the Vienna viruses (although this virus was a direct-action type and thus did not use the trick in conjunction with stealth). Vienna sets the seconds field of the infected file's time/date stamp to an "impossible" value of 30 (which means "60 seconds") or 31 (which means "62 s0econds"). This is because the MS-DOS time/date stamp is stored as a 32-bit value. The lower 5 bits (0-4) of the time/date stamp store the seconds in "compressed" form. The real seconds are divided by 2. Thus, a stored 2 translates to 4 seconds, and 29 translates to 58. However, 5 bits is enough to claim "60" and "62" seconds, which viruses could use as an infection marker.

Because FindFirstFile and FindNextFile return this information in a data structure, the infection marker is readily available when the hook routine of a stealth virus calls the original function to get proper data about the file. So there is not much overhead to figure if the file is already infected, which is an advantage for the attacker. The data structure is manipulated for the file size, and the false data with reduced file size is returned.

Semistealth is not a very common technique on modern operating systems, such as 32-bit Windows. Nevertheless, the first documented Win32 virus, W32/Cabanas, used semistealth (or so-called directory stealth).

5.2.5.1.1 VxDCallINT21_Dispatch Handler

This technique was introduced by W95/HPS⁵. W95/HPS monitors 714Eh, 714Fh LFN (long file name) FindFirst/FindNext functions, which is mandatory under Windows 95 from the virus's point of view. The actual implementation of the stealth handler is unique. The virus patches the return address of FindFirst/FindNext functions on the fly on the stack to its own handler. This handler checks that the actual program size is divisible by 101 without a remainder, and if so, the virus opens the program with an extended open LFN function and then reads the virus size from the last four bytes of the infected program and subtracts this value as a 32-bit variable from the original return value of FindFirst/FindNext on the stack. Finally, it returns to the caller of the function.

In this way, the virus can hide the file size differences from most applications while the size of the virus body should not be a constant value.

5.2.5.1.2 Hook on Import Address Table (IAT)

This method was introduced by W32/Cabanas and is likely to be reused in new Win32 viruses. The same technique can work under most major Win32 platforms by using the same algorithm. The hook function is based on the manipulation of the IAT. Because the host program holds the addresses of all imported APIs in its .idata section, all the virus has to do is replace those addresses to point to its own API handlers.

First, Cabanas searches the IAT for all the possible function names it wants to hook: GetProcAddress, GetFileAttributesA, GetFileAttributesW, MoveFileExA, MoveFileExW, _lopen, CopyFileA, CopyFileW, OpenFile, MoveFileA, MoveFileW, CreateProcessA, CreateProcessW, CreateFileA, CreateFileW, FindClose, FindFirstFileA, FindFirstFileW, FindNextFileA, FindNextFileW, SetFileAttrA, or SetFileAttrW.

Whenever it finds one, it saves the original address to its own jump table and replaces the .idata section's DWORDs (which holds the original address of the API) with a pointer to its own API handlers.

Consider dump, shown in Listing 5.4, to illustrate hooked GetProcAddress(), FindFirstFileA() functions.

Listing 5.4. Hooking the IAT


.text (CODE)
0041008E E85A370000   CALL 004137ED
004137E7 FF2568004300 JMP [00430068]
004137ED FF256C004300 JMP [0043006C]
004137F3 FF2570004300 JMP [KERNEL32!ExitProcess]
004137F9 FF2574004300 JMP [KERNEL32!GetVersion]
.idata (00430000)
00430068 830DFA77 ;-> 77FA0D83 Entry of new GetProcAddress
0043006C A10DFA77 ;-> 77FA0DA1 Entry of new FindFirstFileA
00430070 6995F177 ;-> 77F19569 Entry of KERNEL32!ExitProcess
00430074 9C3CF177 ;-> 77F13C9C Entry of KERNEL32!GetVersion
NewJMPTable:
77FA0D83 B81E3CF177 MOV EAX,KERNEL32!GetProcAddress ; Original
77FA0D88 E961F6FFFF JMP 77FA03EE ;-> New handler
.
.
77FA0DA1 B8DBC3F077 MOV EAX,KERNEL32!FindFirstFileA ; Original
77FA0DA6 E9F3F6FFFF JMP 77FA049E ;-> New handler

GetProcAddress is used by many Win32 applications to make dynamical, instead of import address table-based ("static") calls. When the host application calls GetProcAddress, the new handler of the virus first calls the original GetProcAddress to get the address of the requested API. Afterward, it checks whether the function is a KERNEL32 API and whether it is one of the APIs that the virus needs to hook. If the virus wants to hook the API, it returns a new API address that will point into the hook table (NewJMPTable). Thus the host application will also get an address to the new handler.

W32/Cabanas is a directory stealth virus: during FindFirstFileA, FindFirstFileW, FindNextFileA, and FindNextFileW, the virus checks for already-infected programs. If a program is not infected, the virus will infect it; otherwise, it hides the file size difference by returning the original size of the host program. Because the cmd.exe (Command Interpreter of Windows NT) uses the preceding APIs during the DIR command, every uninfected file will be infected (if the cmd.exe was infected previously by W32/Cabanas).

5.2.5.2 Read Stealth

The read stealth technique is an attack strategy that is a bit more advanced. Read stealth shows the original content of an infected object using content simulation, usually by intercepting seek and/or read functions only.

In fact, the first stealth viruses, such as Brain, use the read stealth technique. The virus simply intercepts any access to the first sector of diskettes. When the first sector is accessed and it is not infected, the virus infects it and stores the original sector elsewhere on the diskette. When an application attempts to read the infected DBR, the virus reads the originally stored DBR sector and returns that to the caller. As a result, programs accessing "the boot sector" believe that the fake sector is the true one. See Figure 5.2 for an illustration.

Figure 5.2. A read stealth computer virus.

Evidently, read stealth on the diskette is one of the simplest stealth methods. Virus writers have also implemented read stealth in file infectors on DOS. The virus does not need to do muchjust intercept read and seek access in files, returning the simulated content of the file instead. For example, a prepender virus can easily intercept the open request to any infected file. Whenever any application attempts to read the content of the infected file, the virus can easily seek the position where the original file header starts. The caller application will read the host application's content without any hesitation.

5.2.5.3 Read Stealth on Windows

You might wonder what happened to read stealth viruses on Windows systems. Do you happen to know any Windows users who remember the size of any Windows application? Who would pay attention to that in these days, when typical applications are so huge that they hardly fit on a diskette? This appears to be the primary reason why there have been only a few attempts so far to develop stealth viruses on 32-bit Windows systems. The first read stealth virus on Windows 9x, W95/Sma⁶, was discovered in June of 2002, about seven years after the discovery of the first 32-bit virus on Windows. Evidently, development of stealth techniques on Windows did not mature as quickly as on DOS for several reasons.

When I initially attempted to replicate the Sma virus on my test system, it was only a few minutes before I figured that I was "in the Matrix" of the virus. The "Matrix" had me, so seemingly I could not replicate it.

First I believed that I had replicated the virus. I knew that because the size of my goat files changed on the hard disk of my replication machine. I then copied the infected files to a diskette to move them over to my virus research machine. Surprisingly enough, I copied clean files. I repeated the procedure twice before I started to suspect that something was just not right with W95/Sma.

I used my Windows Commander tool on the infected machine to look into the file. Sure enough, there was nothing new in the file. In fact, the file was bigger, but nothing seemed to be appended to it. Then I accessed the file on the diskette one more time. Suddenly, the size of the file changed on the diskette also. I quickly inserted the diskette into my virus research system and saw that W95/Sma was in there. Gotcha!

The virus attempts to set the second field of the infected PE files to 4 to hide its size in such specially marked, infected files. However, there is a minor bug in the virus: It clears the bit that it wants to detect before it compares, so it will always fail to hide the size change. Infected files will appear 4KB longer, but the file seemingly does not have anything more than zeros appended to it.

Whenever an infected PE file is opened that is marked infected, the virus virtualizes the file content. In fact, it hides the changes so well that it is very difficult to see any changes at all. The virus assumes zeros for all the places where unknown data has been placed in the executable, such as the places where the decryptor of the virus would be stored in PE section slack areas. Otherwise, original content is returned for all previously modified fields of PE headers and section headers.

Evidently, if the bug were not in the code, the virus would be totally hidden from the eye. Is it? Yes and no. The virus code remains hidden from regular file _open() _read() functions. Consequently, when someone copies an infected file via such functions, the copy will first be "cleaned" from the virus.

W95/Sma, however, does not hook memory mapping at all. This means that a sequence of memory mapping APIs can reveal the infected file content, so the virus can easily be detected via these routines! This is definitely good news. It is unfortunate that most antivirus software is written using regular C functions for reasons of portability. Such functions, however, are all monitored properly by the virus, and as a result, some of the on-demand scanners can easily miss such infections in files.

Even more interesting is the payload of W95/Sma. The virus listens on a UDP port, and whatever datagram it receives will execute it in kernel mode. This allows the attacker to do practically anything he wants to do on the systemfor example, to burn the FLASH BIOS remotely.

The next section offers information about full-stealth techniques on DOS.

5.2.5.4 Full-Stealth Viruses

Resident, file infector viruses usually hook several DOS functions. (Table 5.4 lists Frodo's hook table.) Frodo hooks several functions that return the size of the file or the content of the file. The virus increments the infected files' date stamp by 100 years, which can be easily accessed later as a virus marker.

Table 5.4. The Function Hook Table of the Full-Stealth Frodo Virus
Sub Function in AH	Function Description
30h	Get DOS version
23h	Get File Size for FCB (File Control Block)
37h	Get/Set AVAILDEV flag
4Bh	Exec Load or Execute Program
3Ch	Create or Truncate File
3Dh	Open Existing File
3Eh	Close Existing File
0Fh	Open File using FCB
14h	Sequential Read from FCB
21h	Read Random Record from FCB file
27h	Random Block Read from FCB file
11h	Find First Matching File using FCB
12h	Find Next Matching File using FCB
4Eh	Find First Matching File
4Fh	Find Next Matching File
3Fh	Read from File
40h	Write to File
42h	Seek to File Position
57h	Get File Time/Date stamp
48h	Allocate Memory

The DOS DIR command only shows the year field in the directory, such as 1/09/89, so you can easily miss it if the file date stamp of an executable is "in the future" at, say, 2089. Frodo can easily manipulate the data that are returned to the caller based on the detection of the extra bit of information. Whenever an application such as a virus scanner or an integrity checker tries to check the size of the file or its content, false data are returned based on the virus marker. The virus will decrement the file size of each file that has a date stamp greater than or equal to the year 2044 by 4096 (the size of the virus). Evidently, this trick only works correctly before 2008. This is because the virus adds 100 years to the date, and the DOS date runs out at 2107, and thus the date wraps around, and Frodo starts to fail. Interestingly, the virus starts to fail even more after 2044 because it can no longer distinguish between infected and clean files anymore. (Jokingly, I can say, that over time, even viruses show signs of getting old, though, of course, this is not a real world concern of the author of the virus.) Otherwise, all files are believed to be infected, and their size is incorrectly reduced by 4,096 bytes, Frodo's file infection size.

5.2.5.5 Cluster and Sector -Level File Stealth

The Bulgarian Number_of_the_Beast virus uses a remarkably advanced stealth technique. The virus infects files, but it hides the changes in them by hooking INT 13h (the BIOS disk handler). It infects the fronts of the files using the classic parasitic technique and will not infect a file if the last cluster occupied by the host does not have at least 512 bytes of free space.

The idea is simple. It is based on the fact that most DOS disks will be formatted with cluster sizes of 2,048 bytes. As a result, this will be the minimum size occupied by a file, even if it is only a couple of bytes long. This means that a cluster, as well as a sector slack space (usually less than 512 bytes), exists in which no content is saved by the system. Number_of_the_Beast uses this space to store the overwritten part of the host program. Even when the virus is not active in memory, the size of the file remains the same as it was before infection because the file size is displayed according to the directory entries.

The virus can easily manipulate the content of files when they are accessed by simply presenting the original content at the front of the file. The virus uses many tricks and undocumented interfaces, making it rather DOS versiondependent.

Viruses less than 512 bytes long evidently could place themselves into individual sector slacks. Thus it is crucial to overwrite the virus-occupied disk areas with some pattern, such as zeros, when disinfecting the disk. Otherwise, antivirus programs can produce an in-memory ghost positive. A ghost positive is a special case when a computer virus code is detected in partial or complete form, but the virus code is not active. This can happen if a virus less than 512 bytes long infects a file in its sector slack, and the disinfector program restores the original file content but does not overwrite the virus body in the slack space. Because the BIOS will read the file according to its size in sectors, the virus code will be loaded in memory in an inactive form. This can trigger the attention of a memory scanner that looks for patterns of virus code in the system memory.

Figure 5.3 shows how a 1,536-byte program occupies disk space with 2,048 byte cluster size. The virus "recycles" the slack space to store original program content in it. After the infection, the DIR command will show 1,536 bytes as the file size, even when the virus is not active in memory. Thus the virus remains directory stealth, but the changes in the file are visible to antivirus programs and integrity checkers at that point.

Figure 5.3. A cluster-level stealth virus.

5.2.5.6 Hardware-Level Stealth

Finally, we arrive at a highly sophisticated technique, the hardware-level stealth, that was used in the Russian boot virus Strange⁷, written in 1993. Strange hooks INT 0Dh, which corresponds to IRQ 5. On an XT system, upon completion of the INT 13 call, the disk buffer is filled with data, and INT 0Dh is generated by the hardware. The Strange virus hooks INT 0Dh and is thereby able to intercept each disk read after the buffer has been filled with the data. In the INT 0Dh routine, the virus uses port 6 to get the pointer to the disk buffer and then checks whether the content of the buffer is infected. If Strange detects itself in the buffer, the virus overwrites the buffer with the original, clean sector content and completes INT 0Dh. In this way, the application that accessed the virus-infected sector will not be able to see any differences, even if the product manages to call into the BIOS INT 13h handler in an attempt to get around common virus handlers.

On AT systems, the virus cannot use INT 0Dh but uses INT 76h instead. In hooking INT 76h, it can monitor which sector is accessed using ports 1F3h1F6h. If the particular sector in which the virus code is stored is accessed, the virus rewrites the port contents with the descriptor of the original sector. This is because INT 76h is generated before the completion of INT 13h interrupt. Unbeknownst to the application that accesses the sector, its read access is forwarded on the fly to the clean, original sector.

Figure 5.4 shows the three steps that hardware viruses use to manipulate INT 76h.

Figure 5.4. A hardware-level stealth virus.

Note

The INT 76h trick does not work if Windows 3.0+ is running.

5.2.6. Disk Cache and System Buffer Infection

A very interesting attack strategy involves infection of files in the operating system's buffers or file system cache organized by the cache manager.

Such viruses are both anti-behavior blocking and antiheuristic. Behavior blockers typically get invoked in system events, such as opening an existing executable file to write and blocking the event to prevent virus infections. This is based on the idea that viruses write to files to replicate to them, so it seems logical that blocking write events to existing binaries would reduce the likelihood of virus infections.

The Darth_Vader virus was the first to inject its code into the DOS kernel itself in such a sophisticated way; the virus does not need to allocate extra memory for itself because it uses a memory cave of the DOS kernel instead. It then modifies the DOS kernel to get control from the operating system itself without ever modifying the interrupt vector table.

In this way, the virus can monitor the system buffer for executable content by checking whenever a new file gets fetched into the system buffer. The virus is able to modify the file in the system buffer itself! Cache and system buffer infections usually are implemented as a cavity infection type. The virus does not need to worry about extending the file to a larger size, which would introduce complications.

This infection strategy was also implemented on Windows 9x systems in the W95/Repus family of viruses by the virus writer who calls himself Super.

The Repus virus uses a unique trick to jump to kernel mode where it is able to call a VxD function to query the content of each file system's cache. If the virus finds a PE file in any of the caches, it writes its code into the header of the cached file and marks the page "dirty." If the file is copied from one location to another, the virus will inject its code into the cached buffers. Thus, the virus does not need to access any executable's content on the disk itself for write. It can simply wait until you copy a file to a new location; the "copy" of the file will be infected. (See the overly simplified drawing in Figure 5.5.)