5.2. Memory-Resident VirusesA much more efficient class of computer viruses remains in memory after the initialization of virus code. Such viruses typically follow these steps:
This is the most typical pattern, but several other methods exist that do not require all of the preceding steps. On single-tasking operating systems such as DOS, only a single-user application can run at any one time; any other program code needs to make itself TSR (Terminate-and-Stay-Resident). DOS offers a variety of services in the form of interrupts to develop TSR code.A typical example of a TSR program on DOS is a clock application to display the time on the screen during the execution of any single program. Because all applications share a single "thread" of execution, any program can easily interfere with any other in more than one way. Indeed, even the code of DOS, some system data structures, device drivers, or interfaces can accidentally be changed by a buggy user application, which can lead to catastrophic system crashes and corruptions.Here is an anecdote about that. The first version of the Borland Quattro spreadsheet program for DOS was developed in 100% Assembly in Hungary. An interesting situation occurred during the development of the project. Sometimes during the execution of a loop, the control flow took the opposite direction than expected. The code did not explain why that would happen. It turned out that a loaded clock program on the system occasionally flipped the control flow because it modified the direction flag but in some cases forgot to set it back afterward. As a result, the not-intentionally-malicious clock program could easily do harm to the contents of spreadsheets and other programs. Of course, the bogus clock program was a TSR (Terminate-and-Stay-Resident).The point is that DOS applications are not separated or walled up from each other in any way. Malicious code can take advantage of this kind of system very easily. On standard DOS, the processor is used in a single mode, and therefore any program has the privilege to modify any other program's code in the physical memory, which is addressable up to 1MB long (with some computers capable of accessing an extra high memory area of 64KB above that). 5.2.1. Interrupt Handling and HookingDOS programs use DOS and BIOS interrupts for system services. In the past, on microcomputers, programmers typically transferred control to a BIOS-based entry point, so programmers needed to keep such entry points in mind. The interrupt vector table (IVT) simplifies the programmer's task on the IBM PC for several reasons. Using the IVT, programs can refer to functions by their interrupt number and service number. As a result, hard-coded addresses to services need not be compiled into the program code. Instead, the INT x instruction can be used to transfer control to a service via the IVT.Figure 5.1 illustrates how a typical boot virus such as Brain installs itself into the execution flow by hooking the BIOS disk handler. Figure 5.1. A typical boot virus hooks INT 13h.![]()
5.2.2. Hook Routines on INT 13h (Boot Viruses)An interrupt is typically used with a set of registers that define subfunction identifiers and pointers to data structures. For instance, the INT 13h takes its subfunctions in AH. To read the disk, someone must set the following registers:
The memory must be allocated first, and the disk needs to be reset beforehand. In fact, the diskettes are usually slow and do not spin quickly enough, requiring a couple of extra reads, with disk resets in between. It is nice to know that the hard disk numbers start at 80h (with bit 7 set).When the interrupt is executed, the return address is pushed to the stack. When the called interrupt handler (or chained handlers) returns with an IRET instruction, it will use the return address from the stack. The interrupt handler also can return with an RETF instruction.Boot viruses are naturally curious about the AH values passed into INT 13h. By presenting a new interrupt handler in the IVT, the virus code can easily monitor this value with a set of compare (CMP) instructions and take action according to the value.Typically, boot viruses first save the original INT 13h handler upon execution: And boot viruses generally allocate memory just below the top 640KB boundary by manipulating the BIOS DATA area at segment 40h and by changing the 40h:13h (0:[413h]) word value that holds the top available memory. When this value is changed, no memory allocation will be possible for any programs above a newly set limit, usually one or a couple of KBs less than the previous value.Next, the virus copies its code to the "allocated" block and hooks the INT 13h handler. It is interesting to note that viruses such as Stoned hook INT 13h before they relocate their code to the memory, which is set as the new handler. Obviously, the virus expects that no other disk reads can take place during boot time so that the code will not crash.Hooking a handler is therefore as simple as setting new values in the IVT. The new handler of Stoned is shown in Listing 5.1. Listing 5.1. A New Handler Installed by StonedObviously, it would be unethical to illustrate the virus with more code, but the previous code should give you a good idea of hooking in general. It also shows computer virus research from the perspective of code analysis. In the past, we typically commented code on printed paper, line by line. Eventually, the prints became far too long, and analysis of code appeared to be a 100-meter tournament, so to speak. Fortunately, great tools such as IDA (the Interactive Disassembler) came to the rescue (which will be discussed in Chapter 15, "Malicious Code Analysis Techniques"). 5.2.3. Hook Routines on INT 21h (File Viruses)File viruses typically hook INT 21h on DOS and it is commonly done using the INT 21h sub-functions 35h and 25h, Get and Set Interrupt vectors, respectively. Not all viruses, however, need to change INT 21h's vector in the IVT itself. An example of a virus that does not change the INT 21h vector is Frodo, written in Israel in 1989. Frodo does not hook the INT 21h vector using normal methods. Instead, the virus modifies the real entry point of INT 21h by placing a jump instruction to the entry point of the handler to its own handler.1, used full file stealth techniques a few months earlier than Frodo, but Frodo made the technique famous.)By intercepting INT 21h subfunctions, Frodo can hide file changes from DOS programs, even when they read from the file. The virus is sophisticated enough to show the original file content instead.Let's look at the INT 21h vector on a Frodo-infected DOS system, using DEBUG. We dump the INT 21h vector, which holds the value 19:40EB (segment:offset in memory). Even to a trained eye, this value is not suspicious at all and looks normal. This is because memory is typically filled from the lower segments toward the higher ones, and so "segment 19" might be in DOS itself, or even before it, pointing to a low memory segment. Next, we take a look at the handler with the unassembly command from the address we found in the IVT (see Listing 5.2). Listing 5.2. The Jump (JMP) Instruction to Frodo's Hook RoutineThe preceding code seems strange. Although we can see usual CMP (compare) instructions, the jump instruction at the entry point takes the control flow to 9E20:02D5. This code is patched there by the virus itself to take control in a sophisticated manner.Finally, we can take a look at a fraction of the entry-point code of Frodo. Another unassembly command reveals the virus code in memory, as shown in Listing 5.3. Listing 5.3. The Hook Routine Entry of FrodoFrodo is very tricky, however. Instead of using a simple switch statement using compare (CMP) instructions, the virus uses a table at offset 290 of the virus segment. The virus transfers control to the subfunctions of INT 21h according to the table. The bold characters in the next table are DOS subfunctions, followed by the offset where the subhandlers are located. Let's dump the memory from virus segment:290. 2.Table 5.2 shows some of the early, in-the-wild viruses with their common interrupt hook distributions and infection characteristics.
5.2.4. Common Memory Installation Techniques Under DOSIn this section you will learn about the most common techniques computer viruses use to install themselves in memory of a DOS machine. Because DOS does not have any memory protection, viruses can easily manipulate any area of memory. Fortunately on a DOS system, viruses cannot effectively hide themselves in memory. This is because physical memory is continuous, and short pieces of virus code can easily be found. This is why memory stealth viruses are unknown on DOS, although a number of techniques exist that attempt to install virus code in unusual places of memory to confuse antivirus products that look for virus patterns only on the code path of certain interrupt handlers or certain areas of memory (up to 640KB, but not beyond that limit).3. 5.2.4.1 Self-Detection Techniques in MemoryA common technique of self-recognition in memory is based on the use of "Are you there?" calls. Boot viruses typically do not use this technique because they only load once during the booting of the system. However, other viruses that infect files need to hook the system only once, so the virus hooks an interrupt or file system and returns specific output for special input registers. The newly executed copies of the virus can check if a previous copy is installed by calling this routine. The memory resident copy answers the call, "Yes, I am here. Do not bother to install again." Table 5.3 contains some examples of this from DOS systems.
5.2.5. Stealth VirusesStealth viruses always intercept a single function or set of functions in such a way that the caller of the function will receive manipulated data from the hook routine of the virus. Therefore, computer virus researchers only call a virus "stealth" if the virus is active in memory and manipulates the data being returned.Virus writers always attempt to challenge users, virus researchers, and virus scanners. Some techniques, such as antiheuristics and antiemulation, were only invented by virus writers when scanners started to get stronger; however, stealth viruses appeared very quickly.In fact, one of the first-known viruses on the PC, Brain (a boot virus), was already stealth. Brain showed the original boot sector whenever an infected sector was accessed and the virus was active in memory, hooking the disk interrupt handling. This was in the golden days when Alan Solomon (author of one of the most widely used virus-scanning engines) was challenged to figure out what exactly was going on in Brain-infected systems.The stealth technique also quickly appeared in DOS file infector viruses. This method was a sure way for a virus to go unnoticed for a relatively long period of time. In fact, in the DOS days, users would remember sizes of system files in an attempt to apply their own integrity checking. By knowing the original size of a file such as COMMAND.COM, the command interpreter was halfway to success in finding an on-going infection.According to how difficult it was to find a virus in files and what kind of method was used, virus researchers started to describe the techniques differently. The following sections depict the most common stealth techniques: semistealth, read stealth, full stealth, cluster and sector -level stealth, and hardware-level stealth. 5.2.5.1 Semistealth (Directory Stealth)We call a virus semistealth if it hides the change of file size but the changed content of the infected objects remains visible via regular file access. The first known semistealth virus, called Eddie-2, was written in Bulgarian virus factories4.The semistealth technique requires the following basic attack strategy:
5.2.5.1.1 VxDCallINT21_Dispatch HandlerThis technique was introduced by W95/HPS5. W95/HPS monitors 714Eh, 714Fh LFN (long file name) FindFirst/FindNext functions, which is mandatory under Windows 95 from the virus's point of view. The actual implementation of the stealth handler is unique. The virus patches the return address of FindFirst/FindNext functions on the fly on the stack to its own handler. This handler checks that the actual program size is divisible by 101 without a remainder, and if so, the virus opens the program with an extended open LFN function and then reads the virus size from the last four bytes of the infected program and subtracts this value as a 32-bit variable from the original return value of FindFirst/FindNext on the stack. Finally, it returns to the caller of the function.In this way, the virus can hide the file size differences from most applications while the size of the virus body should not be a constant value. 5.2.5.1.2 Hook on Import Address Table (IAT)This method was introduced by W32/Cabanas and is likely to be reused in new Win32 viruses. The same technique can work under most major Win32 platforms by using the same algorithm. The hook function is based on the manipulation of the IAT. Because the host program holds the addresses of all imported APIs in its .idata section, all the virus has to do is replace those addresses to point to its own API handlers.First, Cabanas searches the IAT for all the possible function names it wants to hook: GetProcAddress, GetFileAttributesA, GetFileAttributesW, MoveFileExA, MoveFileExW, _lopen, CopyFileA, CopyFileW, OpenFile, MoveFileA, MoveFileW, CreateProcessA, CreateProcessW, CreateFileA, CreateFileW, FindClose, FindFirstFileA, FindFirstFileW, FindNextFileA, FindNextFileW, SetFileAttrA, or SetFileAttrW.Whenever it finds one, it saves the original address to its own jump table and replaces the .idata section's DWORDs (which holds the original address of the API) with a pointer to its own API handlers.Consider dump, shown in Listing 5.4, to illustrate hooked GetProcAddress(), FindFirstFileA() functions. Listing 5.4. Hooking the IATGetProcAddress is used by many Win32 applications to make dynamical, instead of import address table-based ("static") calls. When the host application calls GetProcAddress, the new handler of the virus first calls the original GetProcAddress to get the address of the requested API. Afterward, it checks whether the function is a KERNEL32 API and whether it is one of the APIs that the virus needs to hook. If the virus wants to hook the API, it returns a new API address that will point into the hook table (NewJMPTable). Thus the host application will also get an address to the new handler.W32/Cabanas is a directory stealth virus: during FindFirstFileA, FindFirstFileW, FindNextFileA, and FindNextFileW, the virus checks for already-infected programs. If a program is not infected, the virus will infect it; otherwise, it hides the file size difference by returning the original size of the host program. Because the cmd.exe (Command Interpreter of Windows NT) uses the preceding APIs during the DIR command, every uninfected file will be infected (if the cmd.exe was infected previously by W32/Cabanas). 5.2.5.2 Read StealthThe read stealth technique is an attack strategy that is a bit more advanced. Read stealth shows the original content of an infected object using content simulation, usually by intercepting seek and/or read functions only.In fact, the first stealth viruses, such as Brain, use the read stealth technique. The virus simply intercepts any access to the first sector of diskettes. When the first sector is accessed and it is not infected, the virus infects it and stores the original sector elsewhere on the diskette. When an application attempts to read the infected DBR, the virus reads the originally stored DBR sector and returns that to the caller. As a result, programs accessing "the boot sector" believe that the fake sector is the true one. See Figure 5.2 for an illustration. Figure 5.2. A read stealth computer virus.![]() 5.2.5.3 Read Stealth on WindowsYou might wonder what happened to read stealth viruses on Windows systems. Do you happen to know any Windows users who remember the size of any Windows application? Who would pay attention to that in these days, when typical applications are so huge that they hardly fit on a diskette? This appears to be the primary reason why there have been only a few attempts so far to develop stealth viruses on 32-bit Windows systems. The first read stealth virus on Windows 9x, W95/Sma6, was discovered in June of 2002, about seven years after the discovery of the first 32-bit virus on Windows. Evidently, development of stealth techniques on Windows did not mature as quickly as on DOS for several reasons.When I initially attempted to replicate the Sma virus on my test system, it was only a few minutes before I figured that I was "in the Matrix" of the virus. The "Matrix" had me, so seemingly I could not replicate it.First I believed that I had replicated the virus. I knew that because the size of my goat files changed on the hard disk of my replication machine. I then copied the infected files to a diskette to move them over to my virus research machine. Surprisingly enough, I copied clean files. I repeated the procedure twice before I started to suspect that something was just not right with W95/Sma.I used my Windows Commander tool on the infected machine to look into the file. Sure enough, there was nothing new in the file. In fact, the file was bigger, but nothing seemed to be appended to it. Then I accessed the file on the diskette one more time. Suddenly, the size of the file changed on the diskette also. I quickly inserted the diskette into my virus research system and saw that W95/Sma was in there. Gotcha!The virus attempts to set the second field of the infected PE files to 4 to hide its size in such specially marked, infected files. However, there is a minor bug in the virus: It clears the bit that it wants to detect before it compares, so it will always fail to hide the size change. Infected files will appear 4KB longer, but the file seemingly does not have anything more than zeros appended to it.Whenever an infected PE file is opened that is marked infected, the virus virtualizes the file content. In fact, it hides the changes so well that it is very difficult to see any changes at all. The virus assumes zeros for all the places where unknown data has been placed in the executable, such as the places where the decryptor of the virus would be stored in PE section slack areas. Otherwise, original content is returned for all previously modified fields of PE headers and section headers.Evidently, if the bug were not in the code, the virus would be totally hidden from the eye. Is it? Yes and no. The virus code remains hidden from regular file _open() _read() functions. Consequently, when someone copies an infected file via such functions, the copy will first be "cleaned" from the virus.W95/Sma, however, does not hook memory mapping at all. This means that a sequence of memory mapping APIs can reveal the infected file content, so the virus can easily be detected via these routines! This is definitely good news. It is unfortunate that most antivirus software is written using regular C functions for reasons of portability. Such functions, however, are all monitored properly by the virus, and as a result, some of the on-demand scanners can easily miss such infections in files.Even more interesting is the payload of W95/Sma. The virus listens on a UDP port, and whatever datagram it receives will execute it in kernel mode. This allows the attacker to do practically anything he wants to do on the systemfor example, to burn the FLASH BIOS remotely.The next section offers information about full-stealth techniques on DOS. 5.2.5.4 Full-Stealth VirusesResident, file infector viruses usually hook several DOS functions. (Table 5.4 lists Frodo's hook table.) Frodo hooks several functions that return the size of the file or the content of the file. The virus increments the infected files' date stamp by 100 years, which can be easily accessed later as a virus marker.
5.2.5.5 Cluster and Sector -Level File StealthThe Bulgarian Number_of_the_Beast virus uses a remarkably advanced stealth technique. The virus infects files, but it hides the changes in them by hooking INT 13h (the BIOS disk handler). It infects the fronts of the files using the classic parasitic technique and will not infect a file if the last cluster occupied by the host does not have at least 512 bytes of free space.The idea is simple. It is based on the fact that most DOS disks will be formatted with cluster sizes of 2,048 bytes. As a result, this will be the minimum size occupied by a file, even if it is only a couple of bytes long. This means that a cluster, as well as a sector slack space (usually less than 512 bytes), exists in which no content is saved by the system. Number_of_the_Beast uses this space to store the overwritten part of the host program. Even when the virus is not active in memory, the size of the file remains the same as it was before infection because the file size is displayed according to the directory entries.The virus can easily manipulate the content of files when they are accessed by simply presenting the original content at the front of the file. The virus uses many tricks and undocumented interfaces, making it rather DOS versiondependent.Viruses less than 512 bytes long evidently could place themselves into individual sector slacks. Thus it is crucial to overwrite the virus-occupied disk areas with some pattern, such as zeros, when disinfecting the disk. Otherwise, antivirus programs can produce an in-memory ghost positive. A ghost positive is a special case when a computer virus code is detected in partial or complete form, but the virus code is not active. This can happen if a virus less than 512 bytes long infects a file in its sector slack, and the disinfector program restores the original file content but does not overwrite the virus body in the slack space. Because the BIOS will read the file according to its size in sectors, the virus code will be loaded in memory in an inactive form. This can trigger the attention of a memory scanner that looks for patterns of virus code in the system memory.Figure 5.3 shows how a 1,536-byte program occupies disk space with 2,048 byte cluster size. The virus "recycles" the slack space to store original program content in it. After the infection, the DIR command will show 1,536 bytes as the file size, even when the virus is not active in memory. Thus the virus remains directory stealth, but the changes in the file are visible to antivirus programs and integrity checkers at that point. Figure 5.3. A cluster-level stealth virus.![]() 5.2.5.6 Hardware-Level StealthFinally, we arrive at a highly sophisticated technique, the hardware-level stealth, that was used in the Russian boot virus Strange7, written in 1993. Strange hooks INT 0Dh, which corresponds to IRQ 5. On an XT system, upon completion of the INT 13 call, the disk buffer is filled with data, and INT 0Dh is generated by the hardware. The Strange virus hooks INT 0Dh and is thereby able to intercept each disk read after the buffer has been filled with the data. In the INT 0Dh routine, the virus uses port 6 to get the pointer to the disk buffer and then checks whether the content of the buffer is infected. If Strange detects itself in the buffer, the virus overwrites the buffer with the original, clean sector content and completes INT 0Dh. In this way, the application that accessed the virus-infected sector will not be able to see any differences, even if the product manages to call into the BIOS INT 13h handler in an attempt to get around common virus handlers.On AT systems, the virus cannot use INT 0Dh but uses INT 76h instead. In hooking INT 76h, it can monitor which sector is accessed using ports 1F3h1F6h. If the particular sector in which the virus code is stored is accessed, the virus rewrites the port contents with the descriptor of the original sector. This is because INT 76h is generated before the completion of INT 13h interrupt. Unbeknownst to the application that accesses the sector, its read access is forwarded on the fly to the clean, original sector.Figure 5.4 shows the three steps that hardware viruses use to manipulate INT 76h. Figure 5.4. A hardware-level stealth virus.![]() 5.2.6. Disk Cache and System Buffer InfectionA very interesting attack strategy involves infection of files in the operating system's buffers or file system cache organized by the cache manager.Such viruses are both anti-behavior blocking and antiheuristic. Behavior blockers typically get invoked in system events, such as opening an existing executable file to write and blocking the event to prevent virus infections. This is based on the idea that viruses write to files to replicate to them, so it seems logical that blocking write events to existing binaries would reduce the likelihood of virus infections.The Darth_Vader virus was the first to inject its code into the DOS kernel itself in such a sophisticated way; the virus does not need to allocate extra memory for itself because it uses a memory cave of the DOS kernel instead. It then modifies the DOS kernel to get control from the operating system itself without ever modifying the interrupt vector table.In this way, the virus can monitor the system buffer for executable content by checking whenever a new file gets fetched into the system buffer. The virus is able to modify the file in the system buffer itself! Cache and system buffer infections usually are implemented as a cavity infection type. The virus does not need to worry about extending the file to a larger size, which would introduce complications.This infection strategy was also implemented on Windows 9x systems in the W95/Repus family of viruses by the virus writer who calls himself Super.The Repus virus uses a unique trick to jump to kernel mode where it is able to call a VxD function to query the content of each file system's cache. If the virus finds a PE file in any of the caches, it writes its code into the header of the cached file and marks the page "dirty." If the file is copied from one location to another, the virus will inject its code into the cached buffers. Thus, the virus does not need to access any executable's content on the disk itself for write. It can simply wait until you copy a file to a new location; the "copy" of the file will be infected. (See the overly simplified drawing in Figure 5.5.) Figure 5.5. A disk cache infector virus.![]() ![]() |