THE ART OF COMPUTER VIRUS RESEARCH AND DEFENSE [Electronic resources] نسخه متنی

7.3. Encrypted Viruses

From the very early days, virus writers tried to implement virus code evolution. One of the easiest ways to hide the functionality of the virus code was encryption. The first known virus that implemented encryption was Cascade on DOS⁴. The virus starts with a constant decryptor, which is followed by the encrypted virus body. Consider the example extracted from Cascade.1701 shown in Listing 7.1.

Listing 7.1. The Decryptor of the Cascade Virus


lea     si, Start  ; position to decrypt (dynamically set)
mov     sp, 0682   ; length of encrypted body (1666 bytes)
Decrypt:
xor     [si],si    ; decryption key/counter 1
xor     [si],sp    ; decryption key/counter 2
inc     si         ; increment one counter
dec     sp         ; decrement the other
jnz     Decrypt    ; loop until all bytes are decrypted
Start:             ; Encrypted/Decrypted Virus Body

Note that this decryptor has antidebug features because the SP (stack pointer) register is used as one of the decryption keys. The direction of the decryption loop is always forward; the SI register is incremented by one.

Because the SI register initially points to the start of the encrypted virus body, its initial value depends on the relative position of the virus body in the file. Cascade appends itself to the files, so SI will result in the same value if two host programs have equivalent sizes. However, the SI (decryption key 1) is changed if the host programs have different sizes. The SP register is a simple counter for the number of bytes to decrypt. Note that the decryption is going forward with word (double-byte) key length. The decryption position, however, is moved forward by one byte each time. This complicates the decryption loop, but it does not change its reversibility. Note that simple XOR is very practical for viruses because XORing with the same value twice results in the initial value.

Consider encrypting letter P (0x50) with the key 0x99. You see, 0x50 XOR 0x99 is 0xC9, and 0xC9 XOR 0x99 will return to 0x50. This is why virus writers like simple encryption so muchthey are lazy! They can avoid implementing two different algorithms, one for the encryption and one for the decryption.

Cryptographically speaking, such encryption is weak, though early antivirus programs had little choice but to pick a detection string from the decryptor itself. This led to a number of problems, however. Several different viruses might have the same decryptor, but they might have completely different functionalities. By detecting the virus based on its decryptor, the product is unable to identify the variant or the virus itself. More importantly, nonviruses, such as antidebug wrappers, might have a similar decryptor in front of their code. As a result, the virus that uses the same code to decrypt itself will confuse them.

Such a simple code evolution method also appeared in 32-bit Windows viruses very early. W95/Mad and W95/Zombie use the same technique as Cascade. The only difference is the 32-bit implementation. Consider the decryptor from the top of W95/Mad.2736, shown in Listing 7.2.

Listing 7.2. The Decryptor of the W95/Mad.2736 Virus


mov     edi,00403045h  ;  Set EDI to Start
add     edi,ebp      ; Adjust according to base
mov     ecx,0A6Bh    ; length of encrypted virus body
mov     al,[key]     ; pick the key
Decrypt:
xor     [edi],al     ; decrypt body
inc     edi          ; increment counter position
loop    Decrypt      ; until all bytes are decrypted
jmp     Start        ; Jump to Start (jump over some data)
DB    key     86           ; variable one byte key
Start:                     ; encrypted/decrypted virus body

In fact, this is an even simpler implementation of the simple XOR method. Detection of such viruses is still possible without trying to decrypt the actual virus body. In most cases, the code pattern of the decryptor of these viruses is unique enough for detection. Obviously, such detection is not exact, but the repair code can decrypt the encrypted virus body and easily deal with minor variants.

The attacker can implement some interesting strategies to make encryption and decryption more complicated, further confusing the antivirus program's detection and repair routines:

The direction of the loop can change: forward and backward loops are supported (see all cases in Figure 7.1).
Figure 7.1. Decryption loop examples.
Multiple layers of encryption are used. The first decryptor decrypts the second one, the second decrypts the third, and so on (see ⁵ by Demon Emperor, W32/Harrier⁶ by TechnoRat, { W32, W97M} /Coke by Vecna, and W32/Zelly by ValleZ are examples of viruses that use this method.
Several encryption loops take place one after another, with randomly selected directionsforward and backward loops. This technique scrambles the code the most (see Figure 7.1c.).
There is only one decryption loop, but it uses more than two keys to decrypt each encrypted piece of information on the top of the others. Depending on the implementation of the decryptor, such viruses can be much more difficult to detect. The size of the key especially mattersthe bigger the key size (8, 16, 32 -bit, or more), the longer the brute-force decryption might take if the keys cannot be extracted easily.
The start of decryptor is obfuscated. Some random bytes are padded between the decryptor and the encrypted body and/or the encrypted body and the end of the file.
Nonlinear decryption is used. Some viruses, such as W95/Fono, use a simple nonlinear algorithm with a key table. The virus encryption is based on a substitution table. For instance, the virus might decide to swap the letters A and Z, the letters P and L, and so on. Thus the word APPLE would look like ZLLPE after such encryption.

Because the virus decryption is not linear, the virus body is not decrypted one byte after another. This easily might confuse a junior virus analyst because in some cases, the virus body might not look encrypted at all. Consequently, if a detection string is picked from such a sample, the virus detection will be partial. This technique easily can confuse even advanced detection techniques that use an emulator. Although in normal cases the emulation can continue until linear detection is detected, such as consecutive byte changes in the memory of a virtual machine used by the scanner, a nonlinear algorithm will force the emulation to continue until a hard-to-guess minimum limit.

A variant of the W32/Chiton ("Efish") virus uses a similar approach to Fono's, but Chiton makes sure it always replaces each byte of the virus body with another value using a complete substitution table. In addition, Chiton uses multiple values to correspond to each byte in the code, significantly complicating the decryption.⁷

The attacker can decide not to store the key for encryption anywhere in the virus. Instead, the virus uses brute force to decrypt itself, attempting to recover the encryption keys on its own. Viruses like this are much harder to detect and said to use the RDA (random decryption algorithm) technique. The RDA.Fighter virus is an example that uses this method.⁸ Because the virus carries the key for the decryption, the encryption cannot be considered strong, but the repair of such viruses is painful because the antivirus needs to reimplement the encryption algorithm to deal with it. In addition, the second decryption layer of IDEA virus⁹ uses RDA.¹⁰ and W95/Silcer are examples of this method. These viruses force the Windows Loader to relocate the infected program images when they are loaded to memory. The act of relocating the image is responsible for decrypting the virus body because the virus injects special relocations for the purpose of decryption. The image base of the executable functions as the encryption key.¹¹. Virus researchers cannot easily describe the payload of such virus unless the cipher in the virus is weak. Dmitry Gryaznov managed to reduce the key size necessary to attack the cipher in Cheeba to only 2,150,400 possible keys by using frequency cryptanalysis of the encrypted virus body, assuming that the code under the encryption was written in a similar style as the rest of the virus code¹². This yielded the result, and the magic filename, "users.bbs" was found. This filename belonged to a popular bulletin board software. It is expected that more, so-called "clueless agents"¹³ will appear as computer viruses to disallow the defender to gain knowledge about the intentions of the attacker.¹⁴ pseudo-number generator used by W32/Chiton and W32/Beagle.
The attacker can select several locations to decrypt the encrypted content. The most common methods are shown in Figure 7.2.
Figure 7.2. Possible places of decryption. A) The decryptor decrypts the data at the location of the encrypted virus body. This method is the most common; however, the encrypted data must be writeable in memory, which depends on the actual operating system. B) The decryptor reads the encrypted content and builds the decrypted virus body on the stack. This is very practical for the attacker. The encrypted data does not need to be writeable. C) The virus allocates memory for the decrypted code and data. This can be a serious disadvantage for the attacker because nonencrypted code needs to allocate memory firstbefore the decryptor.

Note

Metamorphic viruses such as Simile circumvent this disadvantage because the code that allocates memory is made variable without providing the ability to pick a search string.

The preceding techniques work very effectively when combined with variable decryptors that keep changing in new generations of the virus. Oligomorphic and polymorphic decryption are discussed in the following sections.

THE ART OF COMPUTER VIRUS RESEARCH AND DEFENSE [Electronic resources] نسخه متنی

فارسی

کردی

العربیه

اردو

Türkçe

Русский

English

Français

کانال فیلم من

تبیان من

فایلهای من

کتابخانه من

پنل پیامکی

وبلاگ من

اینجــــا یک کتابخانه دیجیتالی است

با بیش از 100000 منبع الکترونیکی رایگان به زبان فارسی ، عربی و انگلیسی