7.3. Encrypted VirusesFrom the very early days, virus writers tried to implement virus code evolution. One of the easiest ways to hide the functionality of the virus code was encryption. The first known virus that implemented encryption was Cascade on DOS4. The virus starts with a constant decryptor, which is followed by the encrypted virus body. Consider the example extracted from Cascade.1701 shown in Listing 7.1. Listing 7.1. The Decryptor of the Cascade Viruslea si, Start ; position to decrypt (dynamically set) mov sp, 0682 ; length of encrypted body (1666 bytes) Decrypt: xor [si],si ; decryption key/counter 1 xor [si],sp ; decryption key/counter 2 inc si ; increment one counter dec sp ; decrement the other jnz Decrypt ; loop until all bytes are decrypted Start: ; Encrypted/Decrypted Virus Body Note that this decryptor has antidebug features because the SP (stack pointer) register is used as one of the decryption keys. The direction of the decryption loop is always forward; the SI register is incremented by one. Because the SI register initially points to the start of the encrypted virus body, its initial value depends on the relative position of the virus body in the file. Cascade appends itself to the files, so SI will result in the same value if two host programs have equivalent sizes. However, the SI (decryption key 1) is changed if the host programs have different sizes. The SP register is a simple counter for the number of bytes to decrypt. Note that the decryption is going forward with word (double-byte) key length. The decryption position, however, is moved forward by one byte each time. This complicates the decryption loop, but it does not change its reversibility. Note that simple XOR is very practical for viruses because XORing with the same value twice results in the initial value. Consider encrypting letter P (0x50) with the key 0x99. You see, 0x50 XOR 0x99 is 0xC9, and 0xC9 XOR 0x99 will return to 0x50. This is why virus writers like simple encryption so muchthey are lazy! They can avoid implementing two different algorithms, one for the encryption and one for the decryption. Cryptographically speaking, such encryption is weak, though early antivirus programs had little choice but to pick a detection string from the decryptor itself. This led to a number of problems, however. Several different viruses might have the same decryptor, but they might have completely different functionalities. By detecting the virus based on its decryptor, the product is unable to identify the variant or the virus itself. More importantly, nonviruses, such as antidebug wrappers, might have a similar decryptor in front of their code. As a result, the virus that uses the same code to decrypt itself will confuse them. Such a simple code evolution method also appeared in 32-bit Windows viruses very early. W95/Mad and W95/Zombie use the same technique as Cascade. The only difference is the 32-bit implementation. Consider the decryptor from the top of W95/Mad.2736, shown in Listing 7.2. Listing 7.2. The Decryptor of the W95/Mad.2736 Virusmov edi,00403045h ; Set EDI to Start add edi,ebp ; Adjust according to base mov ecx,0A6Bh ; length of encrypted virus body mov al,[key] ; pick the key Decrypt: xor [edi],al ; decrypt body inc edi ; increment counter position loop Decrypt ; until all bytes are decrypted jmp Start ; Jump to Start (jump over some data) DB key 86 ; variable one byte key Start: ; encrypted/decrypted virus body In fact, this is an even simpler implementation of the simple XOR method. Detection of such viruses is still possible without trying to decrypt the actual virus body. In most cases, the code pattern of the decryptor of these viruses is unique enough for detection. Obviously, such detection is not exact, but the repair code can decrypt the encrypted virus body and easily deal with minor variants. The attacker can implement some interesting strategies to make encryption and decryption more complicated, further confusing the antivirus program's detection and repair routines:
Because the virus decryption is not linear, the virus body is not decrypted one byte after another. This easily might confuse a junior virus analyst because in some cases, the virus body might not look encrypted at all. Consequently, if a detection string is picked from such a sample, the virus detection will be partial. This technique easily can confuse even advanced detection techniques that use an emulator. Although in normal cases the emulation can continue until linear detection is detected, such as consecutive byte changes in the memory of a virtual machine used by the scanner, a nonlinear algorithm will force the emulation to continue until a hard-to-guess minimum limit. A variant of the W32/Chiton ("Efish") virus uses a similar approach to Fono's, but Chiton makes sure it always replaces each byte of the virus body with another value using a complete substitution table. In addition, Chiton uses multiple values to correspond to each byte in the code, significantly complicating the decryption.7
Note Metamorphic viruses such as Simile circumvent this disadvantage because the code that allocates memory is made variable without providing the ability to pick a search string. The preceding techniques work very effectively when combined with variable decryptors that keep changing in new generations of the virus. Oligomorphic and polymorphic decryption are discussed in the following sections. |