THE ART OF COMPUTER VIRUS RESEARCH AND DEFENSE [Electronic resources] نسخه متنی

اینجــــا یک کتابخانه دیجیتالی است

با بیش از 100000 منبع الکترونیکی رایگان به زبان فارسی ، عربی و انگلیسی

THE ART OF COMPUTER VIRUS RESEARCH AND DEFENSE [Electronic resources] - نسخه متنی

Peter Szor

| نمايش فراداده ، افزودن یک نقد و بررسی
افزودن به کتابخانه شخصی
ارسال به دوستان
جستجو در متن کتاب
بیشتر
تنظیمات قلم

فونت

اندازه قلم

+ - پیش فرض

حالت نمایش

روز نیمروز شب
جستجو در لغت نامه
بیشتر
لیست موضوعات
توضیحات
افزودن یادداشت جدید











  • 7.5. Polymorphic Viruses


    The next step in the level of difficulty is a polymorphic attack. Polymorphic viruses can mutate their decryptors to a high number of different instances that can take millions of different forms.

    7.5.1. The 1260 Virus


    The first known polymorphic virus, 1260, was written in the U.S. by Mark Washburn in 199016. This virus has many interesting techniques that were previously predicted by Fred Cohen. The virus uses two sliding keys to decrypt its body, but more importantly, it inserts junk instructions into its decryptor. These instructions are garbage in the code. They have no function other than altering the appearance of the decryptor.

    Virus scanners were challenged by 1260 because simple search strings could no longer be extracted from the code. Although 1260's decryptor is very simple, it can become shorter or longer according to the number of inserted junk instructions and random padding after the decryptor for up to 39 bytes of junk instructions. In addition, each group of instructions (prolog, decryption, and increments) within the decryptor can be permutated in any order. Thus the "skeleton" of the decryptor can change as well. Consider the example of an instance of a decryptor extracted from 1260 (see Listing 7.5).

    Listing 7.5. An Example Decryptor of 1260



    ; Group 1 Prolog Instructions
    inc si ; optional, variable junk
    mov ax,0E9B ; set key 1
    clc ; optional, variable junk
    mov di,012A ; offset of Start
    nop ; optional, variable junk
    mov cx,0571 ; this many bytes - key 2
    ; Group 2 Decryption Instructions
    Decrypt:
    xor [di],cx ; decrypt first word with key 2
    sub bx,dx ; optional, variable junk
    xor bx,cx ; optional, variable junk
    sub bx,ax ; optional, variable junk
    sub bx,cx ; optional, variable junk
    nop ; non-optional junk
    xor dx,cx ; optional, variable junk
    xor [di],ax ; decrypt first word with key 1
    ; Group 3 Decryption Instructions
    inc di ; next byte
    nop ; non-optional junk
    clc ; optional, variable junk
    inc ax ; slide key 1
    ; loop
    loop Decrypt ; until all bytes are decrypted slide key 2
    ; random padding up to 39 bytes
    Start:
    ; Encrypted/decrypted virus body

    In each group of instructions, up to five junk instructions are inserted (INC SI, CLC, NOP, and other do-nothing instructions) with no repetitions allowed in the junk. There are two NOP junk instructions that always appear.

    1260 does not have register replacement, but more complicated polymorphic attacks use that trick. Nonetheless, 1260 is an effective polymorphic engine that generates a high variety of decryptors.

    7.5.2. The Dark Avenger Mutation Engine (MtE)


    The next important development in the history of polymorphic viruses was MtE17, a mutation engine written by the Bulgarian Dark Avenger. The first version MtE was released during the summer of 1991, later followed by another version in early 1992. The idea of the mutation engine is based on modular development. For novice virus writers, it was difficult to write a polymorphic virus. However, more advanced virus writers came to their rescue. The MtE engine was released as an object that could be linked to any simple virus.

    The concept of MtE is to make a function call to the mutation engine function and pass control parameters in predefined registers. The engine takes care of building a polymorphic shell around the simple virus inside it.

    The parameters to the engine include the following:

    • A work segment

    • A pointer to the code to encrypt

    • Length of the virus body

    • Base of the decryptor

    • Entry-point address of the host

    • Target location of encrypted code

    • Size of decryptor (tiny, small, medium, or large)

    • Bit field of registers not to use


    In response, the MtE engine returns a polymorphic decryption routine with an encrypted virus body in the supplied buffer. (See Listing 7.6.)

    Listing 7.6. An Example Decryptor Generated by MtE



    mov bp,A16C ; This Block initializes BP
    ; to "Start"-delta
    mov cl,03 ; (delta is 0x0D2B in this example)
    ror bp,cl
    mov cx,bp
    mov bp,856E
    or bp,740F
    mov si,bp
    mov bp,3B92
    add bp,si
    xor bp,cx
    sub bp,B10C ; Huh ... finally BP is set, but remains an
    ; obfuscated pointer to encrypted body
    Decrypt:
    mov bx,[bp+0D2B] ; pick next word
    ; (first time at "Start")
    add bx,9D64 ; decrypt it
    xchg [bp+0D2B],bx ; put decrypted value to place
    mov bx,8F31 ; this block increments BP by 2
    sub bx,bp
    mov bp,8F33
    sub bp,bx ; and controls the length of decryption
    jnz Decrypt ; are all bytes decrypted?
    Start:
    ; encrypted/decrypted virus body

    This example of MtE illustrates how remarkably complicated this particular engine is. Can you guess how to detect it?

    It makes sense to look at a large set of samples first. The first time, it took me five days before I managed to write a reliable detector for it. MtE could produce some decryptor cases that appeared only in about 5% or less of all cases. However, the engine had a couple of minor limitations that were enough to detect the virus reliably using an instruction size disassembler and a state machine. In fact, there is only one constant byte in an MtE decryptor, the 0x75 (JNZ), which is followed by a negative offsetand even that is placed at a variable location (at the end of the decryptor, whose length is not constant).

    Note

    MtE does not have garbage instructions in the decryptor, as 1260 does. Therefore MtE attacks techniques that attempt to optimize decryptors to defeat polymorphism.

    MtE's impact on antivirus software was clear. Most AV engines had to go through a painful rearchitecting to introduce a virtual machine for the use of the scanning engine. As Frans Veldman used to say, "We simply let the virus do the dirty work."

    MtE was quickly followed by many similar engines, such as TPE (Trident Polymorphic Engine), written by Masouf Khafir in Holland in 1993.

    Today, hundreds of polymorphic engines are known. Most of these engines were only used to create a couple of viruses. After a polymorphic decryptor can be detected, using it becomes a disadvantage to virus writers because any new viruses are covered by the same detection. Such detections, however, usually come with the price of several false positives and false negatives. More reliable techniques detect and recognize the virus body itself.

    This opens up the possibility for virus writers to use the same polymorphic engine in many different viruses successfully unless such viruses are handled with heuristic or generic detection methods.

    7.5.3. 32-Bit Polymorphic Viruses


    W95/HPS and W95/Marburg18 were the first viruses to use real 32-bit polymorphic engines. These two viruses were authored by the infamous Spanish virus writer, GriYo, in 1998. He also created several highly polymorphic viruses earlier on DOS, such as the virus Implant19.

    Just like Implant's polymorphic engine, HPS's polymorphic engine is powerful and advanced. It supports subroutines using CALL/RET instructions and conditional jumps with nonzero displacement. The code of the polymorphic engine takes about half of the actual virus code, and there are random byte-based blocks inserted between the generated code chains of the decryptor. The full decryptor is built only during the first initialization phase, which makes the virus a slow polymorphic. This means that antivirus vendors cannot test their scanner's detection rate efficiently because the infected PC must be rebooted to force the virus to create a new decryptor.

    The decryptor consists of Intel 386based instructions. The virus body is encrypted and decrypted by different methods, including XOR/NOT and INC/DEC/SUB/ADD instructions with 8, 16, or 32 -bit keys, respectively. From a detection point of view, this drastically reduces the range of ideas. I am sad to say that the polymorphic engine was very well written, just like the rest of the virus. It was certainly not created by a beginner.

    Consider the following example of a decryptor, simplified for illustration. The polymorphic decryptor of the virus is placed after the variably encrypted virus body. The decryptor is split between small islands of code routines, which can appear in mixed order. In the example shown in Listing 7.7, the decryptor starts at the Decryptor_Start label, and the decryption continues until the code finally jumps to the decrypted virus body.

    Listing 7.7. An Illustration of a W95/Marburg Decryptor Instance



    Start:
    ; Encrypted/Decrypted Virus body is placed here
    Routine-6:
    dec esi ; decrement loop counter
    ret
    Routine-3:
    mov esi,439FE661h ; set loop counter in ESI
    ret
    Routine-4:
    xor byte ptr [edi],6F ; decrypt with a constant byte
    ret
    Routine-5:
    add edi,0001h ; point to next byte to decrypt
    ret
    Decryptor_Start:
    call Routine-1 ; set EDI to "Start"
    call Routine-3 ; set loop counter
    Decrypt:
    call Routine-4 ; decrypt
    call Routine-5 ; get next
    call Routine-6 ; decrement loop register
    cmp esi,439FD271h ; is everything decrypted?
    jnz Decrypt ; not yet, continue to decrypt
    jmp Start ; jump to decrypted start
    Routine-1:
    call Routine-2 ; Call to POP trick!
    Routine-2:
    pop edi
    sub edi,143Ah ; EDI points to "Start"
    ret

    The preceding decryptor is highly structural, with each differently functioning piece placed in its own routine. The result is millions of possible code patterns filled with random garbage instructions between the islands.

    Polymorphic viruses can create an endless number of new decryptors that use different encryption methods to encrypt the constant part (except their data areas) of the virus body.

    Some polymorphic viruses, such as W32/Coke, use multiple layers of encryption. Furthermore, variants of Coke also can infect Microsoft Word documents in a polymorphic way. Coke mutates its macro directly with the binary code of the virus instead of using macro code directly. Normally, polymorphic macro viruses are very slow because they need a lot of interpretation. Because Coke generates mutated macros with binary code, it does not suffer from slow-down issues and, as a result, is harder to notice. Consider the following example of Coke taken from two instances of mutated AutoClose() macros shown in Listing 7.8.

    Listing 7.8. An Example Snippet of Coke's Polymorphic Macro



    'BsbK
    Sub AuTOclOSE()
    oN ERROr REsuMe NeXT
    SHOWviSuAlBASIcEditOr = faLsE
    If nmnGG > WYff Then
    For XgfqLwDTT = 70 To 5
    JhGPTT = 64
    KjfLL = 34
    If qqSsKWW < vMmm Then
    For QpMM = 56 To 7
    If qtWQHU = PCYKWvQQ Then
    If lXYnNrr > mxTwjWW Then
    End If
    If FFnfrjj > GHgpE Then
    End If

    The second example is a little longer because of the junk. I have highlighted some of the essential instructions in these examples. Notice that even these are not presented in the same order whenever possible. For example, the preceding instance turns off the Visual Basic Editor, so you will no longer see it in Word's menus. However, in Listing 7.9, this happens later, after lots of inserted junk and other essential instructions.

    Listing 7.9. Another Snippet of Coke's Polymorphic Macro



    'fYJm
    Sub AUtOcLOse()
    oN ERRor REsUME NexT
    optIOns.saVenorMALPrOmpT = FAlsE
    DdXLwjjVlQxU$ = "TmDKK"
    NrCyxbahfPtt$ = "fnMM"
    If MKbyqtt > mHba Then
    If JluVV > mkpSS Then
    jMJFFXkTfgMM$ = "DmJcc"
    For VPQjTT = 42 To 4
    If PGNwygui = bMVrr Then
    dJTkQi = 07
    'wcHpsxllwuCC
    End If
    Next VPQjTT
    quYY = 83
    End If
    DsSS = 82
    bFVpp = 60
    End If
    tCQFv=1
    Rem kJPpjNNGQCVpjj
    LyBDXXXGnWW$ = "wPyTdle"
    If cnkCvCww > FupJLQSS Then
    VbBCCcxKWxww$ = "Ybrr"
    End If
    opTiONS.COnFIrmCOnvErsiOnS = faLSe
    Svye = 55
    PgHKfiVXuff$ = "rHKVMdd"
    ShOwVisUALbaSiCEdITOR = fALSe

    Newer polymorphic engines use an RDA-based decryptor that implements a brute-force attack against its constant but variably encrypted virus body in a multiencrypted manner. Manual analysis of such viruses might lead to great surprises. Often there are inefficiencies of randomness in such polymorphic engines, which can be used as a counterattack. Sometimes even a single wildcard string can do the magic of perfect detection.

    Most virus scanners, years ago, already had a code emulator capable of emulating 32-bit PE (portable executable) files. Other virus researchers only implemented dynamic decryption to deal with such viruses. That worked just as in the previous cases because the virus body was still constant under encryption. According to the various AV tests, some vendors were still sorry not to have support for difficult virus infection techniques.

    Virus writers used the combination of entry-point obscuring techniques with 32-bit polymorphism to make the scanner's job even more difficult. In addition, they tried to implement antiemulation techniques to challenge code emulators.

    Nevertheless, all polymorphic viruses carry a constant code virus body. With advanced methods, the body can be decrypted and identified. Consider Figure 7.3 as an illustration.

    Figure 7.3. Instances of encrypted and decrypted polymorphic virus bodies.


    • / 191