���������������������������������������������������������������������������
                          Advanced bait detection

                               by CoKe/VLAD
���������������������������������������������������������������������������


                               Introduction:


Bait files are the files the AV create to get them infected by a virus,
to check out his behaviour, his polymorphism, and last but not least to
debug it. Usually these files are pretty small, and only contain a few
instructions (i.e. Printing a message). As soon as they start infecting
hundreds (maybe thousands) of baits, they'll find out everything about
the virus. So to give them a real hard time doing their jobs, every
virus should in one way or another be able to recognize those files.

There are many possible ways of detecting bait files; most of them are
covered by the article 'Resist!' which you can find in this issue as well.
Since I suppose you don't want to read all that stuff twice, I will just
leave it out of my article.

In this article I will cover another way of detecting bait files:
code-analysis, bait-heuristics to say so, and after checking out tons
of executables, here's what I found out:


���������������������������������������������������������������������������


                             1 - COM Baits
                             �������������

Detecting COM baitfiles is a bit difficult, since you really have to
check out the code (in EXE's you can analyse the header aswell, but
the informations below are still true for them). Since bait files are
usually pretty small, best way to do it is to check the first 2048
(or even 4096) bytes for some specific constructions. Below are
listed a few things that mostly happen in bait files. Of course it's
no 100% detection, but if some of these criteria are fulfilled together,
it's better to make your virus ignore that file.


- Check for massive use of INC/DEC structures (there's no point in
  increasing a register right after decreasing it :))

i.e. : INC AX
       DEC AX
       INC BX
       DEC BX
       etc.


- Check for a huge amount of NOP's (especially after jumps), since in
  "normal" COM files, there are few or no NOP's at all.

i.e. : JMP 110
       NOP
       NOP
       NOP
       NOP
       NOP
       etc.

 Also check for huge amounts of 00h bytes after jumps.


- Check if the first instructions are a
       mov ah, 4Ch
       int 21h sequence (same for ret)
  or if the first instruction jumps to such a routine.


- Check for zero jumps (E90000h, EB00h etc.); again, in "normal" programs,
  usually there are no zero jumps.


- Check for jumps to jumps. Of course this can happen in any COM file, but
  if you have a jump to a jump along with tons of NOP's, you can be sure
  there are wierd things going on.


- Also check for calls to a ret, as this is meaningless code too.


As I said, on their own, none of these criteria are useful. But combined
they give kinda a bait-heuristic, and it depends on you, how you
consider the matches. A file that matches even 3 of these criteria is
suspicious in my eyes.


���������������������������������������������������������������������������

                             2 - EXE Baits
                             �������������


As for EXE baits, the above is still true. In addition to all that, we
have the EXE-header that can help us.

- Most bait files don't have a stack!
  To check this out, just compare CS with SS and SP with IP. If they both
  match there is no stack, which is quite uncommon for EXE's.

                              �����������

                            EXE header structure

���������������������������������������������������������������������������

                  Offset    Description         Size
                  00        Signature           1 word    'MZ' or 'ZM'
                  02        Last Page Size      1 word
                  04        File Pages          1 word
                  06        Items               1 word
                 *08        Header Paras        1 word
                  0A        MinAlloc            1 word
                  0C        MaxAlloc            1 word
                  0E        SS                  1 word<������� Compare CS
                  10        SP                  1 word<������� to SS and
                  12        Negative checksum   1 word         IP to SP!
                 *14        IP                  1 word<�������
                 *16        CS                  1 word<�������
                  18        Reloc table offset  1 word

   ( [*] These are needed to calculate the entry point of the EXE. )
���������������������������������������������������������������������������

        An EXE file with no stack is definitely to be distrusted!

���������������������������������������������������������������������������


 For the further code-analysis, you gotta analyse the code at the
 entry point, and _NOT_ the code after the header! This is a bit more
 complicated than for the COM's, but once you got the entry point
 (file offset of the code at CS/IP), it's the same as for COM files.
 Unfortunately you won't find the entry point anywhere in the header.
 To get it, you must use the following formula:


                 ((CS + Header Para's) * 16) + IP

                 (See EXE header structure above)


Example:
               mov ax, word ptr cs:[exeheader + 16h]   ; This loads the CS
                                                       ; to AX
               add ax, word ptr cs:[exeheader + 08h]   ; Add the header
                                                       ; para's to CS
               mov dx, 10h
               mul dx                                  ; Multiply by 16
               add ax, word ptr cs:[exeheader + 14h]   ; Add the IP


This leaves the entry point in AX. Now all you need to do is to set the
filepointer to the calculated address, and there you go...


                               Final words:


By respecting the ideas mentioned in 'Resist!', and adopting some of these
techniques, you can be pretty sure to give the AV a hard time, unless you
want your latest virus in any AV software 2 days after it was released! :)


���������������������������������������������������������������������������
���������������������������������������������������������������������������