40Hex Number 12 Volume 3 Issue 3                                     File 003

               Self Checking Executable Files
                                       Demogorgon  Phalcon/Skism


    In this article I will explain a method that will allow .COM files
to be immune to simple viruses.  In order to infect a .COM file, a virus
must change several bytes at the beginning of the code.  Before the
virus returns control to the original program, it will 'disinfect' it
into memory, so that the program runs as it did before infection.  This
disinfection process is crucial, because it means that the image on the
disk will not be the same as the memory image of the program.  This
article describes a method by which a .COM file can perform a self-check
by reading its disk image and comparing it to its memory image.

    The full pathname of the program that is being executed by DOS is
located in the environment block.  The segment of the environment block
can be read from the PSP.  It is located at offset [2Ch].

    The name of the program is the last entry in the environment block,
and can be located by searching for two zeros.  The next byte after the
two zeros contains the length of the string that follows it.  After the
length is an ASCIIZ string containing the pathname of the current
process.  The following code opens the file being executed:

nish:   mov     es, word ptr ds:[2Ch]   ; segment of environment
       xor     ax, ax
       mov     di, 1
loop_0: dec     di
       scasw
       jne     loop_0

       mov     dx, di
       add     dx, 2                   ; start of pathname
       push    es
       pop     ds
       mov     ax, 3D02h               ; open, read/write access
       int     21h

    Next, we must read in the file (using dos services function 3Fh,
read file or device).  We can read the file into the heap space after
the program, as long as we are sure we will not overwrite the stack. The
sample program in this file reads itself in entirely, but remember, it
is not necessary to do so. It is only necessary to read and compare the
first few bytes.  Also, the program could read itself in blocks instead
of all at once.

    If a file finds itself to be infected, it should report this to the
user.  Remember, even though the file knows it is infected, the virus
has already executed.  Memory resident viruses will already have loaded
themselves into memory, and direct action viruses will already have
infected other files on the drive.  Thus, any virus that employs
disinfection on the fly will be able to avoid detection and removal.
Here is the full source to the self checking program:


;();();();();();();();();();();();();();();();();();();();();()

model tiny
code
org 100h

start:  mov     es, word ptr ds:[2Ch]   ; dos environment block
       xor     ax, ax
       mov     di, 1
loop_0: dec     di
       scasw
       jne     loop_0

       mov     dx, di
       add     dx, 2                   ; <- point to current
       push    es                      ;    process name
       pop     ds
       mov     ah, 3Dh                 ; open file with handle
       int     21h
       jc      bad                     ; error opening file ?
       mov     bx, ax

       push    cs
       push    cs
       pop     es
       pop     ds                      ; I am a com file.

       mov     cx, heap - start        ; length
       lea     dx, heap                ; where to read file into
       mov     ah, 3Fh                 ; read file or device
       int     21h
       jc      bad                     ; error reading file ?

       ; here, do a byte for byte compare
       lea     si, start
       lea     di, heap

       repe    cmpsb                   ; compare 'em
       jne     bad

       lea     dx, clean
       mov     ah, 9
       int     21h
       jmp     quit_

bad:    mov     ah, 9
       lea     dx, infected
       int     21h

quit_:  mov     ax, 4C00h
       int     21h

clean    db 'Self check passed.$'
infected db 'Self check failed.  Program is probably infected.$'

heap:

end start

;();();();();();();();();();();();();();();();();();();();();()


    While some self checking routines opt to use a crc or checksum
error detection method, the byte for byte method is both faster and more
accurate.

    Weak points: This routine will not work against a stealth virus
which employs disinfection on the fly.  Such viruses take over the dos
interrupt (int 21) and disinfect all files that are opened and read
from.  As the routine in this article attempts to read itself into
memory, the stealth virus would disinfect it and write an uninfected
copy to ram.  Of course, there are ways to defeat this.  If this program
were to use some sort of tunneling, it could bypass the stealth virus
and call DOS directly.  That way, infections by even the most
sophisticated viruses would be detectable.


Disinfection:

    So, now you can write programs that will detect if they have been
infected.  How about disinfection?  This too is possible.  Most viruses
simply replace the first three bytes of the executable file with a jump
or a call, which transfers control to the virus code. Since only the
first three bytes are going to be changed (in almost all cases), it will
usually be possible for a program to disinfect itself by replacing the
first three bytes with what is supposed to be there, and then truncating
itself to the correct size.  The next program writes the entire memory
image to disk, rather than just the first three bytes.  That way, it can
be used to disinfect itself from all nonstealth viruses.

    The steps to disinfect are simple.  First of all, you must move the
file pointer back to the beginning of the file.  Use interrupt 21,
ah=42h for this.  The AL register holds the move mode, which must be 00
in this case (move from beginning of file).  CX:DX holds the 32bit
number for how many bytes to move.  Naturally, this should be 0:0.

    The second step is to write back the memory image to the file.
Since the virus has already restored the first few bytes of our program
in memory, we must simply write back to the original file, starting from
100h in the current code segment.  i.e.:

       mov     ah, 40h
       mov     cx, heap - start ; bytes to write
       lea     dx, start
       int     21h              ; write file or device

    Finally, we must truncate the file back to its original size.  To
truncate a file, we must move the file pointer to the end and call the
'write file or device' function with cx, the bytes to write, equal to
zero. To move the pointer, do this:

       mov     ax, 4200h
       mov     cx, (heap - start) SHR 16     ; high word of file ptr
       mov     dx, (heap - start)            ; low word of file ptr
       int     21h                           ; move file pointer


    Since we are dealing with .COM files here, it is safe to assume
that cx, the most significant word of the file ptr, can be set to zero,
because our entire file must fit into one segment.  We do not need to
calculate it as above.

    To truncate:

       xor     cx, cx
       mov     ah, 40h
       int     21h             ; truncate file

    The full code for the self disinfecting program follows.


;();();();();();();();();();();();();();();();();();();();();()

model tiny
code
org 100h

start:  mov     es, word ptr ds:[2Ch]   ; segment of environment
       xor     ax, ax
       mov     di, 1
loop_0: dec     di
       scasw
       jne     loop_0

       mov     dx, di
       add     dx, 2
       push    es
       pop     ds
       mov     ax, 3D02h               ; open, read/write access
       int     21h
       mov     bx, ax                  ; handle into bx
       push    cs
       push    cs
       pop     es
       pop     ds
       mov     cx, heap - start
       lea     dx, heap
       mov     ah, 3Fh                 ; read file or device
       int     21h
       jc      quit_                   ; can't read ?

       lea     si, start
       lea     di, heap

       repe    cmpsb                   ; byte for byte compare
       jne     bad

       lea     dx, clean               ; we are golden
       mov     ah, 9                   ; print string
       int     21h
       jmp     main_program

bad:    mov     ah, 9                   ; we are infected
       lea     dx, infected
       int     21h

       lea     dx, disinfection
       int     21h

       ; now, disinfect.  File handle is still in bx
       ; we must move the file pointer to the beginning
       xor     cx, cx
       xor     dx, dx
       mov     ax, 4200h
       int     21h             ; move file pointer

       mov     ah, 40h         ; 40hex!
       mov     cx, heap - start
       lea     dx, start
       int     21h             ; write file or device
       jnc     success

       lea     dx, not__
       mov     ah, 9
       int     21h
success:mov     ah, 9
       lea     dx, successful
       int     21h

       xor     cx, cx
       mov     ah, 40h         ; 40hex!
       int     21h             ; truncate file

main_program:

quit_:  mov     ax, 4C00h
       int     21h

disinfection  db 0Dh, 0Ah, 'Disinfection $'
not__         db 'not '
successful    db 'successful.$'

clean         db 'Self check passed.$'
infected      db 'Self check failed.  Program is probably '
             db 'infected.$'

heap:

end start

;();();();();();();();();();();();();();();();();();();();();()

Weak points: The same weak points that apply above also apply here.
Additionally, the program may, by writing itself back to disk, give the
virus the opportunity to reinfect.  Remember, any memory resident
viruses will already have loaded into memory by the time the program
disinfects itself.  When the program tries to disinfect itself, any
virus that intercepts the 'write file or device' interrupt will
intercept this write and re-infect.  Again, tunneling is the clear
solution.