A quick dip back into assembly with some curious results …

* * * * *

Speaking of assembly [1] …

One of the instructions of the x86 architecture [2] I've been curious about
is ENTER [3]. Oh, I know it's there to support higher-level languages like C
and Pascal that use stack frames for local variables. It even supposedly
supports nested function definitions (à la Pascal), using the second operand
as a kind of “nesting level.”

But I've never seen an actual instance of ENTER used with a “nesting level”
greater than 0. The only form I've ever seen used is

> ENTER n,0
>

Which is equivalent to

> PUSH EBP ; or BP if 16-bit code
> MOV EBP,ESP
> SUB ESP,n
>

(And in fact, GCC (GNU Compiler Collection) [4] generates that sequence, as
it's actually faster than ENTER n,0, and standard C doesn't allow nested
functions to begin with.)

But being curious about what ENTER actually does, I decided to play around
with it. I wrote some simple code:

> bits 32
> global sub0
> extern pmem
>
> section .text
>
> sub0 enter 8,0
> mov eax,0DEADBEEFh
> mov [ebp-4],eax
> mov eax,0CAFEBABEh
> mov [ebp-8],eax
> lea ebx,[ebp+4]
> push dword 0c0000001h
> call sub1
> leave
> ret
>
> sub1 enter 8,1
> mov eax,0DEADBEEFh
> mov [ebp-4],eax
> mov eax,0CAFEBABEh
> mov [ebp-8],eax
> push dword 0c0000002h
> call sub2
> leave
> ret
>
> sub2 enter 8,2
> mov eax,0DEADBEEFh
> mov [ebp-4],eax
> mov eax,0CAFEBABEh
> mov [ebp-8],eax
> push dword 0c0000003h
> call sub3
> leave
> ret
>
> sub3 enter 8,3
> mov eax,0DEADBEEFh
> mov [ebp-4],eax
> mov eax,0CAFEBABEh
> mov [ebp-8],eax
> push dword 0c0000004h
> call sub4
> leave
> ret
>
> sub4 enter 8,4
> mov eax,0DEADBEEFh
> mov [ebp-4],eax
> mov eax,0CAFEBABEh
> mov [ebp-8],eax
> push dword 0
> push dword 0
> push ebx
> push esp
> call pmem
> add esp,16
> leave
> ret
>
>

And the following C code:

> #include <assert.h>
> #include <stdio.h>
> #include <stdlib.h>
>
> extern void sub0(void);
>
> /* Dump the stack one word at a time, from ph down to two words below pl. */
> void pmem(unsigned long *pl,unsigned long *ph)
> {
>   assert(pl < ph);
>
>   while(ph >= pl - 2)
>   {
>     printf("\t%08lX: %08lX\n",(unsigned long)ph,*ph);
>     ph--;
>   }
> }
>
> int main(void)
> {
>   sub0();
>   return EXIT_SUCCESS;
> }
>
>

Nothing horribly complicated here. pmem() just dumps the stack, and the
various sub*() routines create deeper nestings of stack activation records
while creating enough space to store two four-byte values. The results
though?

Curious (comments added by me after the run) …

> BFFFFD1C: 0804853C return addr to main()
> BFFFFD18: BFFFFD20 stack frame sub0
> BFFFFD14: DEADBEEF local0
> BFFFFD10: CAFEBABE local1
> BFFFFD0C: C0000001 marker for calling sub1
> BFFFFD08: 08048591 return addr to sub0
> BFFFFD04: BFFFFD18 stack frame sub1
> BFFFFD00: DEADBEEF local0
> BFFFFCFC: CAFEBABE local1
> BFFFFCF8: 08049708 ?
> BFFFFCF4: C0000002 marker for calling sub2
> BFFFFCF0: 080485B1 return addr to sub1
> BFFFFCEC: BFFFFD04 stack frame sub2
> BFFFFCE8: DEADBEEF local0
> BFFFFCE4: CAFEBABE local1
> BFFFFCE0: 00000002 ?
> BFFFFCDC: 400079D4 ?
> BFFFFCD8: C0000003 marker for calling sub3
> BFFFFCD4: 080485D1 return addr to sub2
> BFFFFCD0: BFFFFCEC stack frame sub3
> BFFFFCCC: DEADBEEF local0
> BFFFFCC8: CAFEBABE local1
> BFFFFCC4: BFFFFCD0 ? sf3
> BFFFFCC0: 4000F000 ?
> BFFFFCBC: 02ADAE54 ?
> BFFFFCB8: C0000004 marker for calling sub4
> BFFFFCB4: 080485F1 return addr to sub3
> BFFFFCB0: BFFFFCD0 stack frame sub4
> BFFFFCAC: DEADBEEF local0
> BFFFFCA8: CAFEBABE local1
> BFFFFCA4: BFFFFCD0 ? sf3
> BFFFFCA0: BFFFFCB0 ? sf4
> BFFFFC9C: 40011FE0 ?
> BFFFFC98: 00000001 ?
> BFFFFC94: 00000000 push dword 0
> BFFFFC90: 00000000 push dword 0
> BFFFFC8C: BFFFFC8C ? supposed to be ebx
> BFFFFC88: BFFFFC8C ? supposed to be esp
> BFFFFC84: 08048618 return addr to sub4
>

From my understanding of what ENTER does, each “level” creates a type of
nested stack activation record with pointers to each previous “level's” stack
record. And while each level has the required number of additional entries,
the actual contents don't make sense.

Running this on a different Linux system produced similarly confusing
results. I'm not sure if ENTER is horribly broken these days (I wonder how
often the instruction is actually used), or perhaps, it is indeed a Linux
problem [5]? Not that I'm going to be using assembly any time soon … I'm just
curious.

[1] gopher://gopher.conman.org/0Phlog:2008/07/08.1
[2] http://en.wikipedia.org/wiki/X86_assembly_language
[3] http://www.cs.ucla.edu/~kohler/class/04f-aos/ref/i386/ENTER.htm
[4] http://gcc.gnu.org/
[5] http://groups.google.co.nz/group/comp.os.linux.development.system/browse_thread/thread/a057249198598933/a4f5251c9ef1e7a2?#a4f5251c9ef1e7a2