Some impressions of DynASM

* * * * *

I wanted to test something [1] and, odd as it may seem, the best way to do
this (in my opinion) was to try DynASM [2], the dynamic assembler used by
LuaJIT [3] (it ships as part of LuaJIT, but can be used separately from it).
The official documentation is somewhat lacking, so I've been following a
tutorial [4] (along with the tutorial source code [5]) for my own little
project.

I will not be re-covering that ground here (the tutorial and The Unofficial
DynASM Documentation [6] should be enough to get you through using it if you
are interested in it) but I will give a brief overview and my impressions of
it.

DynASM is used to generate code, specified as assembly, at runtime, not at
compile time. As such, you write the assembly you want generated inline in
your C program, thusly:

> if (token.type == TOKEN_NUMBER)
> | mov ax,token.value
> else if (token.type == TOKEN_VARIABLE)
> | mov ax,[g_vars + token.value]
>

All this code does is generate different code depending on whether the given
token is a number or a variable. The DynASM statements themselves start with a
“|” (which can lead to issues if you aren't expecting it [7]) and in this
case, each one is the actual assembly code we want (more assembly code can be
specified, but it's limited to one assembly statement per line). Once we have
written our program, the C code needs to be run through a preprocessor (the
actual DynASM program itself, which is written in Lua), which replaces each
DynASM statement with a C call that emits the corresponding machine code at
runtime:

> if (token.type == TOKEN_NUMBER)
> //| mov ax,token.value
> dasm_put(Dst, 3, token.value);
> #line 273 "calc.dasc"
> else if (token.type == TOKEN_VARIABLE)
> //| mov ax,[g_vars + token.value]
> dasm_put(Dst, 7, g_vars + token.value);
>
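
To make the dasm_put() calls above a little less magical, here is a toy model
of the idea as I understand it: the preprocessor stores instruction templates
in one array, and each call names an offset into that array plus the runtime
values to splice in. This is a sketch only. The marker values, the ToyState
struct and toy_put() are all made up for illustration and are not DynASM's
real action-list format or API, the offsets (0 and 3) differ from the 3 and 7
in the output above, and the encodings assume 32-bit x86 with eax standing in
for ax.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: NOT DynASM's real action-list format. */
enum { ACT_STOP = 0x100, ACT_IMM32 = 0x101 };

/* Instruction templates (32-bit x86), stored back to back. */
static const uint16_t toy_actions[] = {
    /* offset 0: mov eax, imm32 */
    0xB8, ACT_IMM32, ACT_STOP,
    /* offset 3: mov eax, [disp32] */
    0x8B, 0x05, ACT_IMM32, ACT_STOP,
};

/* Hypothetical stand-in for the state that Dst refers to. */
typedef struct {
    uint8_t buf[64];   /* emitted machine code */
    size_t  len;       /* bytes used so far    */
} ToyState;

/*
 * Copy the template starting at `start` into the buffer, replacing the
 * ACT_IMM32 marker with `arg` in little-endian order.
 * Returns 0 on success, -1 if the buffer is full.
 */
static int toy_put(ToyState *st, size_t start, uint32_t arg)
{
    assert(st != NULL);
    for (size_t i = start; toy_actions[i] != ACT_STOP; i++) {
        if (toy_actions[i] == ACT_IMM32) {
            if (st->len + 4 > sizeof st->buf)
                return -1;
            for (int b = 0; b < 4; b++)
                st->buf[st->len++] = (uint8_t)(arg >> (8 * b));
        } else {
            if (st->len + 1 > sizeof st->buf)
                return -1;
            st->buf[st->len++] = (uint8_t)toy_actions[i];
        }
    }
    return 0;
}
```

In this model, toy_put(&st, 0, 42) appends the five bytes for "mov eax,42"
and toy_put(&st, 3, addr) appends the six bytes for "mov eax,[addr]", which
is roughly the job the two generated dasm_put() calls do for the number and
variable cases.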

The DynASM state data (in this case, Dst) can be specified with other DynASM
directives in the code. It's rather configurable. You then link against the
proper runtime code (there are versions for x86 [8], ARM [9], PowerPC [10] or
MIPS [11]) and add some boilerplate code [12] (this is just an example of
such code) and there you go.
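
For a feel of what that boilerplate has to do at the end, here is a minimal
sketch of the last step on a POSIX system: get writable memory, put the
finished machine code in it, then flip it to executable. With DynASM, my
understanding is that dasm_link() reports the final size and dasm_encode()
writes the code into the buffer; here a plain memcpy() of already-encoded
bytes stands in for that, and make_executable() is a name made up for this
example, not part of DynASM or the linked driver.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Copy `size` bytes of machine code into freshly mapped memory and make
 * that memory read+execute. Returns NULL on failure. The caller releases
 * the mapping with munmap(ptr, size).
 */
static void *make_executable(const unsigned char *code, size_t size)
{
    assert(code != NULL && size > 0);

    /* Start out writable but not executable. */
    void *mem = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;

    /* With DynASM, dasm_encode() would fill the buffer at this point. */
    memcpy(mem, code, size);

    /* Drop write permission and add execute permission. */
    if (mprotect(mem, size, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, size);
        return NULL;
    }
    return mem;
}
```

The returned pointer can then be cast to a function pointer of the right
type and called, provided the bytes really are valid code for the CPU the
program is running on.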

It's an intriguing approach, and the ability to specify normal looking
assembly code is a definite plus. That you have to supply different code for
different CPUs is … annoying but understandable. You can get around some of
this with judicious use of macros and defines, but there's only so much you
can hide when at one extreme you have a CPU with only eight registers and
strict memory ordering [13] and at the other, CPUs with 32 registers and
not-so-strict memory ordering [14]. The other thing that really bites is the
use of the “|” to denote DynASM statements. Yes, it can be worked around, but
why couldn't Mike Pall (the author of LuaJIT) have picked a symbol not used
by C for this, like “@” or “$”? Unfortunately, it is what it is.

Overall, it's rather fun to play with, and it was pretty easy to use, spotty
documentation notwithstanding.

[1] gopher://gopher.conman.org/0Phlog:2015/09/05.2
[2] http://luajit.org/dynasm.html
[3] http://luajit.org/
[4] http://blog.reverberate.org/2012/12/hello-jit-world-joy-of-simple-jits.html
[5] https://github.com/haberman/jitdemo
[6] http://corsix.github.io/dynasm-doc/reference.html
[7] https://github.com/LuaJIT/LuaJIT/issues/73
[8] https://en.wikipedia.org/wiki/X86
[9] https://en.wikipedia.org/wiki/ARM_architecture
[10] https://en.wikipedia.org/wiki/PowerPC
[11] https://en.wikipedia.org/wiki/MIPS
[12] https://github.com/haberman/jitdemo/blob/master/dynasm-driver.c
[13] https://en.wikipedia.org/wiki/Memory_ordering
[14] https://en.wikipedia.org/wiki/Memory_ordering#In_symmetric_multiprocessing_.28SMP.29_microprocessor_systems