Some more useless µbenchmarks checking for integer overflow

* * * * *

Using the INTO instruction to check for overflow was dog slow [1], so what
about using JO (Jump on Overflow)? Will that be slow?

The results speak for themselves (reminder—the expressions are compiled and
run 1,000,000 times):

Table: x = 1 - 0
overflow  method  time         result
--------------------------------------
true      INTO    0.009080000  1
true      JO      0.006808000  1
false     -       0.005938000  1

Table: x = 1 + 1 + 1 + 1 + 1 + 1 * 100 / 13
overflow  method  time         result
--------------------------------------
true      INTO    0.079844000  46
true      JO      0.030274000  46
false     -       0.030245000  46

Even though the code using the JO instruction is longer than either of the
other versions:

> xor eax,eax
> mov ax,0x1
> add ax,1
> jo error
> add ax,1
> jo error
> add ax,1
> jo error
> add ax,1
> jo error
> add ax,1
> jo error
> imul 100
> jo error
> mov bx,13
> cwd
> idiv bx
> jo error
> mov [$0804F58E],ax
> ret
> error: into
> ret
>

it performed about the same as the version with no overflow checking. That's
probably because the JO branches are never taken here, so branch prediction
keeps their overhead very low. One thing to notice, however, is that were a
compiler to go down this path and check explicitly for overflow, not only
would the code be larger, but overall it might be a bit slower than normal, as
there are commonly used optimizations (at least on the x86 architecture) that
could no longer be used. For instance, a cheap way to multiply a value by 5 is
to skip the IMUL instruction and instead do LEA EAX,[EAX*4 + EAX], but LEA
(Load Effective Address) does not set the overflow flag. Doing three INC EAX
in a row is smaller than (and just as fast as) doing ADD EAX,3, but while the
INC (INCrement) instruction does set the overflow flag, you have to check the
flag after each INC or you could miss an actual overflow, which defeats the
purpose of using INC to generate smaller code.
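
For comparison, here is roughly what the same trade-off looks like one level
up, in C. This is only a sketch and not the code that was benchmarked: it
assumes GCC or Clang, whose __builtin_add_overflow() and
__builtin_mul_overflow() extensions do the job of a JO after each operation.
The function names (checked_expr, times5_lea, times5_checked) are made up for
illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Evaluate ((start + 1 + 1 + 1 + 1 + 1) * 100) / 13 in 16-bit arithmetic,
 * testing for overflow after every add and after the multiply, much like
 * the "add ax,1 / jo error" pairs in the listing above.
 * Returns false if any step overflowed (the "error" path). */
bool checked_expr(int16_t start, int16_t *out)
{
    int16_t acc = start;

    for (int i = 0; i < 5; i++) {
        if (__builtin_add_overflow(acc, (int16_t)1, &acc))
            return false;
    }
    if (__builtin_mul_overflow(acc, (int16_t)100, &acc))
        return false;

    /* Dividing a 16-bit value by 13 cannot overflow, so no check here. */
    *out = (int16_t)(acc / 13);
    return true;
}

/* Multiply by 5 the LEA way: x*4 + x. Done in unsigned arithmetic so the
 * wraparound is well defined; like LEA, it reports nothing on overflow.
 * (Converting the result back to int32_t is implementation-defined in C;
 * GCC and Clang wrap modulo 2^32.) */
int32_t times5_lea(int32_t x)
{
    uint32_t u = (uint32_t)x;
    return (int32_t)((u << 2) + u);
}

/* Multiply by 5 with an overflow check, which rules out the LEA trick. */
bool times5_checked(int32_t x, int32_t *out)
{
    return !__builtin_mul_overflow(x, 5, out);
}
```

Calling checked_expr(1, &r) succeeds and leaves 46 in r, matching the table
above, while checked_expr(400, &r) fails at the multiply because 405 * 100
does not fit in 16 bits. times5_lea(INT32_MAX) quietly returns a wrapped
value, where times5_checked(INT32_MAX, &y) reports the overflow.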

And one more thing before I go, and this is about DynASM [2]—it's not stated
anywhere, but if you use local labels, you have to call dasm_setupglobal()
[3] or else the program will crash. I found this out the hard way.

[1] gopher://gopher.conman.org/0Phlog:2015/09/05.2
[2] http://luajit.org/dynasm.html
[3] http://corsix.github.io/dynasm-doc/reference.html#dasm_setupglobal
