* * * * *

    A branchless segment of code to generate a printable hexadecimal value

I was an avid fan of assembly language back in my youth and I did a lot of
it. And in that time, if I needed to convert a 4-bit quantity to a
hexadecimal character, I would write the obvious code:

-----[ Assembly ]-----
       ; x86 code
               add     al,'0'
               cmp     al,'9'  ; if '0'-'9', no adjustment needed
               jbe     skip    ; otherwise, we need to adjust
               add     al,7    ; the resulting character by 7
                               ; to get 'A'-'F'
skip:
-----[ END OF LINE ]-----

Eight bytes and a branch instruction, and not many ways I could see to
improve on that, until the other day when I came across this bit of code:

-----[ Assembly ]-----
       ; x86 code
               add     al,90h
               daa
               adc     al,40h
               daa
-----[ END OF LINE ]-----

Not only does this convert a 4-bit value to a hexadecial character, but it's
two bytes shorter and it's branchless!

Now, some might say this abuses the DAA (Decimal Adjust after Addition)
instruction, but it works. And how it works is pretty clever I think. The DAA
instruction exists to allow BCD (Binary Coded Decimal) [1] arithmetic (back
when it was a thing). For each 4-bits in a byte, the DAA instruction will
check to see if it's in the range of 10 to 15 and if so, add 6 to that 4-bit
value to bring it back into the 0 to 9 range, and propagate a carry bit
(well, it's a bit more involved than that, but that will suffice for this
post—you can check my MC6809 emulator [2] for the gory details of the DAA
instruction). By adding 0x90 (or 144 in decimal) to a 4-bit value then using
DAA, a carry bit will be propagated if the initial value was 10 to 15;
otherwise there's no carry to propagate. The ADC (Add with Carry) of 0x40 (or
64 decimal) will then add any carry of the previous two instructions into the
lower four bits of the result, and the DAA will then adjust the upper 4-bits
to be either 0x3 or 0x4 due to the previous addition of 0x90 (which causes
the number to act like a negative number [3] if the initial value was bewteen
0 and 9). And because of the carry if the initial 4-bit value was between 10
to 15, you get the required adjustment of 7 needed for values of 10 through
15.

This means the result is 0x30 to 0x39 (the ASCII values of “0” to “9”) of the
4-bit values of 0 through 9, or 0x41 to 0x46 (the ASCII values of “A” to “F”)
for values 10 through 15.

Quite ingenious really.

I found reference to what may be the origin of this sequence: the article “A
Design Philosophy for Microcomputer Architectures [4]” from the February 1977
edition of Computer (the code appears on the third page of the article), but
it's unclear if the author came up with this on his own, or it was a known
sequence at the time.

I just wish I found out about it earlier.

[1] https://en.wikipedia.org/wiki/Binary-coded_decimal
[2] https://github.com/spc476/mc6809/blob/14a2690f617d6e2c39948a8390d8a309d5bdcde8/mc6809.c#L450
[3] gopher://gopher.conman.org/0Phlog:2015/06/16.1
[4] https://www.computer.org/csdl/magazine/co/1977/02/01646378/13rRUEgs2vU
---

Discussions about this page

A branchless segment of code to generate a printable hexadecimal value | Lobsters
 https://lobste.rs/s/b1zoek

A branchless segment of code to generate a printable hexadecimal value | Hacker News
 https://news.ycombinator.com/item?id=34934417

Email author at [email protected]