Title: Altair Assembler Part 4
Date: November 29 2020
Tags: altair programming
========================================

After completing the hand assembled assembler, the next step was to convert its
code into assembly source that the assembler could assemble for me.  The
process revealed some bugs and missing features.


# Testing #

The first thing I actually did was to stream a test program that just contained
every opcode and a variety of arguments to make sure they got assembled to the
correct value and called the correct callback subroutine to process the
arguments required.  This revealed one error in the opcode table where I
incorrectly converted the ASCII values for the "RPO" opcode string (the very
last one in the test).  The O ended up as 121Q, which is a Q.  It should be 117Q.
I think I was trying to be smart and do the math.  O is next to P, add one.
Uh...nope.  <i>Subtract</i> 1 (octal 1, of course).

               '\0'    000
-               'O'     121
+               'O'     117
               'P'     120
               'R'     122

*The upside down string is explained below.
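
The arithmetic is easy to double check on the laptop.  This is just a quick
Python sketch for checking the table by hand, not part of the assembler; the
`octal_code` helper is a name made up for the example.

```python
# Octal ASCII codes for the letters around the "RPO" mistake.
def octal_code(ch):
    """Return the ASCII code of ch as a three-digit octal string."""
    return format(ord(ch), "03o")

for ch in "OPQR":
    # O is one *below* P, so it is 117Q, not 121Q.
    print(ch, octal_code(ch) + "Q")
```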

This process also revealed some issues with my plan to stream assembly code from
my laptop as input to the assembler.  All the assembly happens at the end of a
line.  There is a delay while the work gets done where characters from the next
line are lost.  I had expected RTS/CTS to negotiate pauses in data transfer but
that wasn't happening.  After some research, I realized that back in the day,
RTS/CTS was unidirectional.  The Altair's 2SIO Serial IO board uses the Motorola
6850 ACIA (a UART) which, I guess, expects to be talking to a 150 or 300 baud
teletype.  It assumes it'll be faster than the device on the other end and will
wait to send data, but not tell the other device to stop sending data.  So I had
to add a pause of more than 10,000 microseconds (10 ms) between lines.  More was
needed if there was a label which causes the assembler to search and/or write to
the symbol table.  I made the number generous so it would work for longer
programs with a large symbol table.

But it turned out that a delay at the end of a line wasn't good enough.  As the
symbol table grows, the processing time gets longer, to the point of needing
close to a one-second pause between some lines.  Having to wait that long for every
line, even if they don't need it, would make the transfer process unbearably
slow.  I changed tactics a bit and leveraged the prompt which was there for
human use.  Changing the trailing space to a newline gave me a line to expect in
a script.  So now I send a line, pause briefly, then read the output back
until I see the "<" prompt on a line by itself.  The prompt doesn't get
presented until all processing of the line is complete so I know I can fire off
the next line.  Basically, this is a very manual software handshaking process.
More typical would be to use XON/XOFF to do the same thing.  It moves pretty
quickly and assembles the assembler in a reasonable time.  Quick enough that I
haven't bothered to measure it.
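
The send-and-wait loop described above can be sketched roughly like this in
Python.  It is not my actual script, just a minimal version of the idea.  The
`send_line` name, the carriage return line ending, and the `settle` delay are
all assumptions made for the example; only the "<" prompt on a line by itself
comes from the assembler.

```python
import time

PROMPT = b"<"  # the assembler's prompt, printed on a line by itself


def send_line(port, line, settle=0.01):
    """Send one source line, then read until the prompt shows up.

    `port` is any object with write(bytes) and readline() -> bytes, where
    readline() returns b"" on timeout.  Returns the lines seen before the
    prompt (echo, error messages) so they are not thrown away.
    """
    port.write(line.encode("ascii") + b"\r")  # line ending is an assumption
    time.sleep(settle)                        # brief pause before reading
    output = []
    while True:
        reply = port.readline()
        if not reply:
            # Timed out without seeing the prompt: give up instead of hanging.
            raise TimeoutError("no prompt after: " + line)
        if reply.strip() == PROMPT:
            return output                     # safe to send the next line
        output.append(reply.rstrip(b"\r\n").decode("ascii", "replace"))
```

A `serial.Serial` object from pyserial opened with a timeout should fit the
`port` role, since it provides `write()` and `readline()`.  Returning the
collected output means error messages from the assembler can still be printed.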

The next bug I found was in the subroutine that runs at the end of assembly to
resolve label references to the addresses of labels set later in the code.  I
have to search the symbol table for a label and copy the label name string to
search for references to that label.  I was reading each character, checking for
null (end of string), and if it matched, continuing on.  What I forgot was to
write that null to the buffer.  I was getting mismatches when a short label was
written over a longer one that left extra characters which never matched any
defined labels.  The fix was just to write the character before checking if it
was null instead of after.

     ;next char
     010 075        MOV     A,M     176     ; get char
    -010 076        CPI             376     ; check for null
    -010 077                000Q    000
    -010 100        JZ              312     ; done get label address
    -010 101                111Q    111
    -010 102                010Q    010
    -010 103        STAX    DE      022     ; store char
    +010 076        STAX    DE      022     ; store char
    +010 077        CPI             376     ; check for null
    +010 100                000Q    000
    +010 101        JZ              312     ; done get label address
    +010 102                111Q    111
    +010 103                010Q    010
     010 104        DCX     HL      053     ; decrement struct pointer
     010 105        INX     DE      023     ; increment fieldbuf pointer
     010 106        JMP             303     ; next char
     010 107                075Q    075
     010 110                010Q    010

All the diff above shows is one line of code moving, but because that shifts
the addresses, it creates a bigger diff.  Diffing hand assembled code with the
usual line based tools is almost impossible.
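
The effect of the ordering is easier to see in a small simulation.  This is a
Python sketch of the copy loop, not the 8080 code itself.  It uses one upward
index for both source and buffer, where the real routine walks the symbol
table downward, and the `copy_label` name and `store_before_check` flag are
invented for the example.

```python
def copy_label(src, buf, store_before_check):
    """Copy a null-terminated label from src into buf (a bytearray)."""
    i = 0
    while True:
        ch = src[i]                  # MOV A,M
        if store_before_check:
            buf[i] = ch              # fixed order: the null gets stored too
        if ch == 0:                  # CPI 000Q / JZ done
            return bytes(buf)
        if not store_before_check:
            buf[i] = ch              # original order: the null is never written
        i += 1


# Copy a long label, then a short one over it, with the original ordering.
buf = bytearray(8)
copy_label(b"LONGER\0", buf, store_before_check=False)
print(copy_label(b"AB\0", buf, store_before_check=False))
```

With the original ordering, the buffer still holds the tail of the longer
label after the second copy, so the search compares against a label that was
never defined.  Storing before the check terminates the short label properly.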


# Code Conversion #

Once I started converting the code, I ran into a big missing feature.  I had cut
a corner with the DW pseudo-opcode.  DW writes 2 bytes into memory and I only
allowed for an argument that was an EQU or SET, which were limited to only a
single byte, or a word expressed as literal numbers.  Address handling was a
different code path so I didn't include it.  The problem there is, of course,
that I need to write addresses to build lookup tables.  I use lookup tables for
error messages and for the opcodes which have callback addresses in the data
structure.  I had to rewrite the DW callback subroutine and now it supports all
manner of arguments including a label.  I need to go back and also allow words
as arguments to EQU and SET.

Because I was now using DW to create the callback part of the opcode structure,
I had to reverse the bytes from what I had hand assembled previously.  Because I
write the symbol table downwards in memory, I already had a string matching
subroutine that worked on its reversed strings, so I also reversed the opcode
table in memory in order to reuse that subroutine.  But DW doesn't know what
direction I'll be reading from; it writes the low byte, increments the memory
pointer, then writes the high byte.  A simple high/low flip when reading the
callback address put the bytes back in the right order.
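
As a rough illustration of the byte order problem, here is a Python sketch
under the assumption that the callback word is read while walking downward
from its last byte.  The `dw` and `read_callback_downward` names are made up
for the example and don't correspond to labels in the assembler.

```python
def dw(memory, addr, word):
    """Store a 16-bit word the way DW does: low byte, then high byte."""
    memory[addr] = word & 0xFF
    memory[addr + 1] = (word >> 8) & 0xFF
    return addr + 2                      # memory pointer after the word


def read_callback_downward(memory, addr):
    """Read a DW word while walking memory downward from its last byte.

    The first byte seen is the high byte, so the pair is flipped back.
    """
    high = memory[addr]
    low = memory[addr - 1]
    return (high << 8) | low


memory = bytearray(16)
callback = (0o010 << 8) | 0o111          # address 010 111 in split octal
dw(memory, 4, callback)
print(oct(memory[4]), oct(memory[5]))    # low byte first, then high byte
```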

I also had a lot of typos in the assembly code and old assembled bytes I missed
deleting.  Typos in opcodes and arguments hadn't mattered before, just the final
octal byte.  Considering my poor typing, there were surprisingly few.


# Self Assembly #

Once I got assembling, I discovered that if a comment line had a leading space,
that space would be saved as a label.  Two lines like that and it will fail with
a redefined label error.  I had exactly two lines like that.

I also found that some special characters aren't getting parsed correctly.  It
breaks on ',' because a comma is used as a field separator, ';' because
semicolon is the start of a comment, and "\"" which should escape the quote but
doesn't.  Parsing is spread across several subroutines and generally sucks.

I found another hand assembly error in the escaped character handling, where I
assembled a literal G to 116Q, which is N.  '\N' was therefore matching and
writing 007Q, which is the bell, and that broke all strings with a newline in
them.  I didn't catch that until I tried to run the assembled assembler and the
welcome message was beeping and overwriting itself.
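
Here is a small Python sketch of how one wrong byte in the escape handling
produces that behavior.  It assumes the escapes are checked in order with the
bell entry first, and that '\N' writes a line feed (012Q).  The table layout
and the `escape_value` name are invented for the example; only the G, N, and
007Q values come from the bug itself.

```python
def escape_value(table, ch):
    """Return the byte for escape character ch, scanning the table in order."""
    for key, value in table:
        if key == ord(ch):
            return value
    return ord(ch)  # unknown escapes pass through unchanged (assumption)


GOOD = [(0o107, 0o007), (0o116, 0o012)]  # 'G' -> bell, 'N' -> newline
BAD = [(0o116, 0o007), (0o116, 0o012)]   # 'G' mis-assembled as 116Q ('N')

for name, table in (("good", GOOD), ("bad", BAD)):
    print(name, format(escape_value(table, "N"), "03o") + "Q")
```

With the bad table, the first entry matches 'N' before the real newline entry
is ever reached, so every '\N' rings the bell instead of moving to a new line.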

But, assembly was finishing!  And that's when I realized a new problem.  If I
didn't already have the hand assembled version of the assembler, it would be
really hard to debug the code.  You can no longer just jump to the address where
the code you want to check was written to memory.  You won't know where that is
anymore.  For the assembler, I made sure to keep NOPs in there to align the
resulting code with the hand assembled version to spot check and debug the
result.  For any other program, I'm going to be lost.


# Next Steps #

Right away, I need to fix my method for sending code to the assembler.  The
current script eats the output from the assembler so I don't see the error
message when it hits one, nor the full symbol table I dump at the end.  And it
doesn't always exit.  I either need a better script, or working handshaking.  I
can also write the source to memory and assemble it from there.  That just takes
a few more steps and a bit more memory management so I'm not writing over
things.  This might be the way to go, though, in anticipation of assembling from
a disk file or hosted editor.

I need a way to dump memory back out to store assembled programs.  I don't want
to keep reassembling something I've already assembled.

I'd also like some way of getting the assembly matched up to the source code for
debugging.  I might cheat on this one and do it locally on the laptop.

I want to assemble the assembler to the top of the address space and write it to
a PROM chip for easy reuse.  That will require a little shuffling since some
data storage is currently mixed in with the code.  I think a PROM with my
current ASCII loader, a binary loader, a memory dumper, and the assembler would
be a nice setup.

Oh, and I should probably fix some of the bugs and missing features.