Title: 8800c Problems
Date: October 15, 2020
Tags: hardware altair
========================================

I haven't posted much as I've been enjoying working on my 8080 assembler on the
8800c.  Recently, however, I've had a bit of an unplanned break.

I've spent the last couple of weeks troubleshooting some hardware issues with
the 8800c and it's broken my focus on the assembler so before I figure out where
I left off there, I'll try to get a post out describing what's been happening.

When I first finished building the 8800c and started testing it out, I had a
weird issue with the Examine switch not working correctly.  I didn't understand
what it was doing at the time, but after some jiggling of the cards in their
slots it went away for months.

Things were trouble free for some time.  I had burned my loader onto an EPROM
chip and installed it and had continued development of my assembler.  Then the
first weirdness happened that kicked off several confusing weeks.


# Not So Read-Only Memory #

At some point, the loader stopped working correctly.  It took me a little bit to
notice that the first byte had changed.  That's... not great from an EPROM chip.
The 88-2SIOJP serial card, which also holds the PROM chip, has the ability to
write to certain EPROM chips and I had purchased a compatible chip to use this
feature.  However, I was not able to get it to work so I disabled the write
ability (I think) and just used a chip burner (which was its own adventure).
When the byte changed, I thought the chip might be bad so I burned a new one and
swapped it in.

At this point, the Examine switch issue resurfaced.  Some card wiggling seemed
to help but it would quickly come back and it got more frequent until nothing I
did would make it go away.  Whatever the root cause, it seemed to have become
permanent.


# The Front Panel Issues #

What I noticed was that Examine would always go to address 000070Q.  Deposit
would always deposit a 377Q.  Examine Next would do a stack operation (push 2
bytes) then go to 000070Q.

That 377Q being Deposited is significant.  That means the data bus is all 1's.
But contrary to what you might think, that means nothing is putting data onto
the bus.  The data bus has pull up resistors so the bus lines default to high
unless something is driving data.
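
A minimal sketch of that pull-up behavior, in Python.  This is only a model for
illustration, not a description of the actual circuit; `read_bus` and its
`driver` parameter are made-up names.

```python
PULLED_UP = 0o377  # all eight data lines high when nothing drives the bus


def read_bus(driver=None):
    """Return the byte a reader would see on the data bus.

    `driver` is the byte being driven by whichever device is currently
    enabled onto the bus, or None if no device is driving it.
    (Hypothetical model: real hardware has one enabled driver at most.)
    """
    if driver is None:
        # Nothing is driving, so the pull-up resistors win.
        return PULLED_UP
    return driver & 0o377
```

So `read_bus()` with no driver gives 377Q, the same value Deposit was storing,
while `read_bus(0o303)` gives back the 303Q that was driven.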

Examine works by the front panel taking control of the data bus, forcing a JMP
instruction onto it, which is 303Q, letting the CPU execute it, then putting the
address bytes onto the data bus as read off the switches so the CPU will jump to
the address specified by the operator.  The memory card will then put its data
onto the bus as normal and the front panel data lights show what's at that
address.
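
The three bytes the CPU is fed during an Examine can be sketched like this.  It
is a rough illustration in Python, assuming the usual 8080 JMP layout of opcode,
then low address byte, then high address byte; `examine_bytes` is a made-up
helper name, not anything from the front panel's design.

```python
JMP = 0o303  # 8080 JMP opcode


def examine_bytes(switch_address):
    """Return the bytes forced onto the data bus for an Examine.

    `switch_address` is the 16-bit value set on the address switches.
    The 8080 expects the low byte of a JMP target before the high byte.
    """
    low = switch_address & 0xFF
    high = (switch_address >> 8) & 0xFF
    return [JMP, low, high]
```

For example, `examine_bytes(0o000070)` is `[0o303, 0o070, 0o000]`.  If all three
of those reads come back as 377Q instead, the CPU never sees a JMP at all.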

Examine Next forces a NOP (000Q) onto the bus, which causes the CPU to go to the
next address.  That's a single instruction step instead of the 3 it takes to do
an Examine.

So you can see how a data bus stuck at 377Q will affect Examine and Examine
Next.  They cause the CPU to execute instructions that are on the data bus.  If
the data bus is 377Q, that's a RST 7 instruction.  Restart instructions are used
by interrupts to call an interrupt routine but they work basically like a CALL
with hard coded addresses.  RST 7 will push the program counter onto the stack
(remember Examine Next doing stack operations) then jump to the fixed address
selected by its argument, which is 8 times the RST number.  So a RST 7 goes to
000070Q, a RST 3 would jump to 000030Q, etc. for RST 0 - 7.
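
Here is a small Python sketch of how an RST opcode decodes and what it does to
the stack.  It is a simplified model, not an emulator: the function names are
made up, and memory is just a dict keyed by address.

```python
def decode_rst(opcode):
    """Return the RST number (0-7) if `opcode` is an RST, else None.

    RST opcodes have the bit pattern 11 NNN 111, i.e. 3N7 in octal.
    """
    if (opcode & 0o307) == 0o307:
        return (opcode >> 3) & 0o7
    return None


def rst_target(n):
    """Address an RST n jumps to: 8 times the restart number."""
    return n * 8


def execute_rst(opcode, return_address, sp, memory):
    """Push `return_address` and return (new_pc, new_sp).

    `return_address` is the program counter after the RST byte was fetched.
    The high byte is pushed first, so the low byte ends up at the lower
    address, and the stack pointer drops by 2.
    """
    n = decode_rst(opcode)
    if n is None:
        raise ValueError("not an RST opcode")
    memory[(sp - 1) & 0xFFFF] = (return_address >> 8) & 0xFF
    memory[(sp - 2) & 0xFFFF] = return_address & 0xFF
    return rst_target(n), (sp - 2) & 0xFFFF
```

With this, `decode_rst(0o377)` is 7 and `rst_target(7)` is `0o070`, which
matches the 000070Q the front panel kept landing on, along with the two bytes
pushed onto the stack.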

I probed around with the oscilloscope to see if the data lines were stuck
somehow.  But if address 070Q had something in it, I could see it fine on the
front panel.  It turns out that when the front panel forces data onto the data
bus for Examine or Examine Next, it uses the wires that directly connect the
front panel to the CPU card.  That's how it gets around fighting other things
for control of the data bus.  It just tells the CPU card to point the CPU at its
special data bus instead of the system bus on the backplane.

With help from Altair master Mike Douglas, we discovered that the front panel
seemed to be doing the right things but the CPU's DBIN signal (Data Bus IN),
which is strobed to read the data bus, was 'sticking' on instead of strobing to
read the JMP instruction and address.  While it's stuck on, it's inhibiting the
front panel from driving the special data lines so the default value of 377Q is
seen by the CPU.

I don't know why this is happening.  I'm suspicious of the CPU itself as the CPU
generates this signal on its own.  I don't believe it's externally triggered.
The clock is a common issue but I had solved that problem already and it is
still in spec and looks rock solid on the scope.

And then it went away.

Like on my initial 8800c start up, the problem went away on its own.  It still
happens once in a while and I can tap and jiggle cards and it goes away again.


# The PROM Issue #

After Examine started working again and I was able to jump to the starting
address of my loader in the PROM and run it, a new issue revealed itself.

While trying to load my assembler code, it would either error out, claiming it
read an invalid octal character, or would suddenly start executing the partially
transferred assembler which usually ended up printing garbage endlessly.

This eventually got bad enough that as soon as I would start running the loader,
it would end up in lower memory.  This didn't occur when single stepping through
the code.

What was happening was clear.  It was occasionally getting a 377Q on the data
bus, causing a RST 7 and a jump to address 070Q.  That address happened to be in
the middle of a subroutine that prints starting at whatever address is in HL,
and repeated RST 7 instructions meant it would just keep printing junk.

Occasionally it was getting the 377Q on a data read, which is an invalid
character for anything ASCII, let alone the octal digits.
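
To make that concrete, here is a tiny Python sketch of the kind of check
involved.  It is not the loader's actual code (which is 8080 assembly and isn't
shown here); both function names are made up for illustration.

```python
def is_ascii(byte):
    """7-bit ASCII only covers 000Q through 177Q."""
    return 0 <= byte <= 0o177


def is_octal_digit(byte):
    """True only for the ASCII characters '0' through '7'."""
    return ord("0") <= byte <= ord("7")
```

A stray 377Q fails both tests, so any loader that validates its input as octal
text would have to reject it.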

But if the front panel seemed to be working, where was it coming from this time?

Remember when I changed the PROM chip?  That was the start of this mess.

I put the AMON PROM that came with the 88-2SIOJP card back in and was able to
run the monitor and run the memory test without issue.  So I put my original
PROM chip back in, the one I thought was bad because the first byte changed, and
it also worked fine.

I had reburned that chip before deciding not to use it, assuming it might be
bad.  I guess the replacement chip might actually be the bad one.

It isn't clear to me in what way this PROM chip is bad.  Is it somehow affecting
the CPU such that the DBIN signal is incorrect?  I am not sure that's possible.
Is it sometimes not driving the data bus?  Maybe the chip's timing is on the
edge for the Altair's timing.  Multiple issues at once is a tough puzzle to
decipher.


Well, for the moment, I have a functional Altair again.  I was able to load, run,
and do some debugging on my assembler again.  I think I need to get a spare CPU,
and maybe find a way to test PROM chips.