[6]chrisfenton.com
Home to a cunning artificer
DD9 Kaypro Edition
Deep Dish Nine goes Retro!
Sometime in the early 2000s I acquired a [37]Kaypro 2/84 computer (side
note: the picture in the Wikipedia article with the mismatched floppy
drives is actually my Kaypro!), and I’ve been meaning to do something
‘interesting’ with it ever since. It’s a nice Z80-based ‘luggable’
computer running the [38]CP/M operating system. General stats:
* Z80 Processor @4 MHz
* 64KB of RAM
* Upgraded to use the ‘Advent TurboROM’
* Upgraded to dual DS/DD 5.25″ Floppy drives (360KB/disk)
* 9″ green CRT supporting 80 columns x 25 rows (or 160×100 pixels in
‘graphics mode’!)
* Two RS-232 serial ports supporting speeds up to 19.2Kbps (one of
which is tricked out with a sweet [39]Wifi232 module, the other
unused)
* One Parallel Port (currently unused)
* Unpopulated internal areas to support a real-time clock, and a 300
baud modem!
The TurboROM also adds support for the following (if you can find them,
and/or figure out how to wire them in):
* Up to 2 RAM Disks of 256KB, 512KB or 1MB each
* Support for up to 4 floppy disk drives (180KB, 360KB or 720KB)
* Support for up to 2 Hard Drives of up to 56MB each
A Kaypro 2/84 in all its green-phosphor glory
All in all it’s a pretty neat machine, and it’s an interesting example
of the early computer world before Apple and IBM-compatible machines
killed off all the competition. One of the nice things about a Z80 CP/M
machine like this is that there’s actually a semi-viable software
ecosystem lovingly archived and available for free on the internet.
You can find lots of ‘productivity’ software (spreadsheets, text
editors, etc.), terminal programs, text-based adventure games like Zork
and even development software like compilers. My recent forays ([40]1,
[41]2) back into game development got me thinking – how hard would it
be to write a ‘graphical’ game for my Kaypro? I stumbled across one or
two drawing programs for the Kaypro, but almost nothing else that
actually took advantage of its limited graphics capabilities.
I finally decided I was going to make an attempt to fill this niche and
see what this old dinosaur can do. Unfortunately my free time is a bit
more limited these days, so rather than start from scratch I decided to
see if I could port the “[42]Deep Dish 9” game I developed for my
[43]Arduboy. On paper the Arduboy and the Kaypro aren’t actually *that*
far off in terms of specs, so it seemed like a natural fit. I also
wanted to actually develop my game *on* the Kaypro, so after stumbling
across [44]this article (and [45]this one), I decided that TurboPascal
3.0 would be my weapon-of-choice.
A surprisingly modern language
My Wifi232 (along with a copy of Kermit on both sides!) makes moving
files between my Linux laptop and my Kaypro a snap, so I whipped up a
new floppy with a copy of Kermit and TurboPascal on it, and got to
work. The fantastic thing about TP is that not only is it a perfectly
reasonable programming language, the software is a full IDE that fits
in 26KB! It includes a great text editor as well as a compiler with
nice debugging features – seriously, programmers these days could learn
a thing or two about writing compact software. Imagine a full IDE that
fits comfortably in your CPU’s L1 cache?? That was actually one of the
biggest surprises of this experience – the edit-compile-debug loop that
anyone writing software is so familiar with is really nice on this
machine. It’s probably better than my ‘modern’ programming experience
when I’m working on less-retro things. Sure, compiling can take a few
seconds (probably under 15s or so), although this would be mostly alleviated
by a RAM Disk, but the 80-column screen is nice and spacious, the
mechanical keyboard is nice and clicky, and the mono-tasking nature of
working on a computer with 64KB of RAM means no distractions. I had
always assumed that writing software on the Kaypro would be extremely
unpleasant at best, so this was a nice surprise.
The Turbo Pascal startup screen
Getting back to the actual porting process – I started by literally
translating the source code from C++ to TurboPascal. It turns out very
little re-factoring was necessary to make this happen, although I had
to stop frequently to look up the right syntax (some of which feels a
bit odd by modern standards). TurboPascal feels like a decently
‘modern’ language, though, and it didn’t take very long to get
comfortable with it.
Working on the DD9 code
I started working on this a few weeks after my daughter was born, so it
worked well as a quick hobby project that could be worked on during nap
times (one more advantage of the Kaypro – it boots up in about 4
seconds!).
Me trying to type quietly on a mechanical keyboard
Once I had the core of the game in place and began testing, it
immediately became apparent why there aren’t more graphical Kaypro
games. The straight port of DD9, which runs comfortably at 30 FPS on my
Arduboy (re-drawing the screen from scratch between frames!), ran at an
achingly slow 0.25 FPS. Eek! The Z80 runs at 4 MHz, vs the 16 MHz of
the Arduboy, and the CRT is controlled by a 6545A-1 “CRT Controller”
chip vs the 128×64 SPI display used by the Arduboy, but I didn’t expect
the performance discrepancy to be that bad. How do these machines
actually compare?
* CPU: I’m not super familiar with the Z80, but it appears that it
takes somewhere between 4 and 23 clock cycles to execute a single
instruction. The ATmega32U4 in the Arduboy, on the other hand, is
clocked 4X faster and executes about 1 instruction per cycle, giving
it an advantage of somewhere between 16X and 92X. The core of the DD9
engine started out as the N-body simulator used by [46]HANS, and
was originally designed for accuracy and simplicity, not speed. The
code makes liberal use of floating point and makes no effort to
reduce the number of calculations needed per iteration.
* Display: The Arduboy has an OLED display connected via SPI with an
SSD1306 controller (like [47]this one), that can probably push
pixels at close to the 4 MHz SPI clock rate. The Kaypro has a
[48]6545-1 CRT Controller, which is really designed to control a
text-based terminal, with a few fancy upgrades. For a variety of
reasons, which I’ll explore next, using the Kaypro’s graphics is
extraordinarily slow. Whereas the Arduboy could probably push a
million+ pixels a second, the Kaypro is probably closer to a few
10s of pixels a second. That makes creating an interactive,
graphical game significantly more challenging.
* RAM: The Kaypro actually has a bit of an advantage here – it’s got
64KB for both code and data, whereas the Arduboy has 32KB for code
and 2.5KB for data. There are ways this could be used to improve
the performance (some of which I mention later), but I mostly
neglected them for a ‘quick & dirty’ approach to optimization.
To understand why the performance is so terrible, it’s important to
understand how the Kaypro actually manages to create its ‘graphics
mode.’ The CRT controller is really designed to drive an 80-column x 25
row character display. It has 2KB of dedicated
character SRAM used for the display, as well as an additional 2KB of
‘attribute’ SRAM that corresponds 1:1 with the characters. Each
character supports an extra ‘attribute’ like underlining,
half-brightness, blinking, etc. The CRT controller indexes into the
character and attribute RAMs as it drives the CRT, and the output of
those RAMs index into a character ROM that effectively holds the font
for the display. The Kaypro’s display is actually pretty great for the
early 1980s – the effective display output when driving text is
something like 640×400 pixels – the text is crisp and legible on the 9″
monochrome display. To create the ‘graphics’ mode, the Kaypro can treat
each ‘character’ as a 2×4 block of pixels. It uses the upper 128
character values (since all of ASCII lives in the lower 128 values) to
control 7 of the 8 pixels directly in a 1:1 manner, and then covers the
rest of the possibilities by using the ‘inverse video’ attribute, which
inverts the light/dark values for the whole character block. Voila!
160×100 graphics bolted onto something originally designed to emulate
an ADM-3A terminal.
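The packing trick is easier to see in code. What follows is a rough Python sketch of the idea, not anything taken from the Kaypro ROM: the pixel numbering, the choice of which pixel lacks its own bit, and the names `encode_block` and `decode_block` are all assumptions made for illustration.

```python
# Sketch of packing a 2x4 pixel block into (character code, inverse flag).
# ASSUMPTION: pixels are numbered 0..7 and pixel 7 is the one with no bit of
# its own. The real Kaypro character ROM may order them differently.

def encode_block(pixels):
    """pixels: list of 8 ints (0 = dark, 1 = lit). Returns (char_code, inverse)."""
    if len(pixels) != 8:
        raise ValueError("a block has exactly 8 pixels")
    inverse = pixels[7] == 1
    # If pixel 7 must be lit, store the complement and let the inverse-video
    # attribute flip the whole block back.
    stored = [1 - p for p in pixels] if inverse else list(pixels)
    low7 = sum(bit << i for i, bit in enumerate(stored[:7]))
    return 0x80 | low7, inverse   # upper 128 codes are the graphics characters


def decode_block(char_code, inverse):
    """Inverse of encode_block: recover the 8 pixel values."""
    stored = [(char_code >> i) & 1 for i in range(7)] + [0]
    return [1 - p for p in stored] if inverse else stored
```

Seven bits plus one attribute flag cover all 256 possible blocks, which is why every pixel change can end up touching both the character RAM and the attribute RAM.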
The Fenton Heavy Industries logo – artisanally crafted with
‘draw_line()’ calls
This would actually be pretty reasonable if it weren’t for the very,
uh, cost-sensitive way in which it was designed. The main issue is
that both the CRT controller and the CPU need to access the character
and attribute SRAMs. Additionally, the CRT controller runs at a max of
2 MHz vs 4 MHz for the CPU. The best way to do this would be to use
dual-ported SRAMs, and allow both devices to access the RAMs whenever
needed, but those were expensive. The 6545 supports other high-ish
performance modes, like using alternate phases of the clock to access
the RAM (but that requires fast RAMs or a slow clock). Kaypro went with the
cheapest method, which was to not even allow the CPU to have direct
access at all. To write into a location, the Kaypro writes the data to
a latch, then writes the address to the 6545, and then polls until an
ACK comes back. The 6545 sneaks in CPU-requested accesses during the
horizontal/vertical blanking periods, which right away limits you to
something like 20,000 accesses/second. To make the graphics convenient for
programmers, Kaypro added ROM routines to set/clear pixels and
draw/clear lines. Unfortunately, each of these routines needs to do a
read-modify-write on both the character and attribute RAMs for every
pixel it touches! I ported the ‘drawSlowXYBitmap()’ function from the
Arduboy library to the Kaypro, as that’s what I used to display the
logo graphics in DD9, and it is slow indeed! The function is a pretty
tight loop of calling the setPixel() routine, and you can watch the
logo materialize. I didn’t benchmark it carefully, but I basically took
that as a strong sign that I should minimize how many pixels I touch
each frame.
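To put a rough ceiling on what the ROM routines could ever manage, here is a back-of-the-envelope calculation in Python. It assumes the ~20,000 accesses/second figure above and that each read-modify-write costs one read plus one write; neither number was measured, and the real routines add plenty of CPU overhead on top, so actual throughput will be well below this bound.

```python
# Back-of-the-envelope ceiling on ROM-routine pixel throughput.
# ASSUMPTIONS: ~20,000 video RAM accesses/second (the estimate quoted above),
# and each read-modify-write costs one read access plus one write access.

ACCESSES_PER_SECOND = 20_000
RAMS_TOUCHED = 2            # character RAM and attribute RAM
ACCESSES_PER_RMW = 2        # one read, one write

def max_pixels_per_second(accesses_per_second=ACCESSES_PER_SECOND):
    """Upper bound on pixels/second if blanking-slot accesses were the only cost."""
    return accesses_per_second / (RAMS_TOUCHED * ACCESSES_PER_RMW)

def seconds_to_draw(pixel_count):
    """Best-case time to touch pixel_count pixels one at a time."""
    return pixel_count / max_pixels_per_second()

print(max_pixels_per_second())      # 5000.0
print(seconds_to_draw(160 * 100))   # 3.2
```

Even in this best case, touching every pixel on the 160×100 screen once takes multiple seconds, so a full redraw per frame is off the table.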
Optimizing!
Now that we’ve framed what a miserably slow computer this is, how did I
go about making it semi-playable? First, I decided that I needed to
minimize the number of floating point operations. I’m both lazy and
pressed for time, so I didn’t move to full fixed-point math, but there
were a number of optimizations available. The main one was moving from
a full n-body simulation to just computing the forces between the
individual planets and the sun, and leaving the sun’s position fixed at
the center of the screen. The inter-planet forces are pretty much
rounding errors on the timescale of the game (90 days), so it
effectively made no difference to the game mechanics and reduced my
floating point requirements for calculating forces from O(N^2) to O(N).
Next, since all of the planets have circular orbits, they effectively
maintain a constant distance from the sun. Since we already reduced our
problem to just computing the forces between individual planets and the
sun, we can go a step further and just compute the magnitude of that
force once for each planet. Now, for each timestep all we need to do is
compute the directional components of the force based on the position
of planets, which is much less work than computing the distance and
magnitude of the force as well.
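A minimal Python sketch of that last step, assuming the sun sits at the origin. The constants, the `Planet` class and `sun_force` are illustrative names and values, not lifted from the DD9 source.

```python
import math

# Sketch of the sun-only force calculation with a precomputed magnitude.
# ASSUMPTIONS: G, the masses, and the Planet fields are illustrative values,
# not taken from the DD9 source. The sun is fixed at the origin.

G = 1.0
SUN_MASS = 1000.0

class Planet:
    def __init__(self, mass, radius, angle):
        self.mass = mass
        self.radius = radius                  # constant for a circular orbit
        self.x = radius * math.cos(angle)
        self.y = radius * math.sin(angle)
        # Computed once: |F| = G * M * m / r^2 never changes on a circular orbit.
        self.force_mag = G * SUN_MASS * mass / (radius * radius)

def sun_force(planet):
    """Per-timestep work: only the direction, scaled by the cached magnitude."""
    scale = -planet.force_mag / planet.radius   # minus sign: force points at the sun
    return scale * planet.x, scale * planet.y
```

With N planets this is N cheap calls per timestep, with no square roots and no per-step distance calculation, instead of N² pairwise force evaluations.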
The main game screen – optimized for the Kaypro!
Finally, since the screen updates are so ridiculously slow, we need to
reduce those to a bare minimum. On the Arduboy, everything is so fast
that I actually blank the screen and re-draw it from scratch on every
frame. On the Kaypro, I divide everything into ‘static’ and ‘dynamic’
elements – the sun and ‘frame’ elements are just drawn at the start of
the game and never updated again. For ‘status’ elements that
dynamically update, I converted them to text (which is both quite fast
to update and very high resolution). For the actual dynamic pixel-drawn
elements (the ship, planets and direction indicator), every update
involves clearing the previously drawn pixels and then re-drawing
whatever new pixels are needed. The movement of the planets is
sufficiently slow that I calculate the new location after every
timestep, and only update the screen if the actual pixel location
changes. As a further simplification, the triangle-shaped ‘ship’ from
the Arduboy game was replaced with a single pixel and a separate
‘direction-indicator’ in the console. The BIOS has draw and clear line
routines which are significantly faster than repeatedly calling the set
/ clear pixel routines, so the indicator line is implemented using
these. The direction indicator also changes in 45 degree increments
rather than the ~1 degree increments used on the Arduboy, in order to
make things sufficiently ‘interactive.’
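The "only touch the screen when the pixel actually moves" rule can be sketched like this in Python. The `set_pixel` and `clear_pixel` functions here are stand-ins that just log calls (an assumption for illustration; on the Kaypro they would be the slow ROM routines), and `Sprite` is a hypothetical helper.

```python
# Sketch of "only redraw when the pixel actually moves".
# ASSUMPTION: set_pixel / clear_pixel stand in for the slow ROM routines;
# here they just record calls so the saving can be counted.

calls = []

def set_pixel(x, y):
    calls.append(("set", x, y))

def clear_pixel(x, y):
    calls.append(("clear", x, y))

class Sprite:
    def __init__(self):
        self.drawn_at = None          # pixel currently lit on screen, if any

    def update(self, fx, fy):
        """fx, fy: non-negative floating-point position in screen coordinates."""
        pixel = (int(fx), int(fy))
        if pixel == self.drawn_at:
            return                    # same pixel: skip the expensive screen access
        if self.drawn_at is not None:
            clear_pixel(*self.drawn_at)
        set_pixel(*pixel)
        self.drawn_at = pixel

planet = Sprite()
for step in range(8):
    planet.update(80 + 0.25 * step, 50.0)   # creeps 1.75 pixels over 8 timesteps
print(len(calls))   # 3
```

Eight timesteps cost three screen calls here, versus fifteen if the pixel were cleared and redrawn on every step.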
What did all of this effort get us? As I mentioned earlier, the direct
port of the Arduboy code updated at about 0.25 frames/sec. A few rounds
of optimization got us to the 3-4 frames/sec range, which is actually
playable for a slow game like DD9.
Optimizations Not Taken
The above modifications got me sufficiently close to declare victory
for the purposes of this project, but there were a number of
optimizations I considered along the way. Among those:
* Hardware upgrade #1 – This would have aided development more than
the actual game, but SD-card floppy emulators exist, and they sound
like a huge improvement over the DS/DD 360KB 5.25″ floppy drives I
was using. I actually lost a floppy drive during development and
had to limp across the finish line with only one!
* Hardware upgrade #2 – There were modifications to let the Kaypro
use an 8 MHz Z80 CPU (vs the 4 MHz stock speed), which probably
would have made a significant difference for the math bits, if not
the display. I actually considered taking this to an extreme and
replacing the stock CPU with an FPGA containing a very fast Z80
implementation along with all on-die RAM. I’ve seen people running
Z80 cores at ~80 MHz, which would have certainly sped things up,
but I decided that this would probably have violated the spirit of
the project too much.
* Hardware upgrade #3 – In the same vein as #2, I considered using
an FPGA to build an ‘accelerator’ for the Kaypro. Conveniently, the
Kaypro has a couple of empty DIP sockets on the motherboard that
wouldn’t actually make this terribly painful (I actually wired an
FPGA implementing a simple SRAM into the socket where the real-time
clock belonged on a rainy afternoon a few years ago, just to test
this concept). I’ve got all of the RTL for my Cray functional units
lying around, so building some kind of unholy “Kraypro” hybrid
crossed my mind a few times. Taking it in a different direction,
building an 8-bit vector subsystem for the Kaypro also sounds like
a super fun project (that, alas, will probably never see daylight).
* Software optimization #1 – The obvious solution to my bad floating
point performance is to move to lower-precision fixed-point math.
Being pre-IEEE754, Turbo Pascal uses a 6-byte ‘real’ format, which
I believe consists of a 39-bit mantissa + 8-bit exponent + sign.
Knowing that the Z80 can only operate on 8 bits at a time, and that
each operation takes multiple cycles, it’s not hard to imagine why
doing division with real numbers is super slow. Moving to 32-bit
fixed-point math would probably have sped things up enormously and
not been too difficult.
* Software optimization #2 – Write my own display-update code. By
burning 4KB to keep a shadow copy of the character and attribute
memory in main RAM (less than that, actually, since I only do
dynamic pixel updates to part of the screen), I could enormously
reduce the number of required CRT accesses. As mentioned earlier,
the BIOS routines need to do two read-modify-write operations per
pixel, and each operation can only take place during a CRT blanking
period. If I had shadow copies of the character and attribute
memories in RAM, I could potentially modify up to 8 pixels with a
single access (a 32X improvement!). If I were more serious about my
Kaypro graphical gaming experience, this is probably the right way
to go, particularly because it would be portable to other people’s
machines (unlike hardware upgrades). This could probably be written
as a TurboPascal library that could be re-used by different
programs, and would open up a lot more possibilities for this
elderly computing device. Anyone reading this want to take a stab
at it?
* General hardware upgrades – As I mentioned earlier, the TurboROM
has support for RAM Disks and hard drives, as well as a real-time
clock, and there is an unused serial port that could accommodate a
serial mouse. Significant improvements could also be had by
designing an improved drop-in replacement for the CRT controller.
This quickly devolves into a ship-of-Theseus argument frequent
among vintage computing enthusiasts, but as a general computing
platform, there is a lot of low-hanging fruit in terms of upgrades
if you’re not a purist. Having a fully tricked-out Kaypro would be
a really fun project, particularly for eventually writing a full
GUI-based OS for it.
Conclusions
I definitely found the answer to my question about why so few graphical
Kaypro programs exist. The Kaypro’s graphics are awful – it’s a
text-mode machine with graphics bolted on as a box-checking exercise.
That being said, the development experience was surprisingly nice and
it was a lot of fun to go through the exercise of actually making a
functional game for a machine slightly older than me. Have you got an
’84-series Kaypro collecting dust somewhere? Want to test your skill
with pizza delivery and orbital mechanics? Bring it out of retirement
and [49]grab a copy of Deep Dish Nine! Source and executable are
included.
References
Visible links
1.
http://www.chrisfenton.com/feed/
2.
http://www.chrisfenton.com/comments/feed/
3.
http://www.chrisfenton.com/wp-json/oembed/1.0/embed?url=
http://www.chrisfenton.com/dd9-kaypro-edition/
4.
http://www.chrisfenton.com/wp-json/oembed/1.0/embed?url=
http://www.chrisfenton.com/dd9-kaypro-edition/&format=xml
5.
http://www.chrisfenton.com/dd9-kaypro-edition/#content
6.
http://www.chrisfenton.com/
7.
http://www.chrisfenton.com/
8.
http://www.chrisfenton.com/category/uncategorized/
9.
http://www.chrisfenton.com/homemade-speakers/
10.
http://www.chrisfenton.com/magnetube/
11.
http://www.chrisfenton.com/gps-altimeter/
12.
http://www.chrisfenton.com/fpga-pong/
13.
http://www.chrisfenton.com/category/uncategorized/
14.
http://www.chrisfenton.com/non-von-1/
15.
http://www.chrisfenton.com/homebrew-cray-1a/
16.
http://www.chrisfenton.com/cray-1-digital-archeology/
17.
http://www.chrisfenton.com/cos-recovery/
18.
http://www.chrisfenton.com/digital-archaeology-part-2/
19.
http://www.chrisfenton.com/dd9-kaypro-edition/
20.
http://www.chrisfenton.com/exploring-kaypro-video-performance/
21.
http://www.chrisfenton.com/the-zedripper-part-1/
22.
http://www.chrisfenton.com/the-zedripper-part-2/
23.
http://www.chrisfenton.com/category/uncategorized/
24.
http://www.chrisfenton.com/seqalign/
25.
http://www.chrisfenton.com/the-fibiac/
26.
http://www.chrisfenton.com/the-turbo-entabulator/
27.
http://www.chrisfenton.com/the-numbotron/
28.
http://www.chrisfenton.com/the-pixelweaver/
29.
http://www.chrisfenton.com/category/uncategorized/
30.
http://www.chrisfenton.com/diy-laptop-v1/
31.
http://www.chrisfenton.com/diy-laptop-v2/
32.
http://www.chrisfenton.com/hans-the-digital-orrery/
33.
http://www.chrisfenton.com/electromechanical-lunar-lander/
34.
http://www.chrisfenton.com/cripple-mr-onion/
35.
http://www.chrisfenton.com/deep-dish-nine/
36.
http://www.chrisfenton.com/about-2/
37.
https://en.wikipedia.org/wiki/Kaypro
38.
https://en.wikipedia.org/wiki/CP/M
39.
http://biosrhythm.com/?page_id=1453
40.
http://www.chrisfenton.com/cripple-mr-onion/
41.
http://www.chrisfenton.com/deep-dish-nine/
42.
http://www.chrisfenton.com/deep-dish-nine/
43.
https://arduboy.com/
44.
http://techtinkering.com/2013/03/05/turbo-pascal-a-great-choice-for-programming-under-cpm/
45.
http://skookumpete.com/KayproGraphics.htm
46.
http://www.chrisfenton.com/hans-the-digital-orrery/
47.
https://www.adafruit.com/product/938
48.
http://archive.6502.org/datasheets/mos_6545-1_crtc.pdf
49.
http://www.chrisfenton.com/wp-content/uploads/2018/03/dd9_kaypro.zip
50.
http://www.chrisfenton.com/about-2/
51.
http://www.chrisfenton.com/cos-recovery/
52.
http://www.chrisfenton.com/digital-archaeology-part-2/
53.
http://www.chrisfenton.com/cray-1-digital-archeology/
54.
http://www.chrisfenton.com/cripple-mr-onion/
55.
http://www.chrisfenton.com/dd9-kaypro-edition/
56.
http://www.chrisfenton.com/deep-dish-nine/
57.
http://www.chrisfenton.com/diy-laptop-v1/
58.
http://www.chrisfenton.com/diy-laptop-v2/
59.
http://www.chrisfenton.com/electromechanical-lunar-lander/
60.
http://www.chrisfenton.com/exploring-kaypro-video-performance/
61.
http://www.chrisfenton.com/fpga-pong/
62.
http://www.chrisfenton.com/gps-altimeter/
63.
http://www.chrisfenton.com/hans-the-digital-orrery/
64.
http://www.chrisfenton.com/homebrew-cray-1a/
65.
http://www.chrisfenton.com/homemade-speakers/
66.
http://www.chrisfenton.com/magnetube/
67.
http://www.chrisfenton.com/non-von-1/
68.
http://www.chrisfenton.com/seqalign/
69.
http://www.chrisfenton.com/the-fibiac/
70.
http://www.chrisfenton.com/the-numbotron/
71.
http://www.chrisfenton.com/the-pixelweaver/
72.
http://www.chrisfenton.com/the-turbo-entabulator/
73.
http://www.chrisfenton.com/the-zedripper-part-1/
74.
http://www.chrisfenton.com/the-zedripper-part-2/
75.
http://wordpress.org/
76.
https://themezee.com/themes/wellington/
Hidden links:
78.
http://www.chrisfenton.com/wp-content/uploads/2018/03/dd9_title_screen.jpg
79.
http://www.chrisfenton.com/wp-content/uploads/2018/03/Kaypro_wikipedia.jpg
80.
http://www.chrisfenton.com/wp-content/uploads/2018/03/tp3front.jpg
81.
http://www.chrisfenton.com/wp-content/uploads/2018/03/turbo_pascal_main_screen.jpg
82.
http://www.chrisfenton.com/wp-content/uploads/2018/03/turbo_pascal_dd9_edit.jpg
83.
http://www.chrisfenton.com/wp-content/uploads/2018/03/fhi_logo_screen.jpg
84.
http://www.chrisfenton.com/wp-content/uploads/2018/03/dd9_play_screen_2.jpg