-----------------------------------------------
A collection of Zip bugs and missing features
-----------------------------------------------

(in the light of Graham Nelson's specification V0.2)

Last update: 12-JAN-96

This is a collection of bugs, missing features and possible improvements
for Mark Howell's Zip V2.0. Everything below, especially the term "Zip
port", refers to this distribution only. The numerous ports which have
evolved from this version may have fewer (or more) bugs. Nevertheless, I
hope this list is useful for anyone involved in Z-code interpretation.

Many thanks to the people who sent me bug reports. If you find a bug
which is not covered by this list, send it to

   Stefan Jokisch ([email protected])

This is the third edition. All passages which have been modified or
added in this edition are marked [3].

(i) MORE prompts

-> If top_margin > 0 then the MORE prompt might appear before the
  output reaches the bottom line (e.g. beginning of 'Beyond Zork').
-> Even if top_margin = 0 the first MORE prompt in 'Curses' appears
  too early.
-> When timeouts display some lines of text in the lower window, the
  MORE prompt is printed too late.

I propose the following method:

1) The line counter is set to zero when the text window or the
   entire screen is erased.
2) The line counter is increased after every newline in the lower
   window.
3) The line counter is set to top_margin after read or read_char
   instructions (but only if these instructions weren't terminated
   by time-out).
4) The MORE prompt is printed when the line counter equals the
   height of the lower window minus 1. The line counter is then
   reset to top_margin.
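
The four steps above can be sketched as follows. This is only an
illustration, not code from Zip: the struct and all function names
are invented, and the prompt itself is reduced to a counter. The
comparison uses ">=" rather than strict equality as a safety net in
case the window shrinks while text is being printed.

```c
#include <stdbool.h>

/* Hypothetical state; the names are illustrative, not Zip's own. */
typedef struct {
    int line_counter;   /* lines printed since the last reset */
    int top_margin;     /* lines kept from the previous page */
    int lower_height;   /* height of the lower window in lines */
    int more_prompts;   /* stands in for actually printing MORE */
} more_state;

/* Step 1: the text window or the entire screen was erased. */
static void more_on_erase(more_state *s)
{
    s->line_counter = 0;
}

/* Step 3: after read or read_char, unless ended by a time-out. */
static void more_on_input(more_state *s, bool timed_out)
{
    if (!timed_out)
        s->line_counter = s->top_margin;
}

/* Steps 2 and 4: call once per newline in the lower window.
   Returns true if a MORE prompt is due at this point. */
static bool more_on_newline(more_state *s)
{
    s->line_counter++;
    if (s->line_counter >= s->lower_height - 1) {
        s->more_prompts++;
        s->line_counter = s->top_margin;
        return true;
    }
    return false;
}
```

With a lower window of 5 lines and top_margin = 0, the fourth
newline after an erase triggers the prompt.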

(ii) Fonts

-> The runes in Beyond Zork are not displayed.
-> Zip does not return 0 if a font is unavailable.
-> Zip does not implement font 0 (ie. keep current font).

[3] Font changes are allowed in the middle of a word so it is
necessary to store font changes in a buffer.

(iii) Beep sounds

-> The DOS (V2.0) executable at ftp.gmd.de prints character 7
  instead of playing a beep sound.
-> The display_char routine in UNIXIO.C adds 1 to current_col
  whenever a beep sound is played.
-> The sound routine in AMIGAIO.C plays two beep sounds instead
  of playing a low-pitched beep sound (sound_effect #2).

The sound routine in OSDEPEND.C passes beep sounds to the general
character output routine which is not a very wise thing to do.
Preferably, there should be an interface function called "beep"
that plays a high- or low-pitched beep sound.

(iv) Sample sounds

-> The generic sound routine in OSDEPEND.C plays a beep sound when
  a sample should be played. It would be better to remain silent.
-> The sound routine in AMIGAIO.C misunderstands the 4th argument
  of the sound instruction and implements it as fade-in/fade-out.
-> The same routine ignores the volume information in V5 games.
-> The same routine gets the sound file format completely wrong.
-> The same routine seems to put the sounds in a queue (I think),
  but this is wrong.

(Note: "The Lurking Horror" sometimes plays several sample sounds
during a single turn. Even worse, it can happen that a "stop sound"
instruction follows closely. This isn't a major problem if the game
is played on an Amiga 500 by Infocom's own interpreter, since the
interpreter is so slow that most of each sample will be played. On
a fast machine, however, a click noise is all that can be heard.
This is annoying because it means that roughly 200KB of sample data
are simply wasted. One way to work around this is to put the sounds
in a queue and play them one after another. However, this is not in
line with the specification (and wouldn't work very well for
"Sherlock"). I suggest the following fix: If a sound instruction is
issued by the game, see if a sample sound has been started in the
same turn (ie. since the player's last input). If so, wait until
the sound finishes then execute the new sound instruction. Of
course, if the current sound is looping forever you have to stop it
sooner or later.)

I suggest that Zip should not leave all the work to the interface.
For instance, a generic sound routine could execute a sound_effect
by calling one of several interface routines: beep, prepare_sample,
start_sample, stop_sample and finish_with_sample. It could reject
the sound instruction if the interface declared itself unable to
play sample sounds. To some extent it could also handle the 4th
parameter of sound_effect.
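
A minimal sketch of such a generic routine is given below. It is
not taken from Zip. The table of function pointers, the
"samples_available" query and the numeric effect codes (1 = prepare,
2 = start, 3 = stop, 4 = finish with) are assumptions based on my
reading of the specification; the 4th parameter is left out.

```c
#include <stddef.h>

/* Hypothetical interface table; a port fills in what it supports. */
typedef struct {
    void (*beep)(int low_pitched);
    int  (*samples_available)(void);          /* may be NULL */
    void (*prepare_sample)(int number);
    void (*start_sample)(int number, int volume);
    void (*stop_sample)(int number);
    void (*finish_with_sample)(int number);
} sound_interface;

/* Effect codes as I read them in the specification (assumption). */
enum { FX_PREPARE = 1, FX_START = 2, FX_STOP = 3, FX_FINISH = 4 };

/* Returns 1 if the request was passed on, 0 if it was rejected. */
static int generic_sound_effect(const sound_interface *io,
                                int number, int effect, int volume)
{
    if (number == 1 || number == 2) {     /* 1 = high, 2 = low beep */
        io->beep(number == 2);
        return 1;
    }
    if (io->samples_available == NULL || !io->samples_available())
        return 0;                         /* no samples: stay silent */

    switch (effect) {
    case FX_PREPARE: io->prepare_sample(number);       return 1;
    case FX_START:   io->start_sample(number, volume); return 1;
    case FX_STOP:    io->stop_sample(number);          return 1;
    case FX_FINISH:  io->finish_with_sample(number);   return 1;
    default:         return 0;
    }
}

/* Demo-only interface that can beep but cannot play samples. */
static int demo_beeps_low, demo_beeps_high;

static void demo_beep(int low_pitched)
{
    if (low_pitched)
        demo_beeps_low++;
    else
        demo_beeps_high++;
}

static const sound_interface beep_only = {
    demo_beep, NULL, NULL, NULL, NULL, NULL
};
```

With the "beep_only" interface, sound number 2 produces a low beep,
while a request for sample 3 is rejected without any noise.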

Another thought: The end of a sample is usually signalled by a
hardware interrupt. This means that the interpreter main loop may
be called at any time, making life _much_ more interesting. This
requires very careful coding, especially whenever global variables
are used.

(v) Encoding and tokenising

-> Zip uses a list of word separators that apply to all Infocom games
  (in addition to the word separators from the game dictionary). This
  list is obsolete if not illegal.
-> Alternate alphabets ("Zork 1 German") are copied to an array. This
  is slightly illegal since the specification claims that all tables
  can be altered.
-> [3] The tokenise opcode ignores byte 1 of the text buffer (which
  holds the number of characters in V5+) and translates the text up
  to the first null character.

Tokenising and encoding are implemented sloppily: the routines
access the memory of the Z-machine directly (without using the
memory access macros). Furthermore, the binary search on the
dictionary is more complicated than it needs to be.

The text encoding routine could be simplified if one assumes
that shift-lock characters do not occur in dictionary entries.

(vi) Function keys

-> The screen output gets into disorder when function keys are
  pressed which are not supported by the current game.
-> Transcription does not work properly if function keys are
  used in "Beyond Zork".
-> Transcription does not work properly when function keys are
  defined in "Beyond Zork" (ie. when the player types DEFINE).
-> Under certain circumstances (to do with function keys), Zip
  does not lowercase the first part of the text buffer.

Zip should check the terminating keys table before it stops the
current input action. This task should not be left to the IO
interface. For instance, Zip could offer a routine which can be
used by the interface to check the terminating keys table.

Transcription problem no 1 can only be solved by removing the last
n characters from the transcript file whenever a read instruction
starts with n characters in its text buffer. The only exceptions are
read instructions that take place in the upper window (type DEFINE
in "Beyond Zork"). These should not affect the transcript file at
all (which would also solve transcription problem no 2).

(vii) Cursor

-> Some ports turn off the cursor in the upper window which leads
  to problems in "Bureaucracy", for instance.

It has been suggested that the interpreter should never move the
cursor in the lower window. Naturally, the interface would have
to be more involved in implementing split_window, set_window and
erase_window, but it would be easy to implement scroll-back.

(viii) Buffer mode

-> In V3, Zip turns off buffering in the upper window (correct
  behaviour is to use buffering all the time).
-> In V4, Zip turns off buffering in the upper window (correct
  behaviour is to apply the buffer mode ("formatting", as Zip
  calls it) to both windows).
-> In V5, Zip correctly turns off buffering in the upper window,
  but turns on buffering when it returns to the lower window
  (correct behaviour is to remember the previous state of buffer
  mode).

(ix) Emergencies

-> The Unix port crashes when the story file does not exist.

The "fatal" routine should make sure the screen is initialised
before it calls reset_screen (or reset_screen could check this
condition itself).

(x) Erasing windows

-> The cursor does not move to the top left after erasing the upper
  window.
-> The screen is not unsplit after erase_window -1.
-> erase_window -2 is not implemented.
-> In a V5 game, the cursor is moved to the upper window after erasing
  the lower window.

Note that the erase_window opcode is specified in the "screen model"
section of the specification. The brief description in the "dictionary
of opcodes" is less precise.

(xi) Input routine

-> Some ports ignore the return key when the input buffer is full.
-> Cursor and function keys confuse the input routine of UNIXIO.C.
-> In some ports, the input can exceed the end of line when the input
  prompt is longer than ">".
-> The input routine does not always read the correct number of
  characters (see the details about read in the specification).
-> [3] In V5+ the interpreter should not add a null character to
  the text buffer.

Similar to the tokenising and encoding routines, the input routine
accesses the memory of the Z-machine directly (ignoring the memory
access macros). A better design would use a local array. The
interface would read the input into this array, and the interpreter
would copy the contents of this array to the text-buffer of the
read instruction afterwards.
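
The copy step could look like the sketch below, shown for V5+ only.
The memory array and the two accessors are stand-ins for Zip's
memory access macros, and the function name is invented. Text that
is already present in the buffer when the read starts is ignored
here for brevity.

```c
#include <ctype.h>
#include <stddef.h>

/* Stand-ins for Z-machine memory and its access macros (hypothetical). */
static unsigned char zmem[64];

static unsigned char zm_get_byte(unsigned addr)
{
    return zmem[addr];
}

static void zm_set_byte(unsigned addr, unsigned char value)
{
    zmem[addr] = value;
}

/* Copies a finished input line from a local array into a V5+ text
   buffer: byte 0 holds the capacity, byte 1 receives the character
   count, the text starts at byte 2 and no null character is added.
   Returns the number of characters stored. */
static size_t store_input_v5(unsigned text_buffer,
                             const char *line, size_t len)
{
    size_t max = zm_get_byte(text_buffer);
    size_t i;

    if (len > max)
        len = max;                      /* drop what does not fit */
    for (i = 0; i < len; i++)
        zm_set_byte(text_buffer + 2 + (unsigned)i,
                    (unsigned char)tolower((unsigned char)line[i]));
    zm_set_byte(text_buffer + 1, (unsigned char)len);
    return len;
}
```

Since the interface only ever sees the local array, it cannot write
past the end of the text buffer, whatever the user types.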

(xii) Colours and text styles

-> The text style is reset to Roman (or "normal") when the lower
  window is selected but this is wrong.
-> Some ports erase to the foreground colour in reverse mode.
-> Some ports produce entirely reversed lines when the screen
  scrolls in reverse mode.
-> Some ports think that reverse plus reverse is Roman.
-> Some ports reset the text style when the colours change.
-> Zip does not implement colour 0 (ie. keep same colour).
-> The default colours are not written into the header.

If an interface implements text styles via colour changes then it
should keep track of the current text style and colours. If any of
these are altered by the game, a routine should be called that
calculates the new screen colours by combining the current text
style and colours.
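
One possible shape for this bookkeeping is sketched below. The
struct and function names are invented, the style numbers (1 =
reverse, 2 = bold, 4 = italic) follow my reading of the
specification, and combining styles by OR-ing them is an assumption.
Only reverse video is mapped to colours here.

```c
/* Style numbers as I read them in the specification (assumption). */
enum { STYLE_ROMAN = 0, STYLE_REVERSE = 1, STYLE_BOLD = 2, STYLE_ITALIC = 4 };

/* What the game has asked for so far. */
typedef struct {
    int style;   /* combination of STYLE_* bits */
    int fg;      /* current foreground colour */
    int bg;      /* current background colour */
} text_state;

static void set_text_style_sketch(text_state *t, int style)
{
    if (style == STYLE_ROMAN)
        t->style = 0;
    else
        t->style |= style;     /* reverse plus reverse stays reverse */
}

static void set_colour_sketch(text_state *t, int fg, int bg)
{
    if (fg != 0)               /* colour 0 keeps the current colour */
        t->fg = fg;
    if (bg != 0)
        t->bg = bg;
    /* the text style is deliberately left alone */
}

/* Called after every change: combines style and colours into the
   colours the screen should actually use. */
static void screen_colours(const text_state *t, int *fg, int *bg)
{
    int reversed = (t->style & STYLE_REVERSE) != 0;

    *fg = reversed ? t->bg : t->fg;
    *bg = reversed ? t->fg : t->bg;
}
```

Because the screen colours are always recalculated from the stored
state, a colour change cannot reset the style and vice versa.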

(xiii) Time-outs

-> The time argument of the read opcode is interpreted as the delay
  in seconds (although it gives the delay in tenths of a second).
-> The time-out routine is called with one argument.
-> The return value of the read opcode is -1 if it is terminated by
  a time-out (should be 0).
-> The Amiga version gets confused when a timeout occurs while the
  cursor is placed in the middle of the input line.
-> All Zip ports fail to redraw the input line when it is destroyed
  by a time-out routine.

The last bug is not easily fixed. I suggest using a flag that is
cleared when the time-out routine is called and that is set when
the game prints a newline to the lower window of the screen. After
the time-out routine has finished, the interpreter can check the
flag to find out whether the input line needs redrawing.
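
In code, the flag idea might look like this. All names are invented
for illustration, and the time-out routine is modelled as a plain C
callback rather than a Z-code routine.

```c
#include <stdbool.h>

/* Set whenever the game prints a newline to the lower window. */
static bool input_line_destroyed;

/* To be called by the output code for each lower window newline. */
static void note_lower_window_newline(void)
{
    input_line_destroyed = true;
}

/* Runs the time-out routine and reports whether the input line
   has to be redrawn afterwards. */
static bool run_timeout_routine(void (*routine)(void))
{
    input_line_destroyed = false;   /* cleared before the call */
    routine();
    return input_line_destroyed;    /* checked after the call */
}

/* Demo-only routines: one prints nothing, one prints a newline. */
static void quiet_routine(void)
{
}

static void noisy_routine(void)
{
    note_lower_window_newline();
}
```

Only the "noisy" routine forces a redraw; a routine that merely
updates the status line leaves the input line alone.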

The Amiga problem is even more serious. My own "DOS Frotz" rigidly
ignores all time-outs and function keys unless the cursor is placed
at the end of the input line.

(xiv) Printing

-> The fourth parameter of print_table is not supported.
-> Character 0 does not behave as specified.
-> Character 13 does not behave as specified.

The decode_text routine reads bytes from the instruction stream
without using the GET_CODE_*** macros.

(xv) Flags

-> The "boldface available" flag is called CONFIG_MAX_DATA and
  is completely misunderstood.
-> The "fixed font available" flag is missing.
-> The "variable-pitch font is default" flag is missing.
-> The "timed keyboard input available" flag is missing (well, it was
  just recently defined).
-> The "game wants to use mouse" flag is missing.
-> The "screen-splitting available" flag is set in V4 and V5.

It is tiresome to check the fixed font and scripting flags after
every character printed. Since the only sensible way to change
these flags is through the storeb and storew opcodes, it is easy
to recognise changing flags immediately. The interpreter should
react by (de-)selecting fixed-font or transcription respectively.
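
A sketch of this approach follows. It is not Zip's code: the memory
array and function names are invented, and the header layout ("Flags
2" is the word at 0x10, with the scripting bit 0 and the fixed font
bit 1 in its low byte at 0x11) is quoted from my reading of the
specification. The two int variables stand in for calls to the
interface.

```c
/* Header layout as I read it in the specification (assumption). */
enum { H_FLAGS2_LOW = 0x11, SCRIPTING_BIT = 0x01, FIXED_FONT_BIT = 0x02 };

static unsigned char zmem[0x40];          /* stand-in for game memory */
static int scripting_on, fixed_font_on;   /* what the interface was told */

/* Every store to memory goes through here, so a flag change is
   noticed the moment it happens. */
static void store_byte(unsigned addr, unsigned char value)
{
    unsigned char changed = (unsigned char)(zmem[addr] ^ value);

    zmem[addr] = value;
    if (addr != H_FLAGS2_LOW)
        return;
    if (changed & SCRIPTING_BIT)
        scripting_on = (value & SCRIPTING_BIT) != 0;
    if (changed & FIXED_FONT_BIT)
        fixed_font_on = (value & FIXED_FONT_BIT) != 0;
}

/* Words are stored most significant byte first. */
static void store_word(unsigned addr, unsigned value)
{
    store_byte(addr, (unsigned char)((value >> 8) & 0xff));
    store_byte(addr + 1, (unsigned char)(value & 0xff));
}
```

The output routines then never need to look at the header at all.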

[3] The specification claims that an interpreter should not change
the state of the transcription flag. However, this policy does not
work for AMFV so the specification will have to be corrected.

(xvi) Catch and throw

-> Throw does not work (the safety check at the start of the routine
  "unwind" prevents all _legal_ attempts to use the opcode).

(xvii) Save and restore

-> The optional parameters are not supported.
-> Restore fails to initialise certain header fields which are marked
  "Rst" in the specification.
-> [2] According to the specification, restore should not print any
  "file not found" messages.

One suggestion to improve the save/restore routines is to use an
OS-independent format (ie. to store all 16-bit words in the order
most significant/least significant byte). Another idea is to save
only the differences between the current and the initial state of
memory (thus saving a lot of disk space).
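
Both ideas fit in a few lines. The record format used below (a
16-bit offset, most significant byte first, followed by the new
byte value) is made up for this example and is not an existing save
file format. It assumes that the saved area is at most 64KB.

```c
#include <stddef.h>

/* Stores a 16-bit word, most significant byte first. */
static void put_be16(unsigned char *out, unsigned value)
{
    out[0] = (unsigned char)((value >> 8) & 0xff);
    out[1] = (unsigned char)(value & 0xff);
}

/* Reads a 16-bit word stored by put_be16. */
static unsigned get_be16(const unsigned char *in)
{
    return ((unsigned)in[0] << 8) | in[1];
}

/* Writes one 3-byte record per changed byte. Returns the number
   of bytes used, or (size_t)-1 if "out" is too small.
   "size" must not exceed 65536. */
static size_t diff_encode(const unsigned char *initial,
                          const unsigned char *current, size_t size,
                          unsigned char *out, size_t out_cap)
{
    size_t used = 0;
    size_t i;

    for (i = 0; i < size; i++) {
        if (initial[i] == current[i])
            continue;
        if (used + 3 > out_cap)
            return (size_t)-1;
        put_be16(out + used, (unsigned)i);
        out[used + 2] = current[i];
        used += 3;
    }
    return used;
}

/* Restores: "memory" must hold the initial state on entry. */
static void diff_apply(unsigned char *memory,
                       const unsigned char *diff, size_t diff_len)
{
    size_t i;

    for (i = 0; i + 3 <= diff_len; i += 3)
        memory[get_be16(diff + i)] = diff[i + 2];
}
```

A run-length scheme would be more compact when large areas change,
but even this naive format stays small as long as few bytes differ
from the story file.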

Another suggestion is to include a command-line option for loading
saved game files (like pinfocom does).

(xviii) Token buffer overflow

-> When too many words are typed, Zip reads (tbuf[0]+1) instead of
  (tbuf[0]) tokens...
-> ...prints the "Too many words..." message more than once...
-> ...and sets (tbuf[1]) to a value bigger than (tbuf[0]).
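
For comparison, a sketch of the intended behaviour: at most tbuf[0]
tokens are stored, the warning is issued once, and tbuf[1] never
exceeds tbuf[0]. The token struct and function name are invented,
the 4-byte entry layout follows my reading of the specification, and
a counter stands in for printing the message.

```c
/* Hypothetical token as produced by the tokeniser. */
typedef struct {
    unsigned dict_addr;     /* dictionary address, 0 if unknown */
    unsigned char length;   /* number of letters in the word */
    unsigned char start;    /* position in the text buffer */
} token;

static int too_many_words_warnings;   /* stands in for the message */

/* Layout: tbuf[0] = capacity, tbuf[1] = count, then 4 bytes per
   token. Returns the number of tokens actually stored. */
static unsigned store_tokens(unsigned char *tbuf,
                             const token *tokens, unsigned found)
{
    unsigned max = tbuf[0];
    unsigned i;

    if (found > max) {
        too_many_words_warnings++;    /* once per input line */
        found = max;
    }
    for (i = 0; i < found; i++) {
        unsigned char *entry = tbuf + 2 + 4 * i;

        entry[0] = (unsigned char)((tokens[i].dict_addr >> 8) & 0xff);
        entry[1] = (unsigned char)(tokens[i].dict_addr & 0xff);
        entry[2] = tokens[i].length;
        entry[3] = tokens[i].start;
    }
    tbuf[1] = (unsigned char)found;
    return found;
}
```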

(xix) UNDO

-> Too much memory is allocated for UNDO (enough to save the entire
  resident or "static" memory).

[2] It has been suggested that multiple UNDO should be possible. It
is, and it is also possible to make this feature part of the core
of Zip. Various strategies can be used to reduce the (huge) amount
of memory required for multiple UNDO: see the comment on "Save and
restore" for an idea. [3] However, these strategies tend to consume
considerable CPU time, at least for slow machines.

(xx) Mice

-> Except for AMIGAIO.C, Zip does not offer mouse support.

It would be good if the interpreter itself (ie. not the interface
routines) wrote the mouse position into the mouse data table.

(xxi) European characters

-> Currently only German characters are supported (all other
  character codes have been defined in the latest specification).
-> Scripting does not work properly when European characters are
  used (this results from using the isprint macro).
-> The MS-DOS port terminates the input when European characters
  are typed.
-> European characters are not converted to lower case.
-> European characters can confuse word-wrapping (once again, this
  results from using the isprint macro).
-> There is a problem with interpreter number 6 and "Beyond Zork".

Read the specification, section "format of the header", for more
information about the "Beyond Zork" problem.

Currently, the translation of European characters takes place very
early in the output process. I suggest that this should happen at a
low level. For instance, the interface itself can convert European
character codes to appropriate characters of its own font just
before it displays the codes on the screen. The transcription
routines can replace European character codes with plain ASCII
substitutes before they are written to the transcript file. And
finally, if input recording is turned on then the codes can be
converted to strings (eg. "[155]" or "\ae" for a-umlaut) before
they are passed to the command file.
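
As an illustration, the two late conversions could be written as
below. The function names are invented. Code 155 for a-umlaut is
taken from the example above; the remaining German codes (156 to
161) are quoted from my reading of the specification, and the ASCII
spellings are just one common choice.

```c
#include <stddef.h>
#include <stdio.h>

/* Codes 155..161: a-, o-, u-umlaut, their capitals, sharp s. */
static const char *const german_ascii[] = {
    "ae", "oe", "ue", "Ae", "Oe", "Ue", "ss"
};

/* Returns the plain ASCII substitute for the transcript file, or
   NULL if the code can be written unchanged. */
static const char *transcript_substitute(unsigned code)
{
    if (code >= 155 && code <= 161)
        return german_ascii[code - 155];
    return NULL;
}

/* Formats a code for the command file, eg. "[155]". */
static void command_file_escape(unsigned code, char *out, size_t out_cap)
{
    snprintf(out, out_cap, "[%u]", code);
}
```

The screen output path needs no table of this kind, since the
interface maps the codes straight to characters of its own font.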

(xxii) Missing opcodes

-> GET_CURSOR is not implemented.
-> NOP makes the interpreter crash.

(xxiii) Byte-sized properties

-> [3] Byte-sized properties are not supported by get_prop and put_prop
  in V4 and above.

[3] Neither Infocom nor Inform games use byte-sized properties in V4
or V5 so this is not really serious. (Although V6 games use byte-sized
properties they never apply get_prop or put_prop to these properties.)

[3] Currently, the specification also defines the behaviour for
properties that are longer than two bytes.

(xxiv) Word-wrapping

-> The generic fit_line routine counts style changes like normal
  characters (so lines with style changes are wrapped too early).
-> The line wrapping algorithm is a strange mixture of fixed-width
  and proportional-width line splitting. Even if a proportional font
  is used, each line may contain only as many characters as a fixed
  font would allow.
-> The proportional-width line splitting doesn't work if the buffer
  is flushed in the middle of the screen (eg. when the font changes).

The word-wrapping algorithm is highly inefficient. I suggest
modifying the algorithm so that it tries to collect a single word
(instead of a complete line). Whenever a word is complete, an
interface routine is called to calculate the width of this word in
screen units. If it doesn't fit into the current line, a newline
must be printed first.
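
The following sketch shows the word-by-word idea on a complete
string. It is a simplification: a real interpreter would collect
the word while characters arrive and would print to the screen
rather than to a buffer. All names are invented, and "width_fn"
stands for the interface routine that measures a word in screen
units.

```c
#include <stddef.h>
#include <string.h>

/* Interface routine: width of a word in screen units. */
typedef int (*width_fn)(const char *word, size_t len);

/* Width routine for a fixed-width font: one unit per character. */
static int fixed_width(const char *word, size_t len)
{
    (void)word;
    return (int)len;
}

typedef struct { char *buf; size_t cap, len; } out_buf;

/* Appends n characters, always leaving room for the terminator. */
static void emit(out_buf *o, const char *s, size_t n)
{
    while (n-- > 0 && o->len + 1 < o->cap)
        o->buf[o->len++] = *s++;
    o->buf[o->len] = '\0';
}

/* Copies "text" to "out", replacing the space before any word that
   does not fit into the current line with a newline. */
static void wrap_words(const char *text, int line_width, width_fn width,
                       char *out, size_t out_cap)
{
    out_buf o;
    int x = 0;                          /* units used in this line */
    int space = width(" ", 1);

    if (out_cap == 0)
        return;
    o.buf = out; o.cap = out_cap; o.len = 0;
    out[0] = '\0';

    while (*text != '\0') {
        size_t len;
        int w;

        while (*text == ' ')
            text++;
        len = strcspn(text, " ");       /* one complete word */
        if (len == 0)
            break;
        w = width(text, len);
        if (x > 0) {
            if (x + space + w > line_width) {
                emit(&o, "\n", 1);      /* word goes to the next line */
                x = 0;
            } else {
                emit(&o, " ", 1);
                x += space;
            }
        }
        emit(&o, text, len);
        x += w;
        text += len;
    }
}
```

Since only the width routine knows about the font, the same loop
serves fixed-width and proportional fonts alike.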

It has been suggested that hyphenated words could be split, too.

(xxv) Random numbers

-> The ANSI random function does not always produce proper random
  numbers.

It would be helpful if someone could write a generic random number
generator. Also note that Zip does not support the predictable mode
which is suggested by the specification.
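
A very small generator of this kind is sketched below. It is a
plain linear congruential generator using the constants from the
sample rand() in the C standard; it is not what Zip does. The rule
that a negative argument selects a fixed seed and zero selects a
random one is my reading of the specification, and the function
names are invented. The modulo step introduces a slight bias for
large ranges.

```c
#include <time.h>

static unsigned long rng_state = 1;

static void rng_seed(unsigned long seed)
{
    rng_state = seed;
}

/* Returns a 15-bit pseudo-random number (0..32767). */
static unsigned rng_next(void)
{
    rng_state = (rng_state * 1103515245UL + 12345UL) & 0xffffffffUL;
    return (unsigned)((rng_state >> 16) & 0x7fff);
}

/* range > 0: returns a number in 1..range.
   range < 0: predictable mode, seeds with -range and returns 0.
   range = 0: seeds from the clock and returns 0. */
static int z_random_sketch(int range)
{
    if (range > 0)
        return (int)(rng_next() % (unsigned)range) + 1;
    if (range < 0)
        rng_seed((unsigned long)(-(long)range));
    else
        rng_seed((unsigned long)time(NULL));
    return 0;
}
```

Seeding twice with the same negative value yields the same sequence
on every machine, which is what makes the predictable mode useful
for testing.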

(xxvi) Newlines

-> The Unix port calculates a wrong cursor position when newline
  characters are printed in the upper window.
-> When function keys are used, "Beyond Zork" sometimes prints
  newlines to the lower window while buffering is turned off.
  Zip does not handle this situation properly and passes the
  newline to display_char instead of scroll_line.

To fix the latter bug, it suffices to move the (formatting == ON)
condition to the next if-clause in output_new_line.

(xxvii) Recording and playback

-> Zip cannot handle function keys, mouse clicks or timeouts when
  input recording is turned on.
-> Zip does not implement "input_stream 0" ie. every command file
  is played up to the end-of-file.

The behaviour described in the specification (ignore input lines
unless they are terminated by the return key) is quite helpful.
However, to cope with games like "Beyond Zork" and "Border Zone"
it would be ideal to store all input lines plus their terminating
characters (plus coordinates for mouse clicks) in a command file.

(xxviii) Screen dimensions

-> [2] Zip sets certain header values to the left, right, top and
  bottom coordinates. These values should actually be written with
  the screen width and height in units.

(xxix) The Beyond Zork fix

-> [2] A bug in "Beyond Zork" can cause serious trouble. Some Zip
  ports might crash close to the end of the game.

When the player types "BLOW CIRCLET", the game tries to

 get_prop_addr 41460... (or similar)

which results from confusing the address of the dictionary entry
"circlet" with the object id of "circlet". The interpreter may
simply return 0, but it should not crash.
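
One cheap safeguard is to check that the object entry lies inside
the story file before touching it. The sketch below assumes the V4+
object table layout as I read it in the specification (63 property
default words followed by 14-byte entries); the function names and
the example addresses are invented. A stricter check could use the
start of the property tables as the upper bound.

```c
/* Address of the entry for object "obj" (obj >= 1) in a V4+ game. */
static unsigned long object_entry_address(unsigned long table,
                                          unsigned obj)
{
    return table + 63UL * 2UL + ((unsigned long)obj - 1UL) * 14UL;
}

/* Returns 1 if "obj" can be an object at all, 0 otherwise. */
static int object_is_plausible(unsigned long table, unsigned obj,
                               unsigned long story_size)
{
    if (obj == 0)
        return 0;
    return object_entry_address(table, obj) + 14UL <= story_size;
}
```

get_prop_addr (and its relatives) can then return 0 for any object
number that fails this test. In a 256KB story file, an "object"
such as 41460 is rejected because its entry would lie far beyond
the end of the file.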

(xxx) copy_table

-> [3] If the 3rd argument of copy_table is positive then the
  opcode behaves like the memmove function. Zip should copy
  forwards if source > dest, and it should copy backwards if
  source < dest. (This feature wasn't used until Journey.)
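
The copying direction can be sketched as follows. The function name
and the plain byte array are invented for illustration. A negative
size is treated as "always copy forwards", which is how I read the
specification; the case where the second argument is zero is left
out.

```c
/* Copies |size| bytes from "source" to "dest" within "mem". */
static void copy_table_sketch(unsigned char *mem, unsigned source,
                              unsigned dest, int size)
{
    unsigned n = (unsigned)(size < 0 ? -size : size);
    unsigned i;

    if (size < 0 || source > dest) {
        /* forwards: safe when the destination lies below the source */
        for (i = 0; i < n; i++)
            mem[dest + i] = mem[source + i];
    } else {
        /* backwards: safe when the destination lies above the source */
        for (i = n; i > 0; i--)
            mem[dest + i - 1] = mem[source + i - 1];
    }
}
```

Copying the four bytes 1 2 3 4 two places upwards gives 1 2 1 2 3 4
with a positive size, but 1 2 1 2 1 2 when the size is negative and
the copy is forced to run forwards.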

(xxxi) Arithmetic

-> [3] The div opcode should perform a signed division (contrary
  to the specification)! Otherwise the microwave oven in "Lurking
  Horror" behaves strangely: If its timer is set to x minutes
  plus y <> 0 seconds (eg. 2:30 or 0:01) then it turns all food
  radioactive!
-> [3] Similarly, mod is the remainder after signed division.
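
A sketch of signed div and mod on 16-bit words is given below. The
function names are invented. The divisor must not be zero (a real
interpreter should report an error first), and the rounding towards
zero relies on C99 semantics for "/" and "%".

```c
/* Interprets a 16-bit Z-machine word as a signed number. */
static long to_signed(unsigned word)
{
    word &= 0xffffu;
    return word >= 0x8000u ? (long)word - 0x10000L : (long)word;
}

/* Converts a signed result back to a 16-bit word. */
static unsigned to_word(long value)
{
    return (unsigned)((unsigned long)value & 0xffffUL);
}

/* Signed division; "b" must not be zero. */
static unsigned z_div_sketch(unsigned a, unsigned b)
{
    return to_word(to_signed(a) / to_signed(b));
}

/* Remainder after signed division; "b" must not be zero. */
static unsigned z_mod_sketch(unsigned a, unsigned b)
{
    return to_word(to_signed(a) % to_signed(b));
}
```

For example, the word 0xFFF5 (-11) divided by 2 yields 0xFFFB (-5)
with remainder 0xFFFF (-1), whereas an unsigned division would
yield 0x7FFA.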

(xxxii) More thoughts

As G. Nelson says, it is very helpful to name routines internally
after the opcodes in the specification. In my source code I added
the prefix "z_" to these routine names which makes it very easy
to distinguish opcode implementations from other routines. This
can be pushed even further by adding the prefix "os_" to the names
of all OS-dependent interface routines.

To make Zip faster, one should consider removing the virtual memory
system. This can result in a Zip version which is twice as fast as
the original. However, the gain depends very much on the OS and the
machine Zip runs on. Basically, fast machines spend most of the CPU
time with output operations, whereas slow machines need a lot of
time for the thinking process. Naturally, small machines may still
require virtual memory as the story files grow in size.

[2] The memory management is not very clever. Zip allocates as much
memory as possible to load the story file. In some cases, this does
not leave sufficient memory for opening files, so saving the game
becomes impossible.

[2] The specification asks interpreter writers to make the addition
of new characters, fonts and opcodes easy.

[2] See also Graham Nelson's notes on error messages (appendix A).