Z-machine Common Save-File Format Standard.
                             also called Quetzal:
          Quetzal Unifies Efficiently The Z-Machine Archive Language
                            version 1.4 (03-Nov-97)

- 1 -   Conventions used within this document, and within the file

1.1     A 'byte' is an 8-bit unsigned quantity.

1.2     A 'word' is a 16-bit unsigned quantity.

1.3     Bitfields are represented as blocks of characters, with the first
       character representing the most significant bit of the byte in
       question. Multi-bit subfields are indicated by using the same character
       multiple times, and values of 0 or 1 indicate that these bits are
       always of the specified value. Therefore a bitfield described as
       010abbcc cccdd111 would be a two-byte bitfield containing four
       subfields (a, of 1 bit; b, of 2 bits; c, of 5 bits; and d, of 2 bits),
       together with a field 'hardwired' to 010 and one to 111.
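
       As an illustration of this notation (a sketch only; the function name
       is invented here and is not part of the standard), the example
       bitfield 010abbcc cccdd111 could be taken apart in Python as follows:

```python
def unpack_example(word):
    """Split a 16-bit value laid out as 010abbcc cccdd111 into (a, b, c, d)."""
    # The top three bits are hardwired to 010, the bottom three to 111.
    if (word >> 13) != 0b010 or (word & 0b111) != 0b111:
        raise ValueError("hardwired bits do not match 010..... .....111")
    a = (word >> 12) & 0b1       # 1 bit
    b = (word >> 10) & 0b11      # 2 bits
    c = (word >> 5) & 0b11111    # 5 bits, straddling the two bytes
    d = (word >> 3) & 0b11       # 2 bits
    return a, b, c, d
```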

1.4     All multi-byte numbers are stored in big-endian form: most significant
       byte first, then in strictly descending order of significance.

1.5     The reader is assumed to already be familiar with the Z-machine;
       in particular its instruction set, memory map and stack conventions.

- 2 -   Overall structure

2.1     For the purposes of flexibility, the overall format will be a new IFF
       type. A standard core is defined, and customised information can be
       stored by specific interpreters in such a way that it can be easily
       read by others. The FORM type is 'IFZS'.

2.2     Several chunks are defined within this document to appear in the IFZS
       FORM.

               Chunk ID        Section
               'IFhd'          5.4
               'CMem'          3.7
               'UMem'          3.8
               'Stks'          4.10
               'IntD'          7.8

2.3     Several chunks may also appear by convention in any IFF FORM:

               Chunk ID        Sections
               'AUTH'          7.2, 7.3
               '(c) '          7.2, 7.4
               'ANNO'          7.2, 7.5

- 3 -   Contents of dynamic memory

3.1     Since the contents of dynamic memory may be anything up to 65534 bytes,
       it is desirable to have some form of compression available as an
       option. Bryan Scattergood's port of ITF uses a method that is both
       elegant and effective, and this is the method adopted.

3.2     The data is compressed by exclusive-oring the current contents of
       dynamic memory with the original (from the original story file). The
       result is then compressed with a simple run-length scheme: a non-zero
       byte in the output represents the byte itself, but a zero byte is
       followed by a length byte, and the pair represent a block of n+1 zero
       bytes, where n is the value of the length byte.
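
       The scheme in 3.2 might be sketched in Python as below. This is an
       illustration under assumptions, not a reference implementation: the
       function name is invented here, and both arguments are assumed to be
       byte strings covering exactly the dynamic memory area.

```python
def compress_memory(current, original):
    """XOR current dynamic memory with the original, then run-length
    encode the zero bytes of the result."""
    if len(current) != len(original):
        raise ValueError("both images must cover the same dynamic memory")
    out = bytearray()
    run = 0  # zero bytes seen but not yet written
    for cur, orig in zip(current, original):
        diff = cur ^ orig
        if diff == 0:
            run += 1
            if run == 256:           # a length byte of 255 means 256 zeroes
                out += b"\x00\xff"
                run = 0
        else:
            if run:
                out += bytes((0, run - 1))
                run = 0
            out.append(diff)         # non-zero bytes stand for themselves
    if run:
        out += bytes((0, run - 1))
    return bytes(out)
```

       For example, a difference pattern of three zeroes, a 5, five zeroes
       and a 7 encodes as the six bytes 00 02 05 00 04 07.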

3.3     It is not necessary to compress optimally, if to do so would be
       difficult. For example, an interpreter that does not store the whole
       of dynamic memory in physical memory may compress a single page at a
       time, ignoring the possibility of a run crossing a page boundary;
       this case can be encoded as two adjacent runs of bytes. It is
       required, however, that interpreters read encoded data even if it does
       not happen to be compressed to their particular page-boundary
       preferences. This is not difficult, requiring merely the maintenance of
       a small amount of state (namely the current run length, if any) across
       page boundaries on a read.

3.4     If the decoded data is shorter than the length of dynamic memory, then
       the missing section is assumed to be a run of zeroes (and hence equal
       to the original contents of that part of dynamic memory). This permits
       the removal of redundant runs at the end of the encoded block; again
       it is not necessary to implement this on writes, but it must be
       understood on reads.

3.5     Two error cases are possible on reads: the decoded data may be larger
       than dynamic memory, and the encoded data may finish with an incomplete
       run (a zero byte without a length byte). These should be dealt with in
       whatever way seems appropriate to the interpreter writer.
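
       A reader honouring 3.3 to 3.5 might look like the following Python
       sketch. The function name is invented for illustration, and raising
       ValueError is only one possible choice for the two error cases, which
       this standard deliberately leaves to the interpreter writer.

```python
def decompress_memory(encoded, original):
    """Decode CMem data and XOR it back onto the original dynamic memory."""
    out = bytearray()
    i = 0
    while i < len(encoded):
        byte = encoded[i]
        i += 1
        if byte != 0:
            out.append(byte)
        else:
            if i >= len(encoded):
                raise ValueError("incomplete run: zero byte without a length byte")
            out += bytes(encoded[i] + 1)   # run of n+1 zero bytes
            i += 1
        if len(out) > len(original):
            raise ValueError("decoded data is larger than dynamic memory")
    # 3.4: a short block is padded with zeroes, i.e. unchanged memory.
    out += bytes(len(original) - len(out))
    return bytes(d ^ o for d, o in zip(out, original))
```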

3.6     Dissenting voices have suggested that compression is unnecessary in
       today's world of cheap storage, and so the format also includes the
       capability to dump the contents of dynamic memory without modification.
       The ability to write such files is optional; the ability to read both
       types is necessary. It is an error for this dump to be shorter or
       longer than the expected length of dynamic memory.

3.7     The IFF chunk used to contain the compressed data has type 'CMem'.
       Its format is as follows:

3.7.1           4 bytes         'CMem'          chunk ID
3.7.2           4 bytes         n               chunk length
3.7.3           n bytes         ...             compressed data as above

3.8     The chunk used to contain the uncompressed data has type 'UMem'. It
       has the format:

3.8.1           4 bytes         'UMem'          chunk ID
3.8.2           4 bytes         n               chunk length
3.8.3           n bytes         ...             simple dump of dynamic memory

- 4 -   Contents of stacks

4.1     One of the biggest differences between current interpreters is how they
       handle the Z-machine's stacks. Conceptually, there are two, but many
       interpreters store both in the same array. This format stores both in
       the same IFF chunk, which has chunk ID 'Stks'.

4.2     The IFF format includes a length field on each chunk, so we can write
       only the used portion of the stacks, to save space. The least recent
       frames on the stacks are saved first, to ensure that the missing part
       appears at the end of the data in the file.

4.3     Each frame has the format:

4.3.1           3 bytes         ...             return PC (byte address)
4.3.2           1 byte          000pvvvv        flags
4.3.3           1 byte          ...             variable number to store result
4.3.4           1 byte          0gfedcba        arguments supplied
4.3.5           1 word          n               number of words of evaluation
                                               stack used by this call
4.3.6           v words         ...             local variables
4.3.7           n words         ...             evaluation stack for this call

4.4     The return PC is a byte offset from the start of the story file.

4.6     The p flag is set on calls made by CALL_xN (discard result), in which
       case the variable number is meaningless (and should be written as a
       zero).

4.7     Assigning each of the possible 7 supplied arguments a letter a-g in
       order, each bit is set if its respective argument is supplied. The
       evaluation stack count allows the reconstruction of the chain of frame
       pointers for all possible stack models. Words on the evaluation stack
       are also stored least recent first.
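
       The frame layout of 4.3 could be written out as in the Python sketch
       below. The function and parameter names are hypothetical, and the
       sketch assumes that supplied arguments are always the first k of the
       seven (so that the argument bits form a contiguous mask).

```python
import struct

def pack_frame(return_pc, result_var, local_vars, eval_stack,
               args_supplied, discard=False):
    """Serialise one stack frame in the layout of 4.3.1 to 4.3.7."""
    if len(local_vars) > 15:
        raise ValueError("the vvvv field holds at most 15 local variables")
    if not 0 <= args_supplied <= 7:
        raise ValueError("between 0 and 7 arguments may be supplied")
    flags = (0x10 if discard else 0) | len(local_vars)    # 000pvvvv
    arg_bits = (1 << args_supplied) - 1                   # 0gfedcba
    header = return_pc.to_bytes(3, "big") + bytes(
        (flags, 0 if discard else result_var, arg_bits))  # 4.6: zero if p set
    words = [len(eval_stack)] + list(local_vars) + list(eval_stack)
    return header + struct.pack(">%dH" % len(words), *words)
```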

4.8     Although some interpreters may impose an arbitrary limit on the size of
       the stacks (such as ZIP's 1024-word total stack size), others may not,
       or may set larger limits. This means that a saved stack dump may be
       larger than the restoring interpreter's own stack. If you cannot
       dynamically resize your stack, you must trap this as an error.

4.9     The stack pointer itself is not stored anywhere in the save file,
       except implicitly, as the top frame on the stack will be the last
       saved.

4.10    The chunk itself is simply a sequence of frames as above:

4.10.1          4 bytes         'Stks'          chunk ID
4.10.2          4 bytes         n               chunk length
4.10.3          n bytes         ...             frames (oldest first)

4.11    In Z-machine versions other than V6 execution starts at an address
       rather than at a routine, and therefore data can be pushed on the
       evaluation stack without anything being on the call stack. Therefore,
       in all versions other than V6 a dummy stack frame must be stored as
       the first frame in the 'Stks' chunk (the oldest frame).

4.11.1  The dummy frame has all fields set to zero except n, the amount
       of evaluation stack used. Note that this may also be zero if the
       game does not use any evaluation stack at the top level.

4.11.2  This frame must be written even if no evaluation stack is used at
       the top level, and therefore interpreters may assume its presence on
       savefiles for V1-5 and V7-8 games.
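
       Reading the body of a 'Stks' chunk back into frames, including the
       dummy frame of 4.11, might be sketched in Python as follows. The
       names and the dictionary representation are assumptions made for
       illustration only.

```python
import struct

def parse_stks(data):
    """Split the body of a 'Stks' chunk into frames, oldest first."""
    frames = []
    pos = 0
    while pos < len(data):
        if pos + 8 > len(data):
            raise ValueError("truncated frame header")
        return_pc = int.from_bytes(data[pos:pos + 3], "big")
        flags, result_var, arg_bits, eval_count = struct.unpack_from(
            ">BBBH", data, pos + 3)
        local_count = flags & 0x0F
        pos += 8
        n_words = local_count + eval_count
        if pos + 2 * n_words > len(data):
            raise ValueError("truncated frame body")
        words = struct.unpack_from(">%dH" % n_words, data, pos)
        pos += 2 * n_words
        frames.append({
            "return_pc": return_pc,
            "discard_result": bool(flags & 0x10),
            "result_var": result_var,
            "arg_bits": arg_bits,
            "locals": list(words[:local_count]),
            "eval_stack": list(words[local_count:]),
        })
    return frames

def is_dummy_frame(frame):
    """True if every field except the evaluation stack is zero (4.11.1)."""
    return (frame["return_pc"] == 0 and not frame["discard_result"]
            and frame["result_var"] == 0 and frame["arg_bits"] == 0
            and not frame["locals"])
```

       For a V1-5 or V7-8 game, a reader could then check that the first
       frame returned satisfies is_dummy_frame before restoring the rest.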

- 5 -   Associated Story File

5.1     We now come to one of the most difficult (yet most important) parts of
       the format: how to find the story file associated with this save file,
       or the related (but easier) problem of checking whether a given save
       file belongs to a given story.

5.2     Considering the easier second problem first, the actual name of the
       story file is often not much use. Firstly, filenames are highly
       dependent on the operating system in use, and secondly, many original
       Infocom story files were called simply 'story.data' or similar.

5.3     The method most existing interpreters use is to compare the variables
       at offsets $2, $12, and $1C in the header (that is, the release number,
       the serial number and the checksum), and refuse to load if they differ.
       These variables are duplicated in the file (since the header will be
       compressed with the rest of dynamic memory).

5.4     This data will be stored in a chunk of type 'IFhd'. This chunk must
       come before the [CU]Mem and Stks chunks to save interpreters the
       trouble of decoding these only to find that the wrong story file is
       loaded. The format is:

5.4.1           4 bytes         'IFhd'          chunk ID
5.4.2           4 bytes         13              chunk length
5.4.3           1 word          ...             release number ($2 in header)
5.4.4           6 bytes         ...             serial number ($12 in header)
5.4.5           1 word          ...             checksum ($1C in header)
5.4.6           3 bytes         ...             PC (see 5.8)

5.5     If the save file belongs to an old game that does not have a checksum,
       it should be calculated in the normal way from the original story file
       when saving. It is possible that a future version of this format may
       have a larger IFhd chunk, but the first 13 bytes will always contain
       this data, and if the other chunks described herein are present they
       will be guaranteed to contain the data specified.

5.6     The first problem (of trying to find a story file given only a save
       file) cannot really be solved in an operating-system independent
       manner, and so there is provision for OS-dependent chunks to handle
       this.

5.7     It should be noted that the current state of the IFhd chunk means
       it has odd length (13 bytes). It should, of course, be written with
       a pad byte (as mentioned in 8.4.1).
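
       Building the 'IFhd' chunk and performing the comparison of 5.3 might
       be sketched in Python as below. The function names are invented here,
       and the sketch assumes the story header already carries a checksum
       (for old games without one, see 5.5).

```python
def story_id(story_header):
    """Release number, serial number and checksum: 10 bytes in all."""
    if len(story_header) < 0x1E:
        raise ValueError("story header too short")
    return (story_header[0x02:0x04] + story_header[0x12:0x18]
            + story_header[0x1C:0x1E])

def build_ifhd(story_header, pc):
    """Return the complete 'IFhd' chunk, including its pad byte."""
    body = story_id(story_header) + pc.to_bytes(3, "big")
    chunk = b"IFhd" + len(body).to_bytes(4, "big") + body
    if len(body) % 2:
        chunk += b"\x00"   # pad byte, not counted in the chunk length
    return chunk

def belongs_to_story(ifhd_body, story_header):
    """Compare the identifying fields of an IFhd body with a story header."""
    return ifhd_body[:10] == story_id(story_header)
```

       Only the first ten bytes are compared, so the check would continue to
       work if a later version of the format extended the chunk.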

5.8     The value of the PC saved in the chunk depends on the version of the
       Z-machine which the story runs on.

5.8.1   On Z-machine versions 3 and below, the SAVE instruction takes a
       branch depending on the success of the save. The saved PC points to
       the one or two bytes which describe this branch.

5.8.2   On versions 4 and above, the SAVE instruction stores a value
       depending on the success of the save. The saved PC points to the single
       byte describing where to store the result.

5.8.3   This behaviour differs from that specified by previous versions of this
       standard, but the behaviour expected there would be difficult to
       implement in existing interpreters. The situation has been complicated
       as the patches available for the Zip interpreter did not correctly
       implement the previous standard; instead, they behaved as specified
       here.

- 6 -   Miscellaneous

6.1     It must be specified exactly what the magic cookie returned by CATCH
       is, since this value can be stored in any random variable, on the
       evaluation stack, or indeed anywhere in memory.

6.2     For greatest independence of internal interpreter implementation, CATCH
       is hereby specified to return the number of frames currently on the
       system stack. This makes THROW slightly inefficient on many
       interpreters (a current frame count can be maintained internally to
       avoid problems with CATCH), but this is unavoidable without using two
       stacks and a fixed-size activation record (always 15 local variables).
       Since most applications of CATCH/THROW do not unwind enormous depths
       (and they are somewhat infrequent), this should not be too much of a
       problem.
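
       A minimal Python sketch of this convention follows. The class is
       hypothetical, and it assumes that the list holds exactly the frames
       that CATCH is meant to count; how a given interpreter arranges that
       is its own affair.

```python
class CallStack:
    """Toy call stack in which CATCH returns the current frame count."""

    def __init__(self):
        self.frames = []

    def call(self, frame):
        self.frames.append(frame)

    def catch(self):
        # The cookie is simply the number of frames now on the stack.
        return len(self.frames)

    def throw(self, cookie):
        # Unwind to the depth recorded by CATCH, then return from the
        # routine that executed it by popping its frame as well.
        if not 1 <= cookie <= len(self.frames):
            raise ValueError("no such stack frame")
        del self.frames[cookie:]
        return self.frames.pop()
```

       Because the cookie is a plain count, it remains meaningful after a
       save and restore, whatever the internal stack layout.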

6.3     The numbers of pictures and sounds do not need specification, since
       they are requested by number by the story file itself.

- 7 -   Extensions to the Format

7.1     One of the advantages of the IFF standard is that extra chunks can be
       added to the format to extend it in various ways. For example, there
       are three standard chunk types defined, namely 'AUTH', '(c) ', and
       'ANNO'.

7.2     'AUTH', '(c) ', and 'ANNO' chunks all contain simple ASCII text
       (all characters in the range 0x20 to 0x7E).

7.2.1   The only indication of the length of this text is the chunk length
       (there is no zero byte termination as in C, for example).

7.2.2   The IFF standard suggests a maximum of 256 characters in this text
       as it may be displayed to the user upon reading, although it could
       get longer if required.

7.3     The 'AUTH' chunk, if present, contains the name of the author or
       creator of the file. This could be a login name on multi-user systems,
       for example. There should only be one such chunk per file.

7.4     The '(c) ' chunk contains the copyright message (date and holder,
       without the actual copyright symbol). This is unlikely to be useful on
       save files. There should only be one such chunk per file.

7.5     The 'ANNO' chunk contains any textual annotation that the user or
       writing program sees fit to include. For save files, interpreters
       could prompt the user for an annotation when saving, and could write
       an ANNO with the score and time for V3 games, or a chunk containing
       the name/version of the interpreter saving it, and many other things.

7.6     The 'ANNO', '(c) ' and 'AUTH' chunks are all user-level information.
       Interpreters must not rely on the presence or absence of these chunks,
       and should not store any internal magic that would not make sense to
       a user in them.

7.7     These chunks should be either ignored or (optionally) displayed to
       the user. '(c) ' chunks should be prefixed with a copyright symbol
       if displayed.

7.8     The save-file may contain interpreter-dependent information. This is
       stored in an 'IntD' chunk, which has format:

7.8.1           4 bytes         'IntD'          chunk ID
7.8.2           4 bytes         n               chunk length
7.8.3           4 bytes         ...             operating system ID
7.8.4           1 byte          000000sc        flags
7.8.5           1 byte          ...             contents ID
7.8.6           2 bytes         0               reserved
7.8.7           4 bytes         ...             interpreter ID
7.8.8           n-12 bytes      ...             data

7.9     The operating system and interpreter IDs are normal IFF 4-character
       IDs in form. Please register IDs used with me <[email protected]>, so
       this can be managed sensibly. They can then be added to future
       versions of this specification, and contents IDs can be assigned.

7.10    If the s flag is set, then the contents are only meaningful on the
       same machine/network on which they were saved. This covers filenames
       and similar things. How to handle checking if this is indeed the same
       machine is an open question, and beyond the scope of this document.
       It is certainly true, however, that if the operating system ID does
       not match the current system and this bit is set, then the chunk
       should not be copied.

7.11    If the c flag is set, the contents should not be copied when loading
       and saving a game--they are only relevant to the exact current
       state of play as stored in the file. The data need not be copied
       even if this flag is clear, but must not be copied if it is set.

7.12    If the interpreter ID is '    ' (four spaces), then the chunk contains
       information useful to *all* interpreters running on a particular
       system. This can store a magical OS-dependent reference to the original
       story file, which need not worry about vagaries of filename handling on
       more than one system. This chunk may contain anything that can be put
       in a file and retrieved intact. If the file is restored on a suitable
       system this can be used to do Good Things.

7.13    If the operating-system ID is '    ', then the chunk contains data
       useful to *all* ports of a particular interpreter. This may or may
       not be useful.

7.14    The interpreter and operating-system IDs may not both be '    '.
       This should not be necessary.

7.15    If neither ID is '    ', the contents are meaningful only to a
       particular port of a particular interpreter. Save-file specific
       preferences probably fall into this category.

7.16    The contents ID will be defined when chunk IDs are picked. Its
       purpose is to allow multiple chunks to be written containing
       different data, which is necessary if they need different settings
       of the c and s flags.
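
       Assembling an 'IntD' chunk as laid out in 7.8 might be sketched in
       Python as follows. The function and parameter names are invented for
       illustration; the check on the two IDs reflects 7.14.

```python
BLANK_ID = b"    "

def build_intd(os_id, interp_id, contents_id, data,
               system_specific=False, no_copy=False):
    """Return a complete 'IntD' chunk, including any pad byte."""
    if len(os_id) != 4 or len(interp_id) != 4:
        raise ValueError("IDs must be exactly four characters")
    if os_id == BLANK_ID and interp_id == BLANK_ID:
        raise ValueError("the two IDs may not both be blank (7.14)")
    # Flags byte 000000sc: s is bit 1, c is bit 0.
    flags = (0x02 if system_specific else 0) | (0x01 if no_copy else 0)
    body = (os_id + bytes((flags, contents_id)) + b"\x00\x00"
            + interp_id + data)
    chunk = b"IntD" + len(body).to_bytes(4, "big") + body
    if len(body) % 2:
        chunk += b"\x00"   # pad byte, not counted in the chunk length
    return chunk
```

       The chunk length written is always 12 plus the length of the data,
       matching the n-12 bytes of data given in 7.8.8.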

7.17    These extensions add no overhead to interpreters which choose not to
       handle them, except for larger save files and more chunks to skip
       when reading files written on another program. Interpreters are not
       expected to preserve these optional chunks when files are re-saved,
       although some may be copied, at the option of the interpreter writer
       or user.

7.18    The only required chunks are 'IFhd', either 'CMem' or 'UMem', and
       'Stks'. The total overhead to a save file is 12 bytes plus 8 for each
       chunk; in the minimal case ('IFhd', '[CU]Mem', 'Stks' = 3 chunks), this
       comes to 36 bytes.

7.19    The following operating system IDs have been registered:

7.19.1          'DOS '          MS-DOS (also PC-DOS, DR-DOS)
7.19.2          'MACS'          Macintosh
7.19.3          'UNIX'          Generic UNIX

7.20    The following interpreter IDs have been registered:

7.20.1          'JZIP'          JZIP, the enhanced ZIP by John Holder

7.21    The following extension chunks have been registered to date:

               System ID       Interp ID       Content ID      Section
7.21.1          'MACS'          '    '          0               7.22

7.22    The following chunk has been registered for MacOS, to enable a
       Macintosh interpreter to find a story file given a save file using
       the System 7 ResolveAlias call. The MacOS alias record can be of
       variable size: the actual size can be calculated from the chunk size.
       Aliases are valid only on the same network as they were saved.

7.22.1          4 bytes         'IntD'          chunk ID
7.22.2          4 bytes         n               chunk length (variable)
7.22.3          4 bytes         'MACS'          operating system ID: MacOS
7.22.4          1 byte          00000010        flags (s set; c clear)
7.22.5          1 byte          0               contents ID
7.22.6          2 bytes         0               reserved
7.22.7          4 bytes         '    '          interpreter ID: any
7.22.8          n-12 bytes      ...             MacOS alias record referencing
                                               the story file; from NewAlias

- 8 -   Introduction to the IFF format.

8.1     This is based on the official IFF standards document, which is rather
       long and contains much that is irrelevant to the task in hand. Feel
       free to mail me if there are errors, inconsistencies, or omissions.
       For the inquisitive, a document containing much of the original
       standard, including the philosophy behind the structure, can be found
       at http://www.cica.indiana.edu/graphics/image_specs/ilbm.format.txt

8.2     IFF stands for "Interchange File Format", and was developed by a
       committee consisting of people from Commodore-Amiga, Electronic Arts
       and Apple. It draws strongly on the Macintosh's concept of resources.

8.3     The most fundamental concept in an IFF file is that of a chunk.
8.3.1   A chunk starts with an ID and a length.
8.3.2   The ID is the concatenation of four ASCII characters in the range 0x20
       to 0x7E.
8.3.3   If spaces are present, they must be the last characters (there
       must be no printing characters after a space).
8.3.4   IDs are compared using a simple 32-bit equality test - note that this
       implies case sensitivity.
8.3.5   The length is a 32-bit unsigned integer, stored in big-endian format
       (most significant byte, then second most, and so on).

8.4     After the ID and length, there follow (length) bytes of data.
8.4.1   If length is odd, these are followed by a single zero byte. This byte
       is *not* included in the chunk length, but it is very important, as
       otherwise many 68000-based readers will crash.
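
       Writing a single chunk according to 8.3 and 8.4 could be sketched in
       Python as below; the function name is an assumption of this example.

```python
def write_chunk(chunk_id, data):
    """Return chunk ID, big-endian length, data and, if needed, a pad byte."""
    if len(chunk_id) != 4 or not all(0x20 <= c <= 0x7E for c in chunk_id):
        raise ValueError("ID must be four ASCII characters in 0x20..0x7E")
    if b" " in chunk_id.rstrip(b" "):
        raise ValueError("spaces may only appear at the end of an ID")
    chunk = chunk_id + len(data).to_bytes(4, "big") + data
    if len(data) % 2:
        chunk += b"\x00"   # pad byte; the stored length stays odd
    return chunk
```

       A three-byte 'ANNO' chunk therefore occupies twelve bytes in the
       file, although its length field reads 3.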

8.5     A simple IFF file (such as the ones we will be considering) consists of
       a *single* chunk of type 'FORM'.
8.5.1   The contents of a FORM chunk start with another 4-character ID.
8.5.2   This ID is also the concatenation of four characters, but these
       characters may only be uppercase letters and trailing spaces. This is
       to allow the FORM sub-ID to be used as a filename extension.

8.6     After the sub-ID comes a concatenation of chunks. The interpretation of
       these chunks depends on the FORM sub-ID (in this proposal, the sub-ID
       is 'IFZS'), except that a few chunk types always have the same meaning
       (notably the 'AUTH', '(c) ' and 'ANNO' chunks described in section 7).
       For reference, the other reserved types are: 'FOR[M1-9]', 'CAT[ 1-9]',
       'LIS[T1-9]', 'TEXT', and '    ' (that is, four spaces).

8.7     Each of these chunks may contain as much data as required, in whatever
       format is required.

8.8     Multiple chunks with the same ID may appear; the interpretation of such
       chunks depends on the chunk. For example, multiple ANNO chunks are
       acceptable, and simply refer to multiple annotations. If more than one
       chunk of a certain type is found when the reader was only expecting
       one (for example, two 'IFhd' chunks), the later chunks should simply
       be ignored (hopefully with a warning to the user).

8.9     Indeed, skipping is the expected procedure for dealing with any unknown
       or unexpected chunk.

8.10    Certain chunks may be compulsory if the FORM is meaningless without
       them. In this case the 'IFhd', '[CU]Mem' and 'Stks' are compulsory.
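
       The reading rules of 8.5 to 8.10 might be combined as in the Python
       sketch below. The function name and the use of ValueError are
       assumptions; the sketch keeps only the compulsory chunks, skips
       everything else, and silently ignores later duplicates where a real
       interpreter might warn the user.

```python
WANTED = (b"IFhd", b"CMem", b"UMem", b"Stks")

def read_ifzs(blob):
    """Return a dict of the compulsory chunks found in an IFZS FORM."""
    if len(blob) < 12 or blob[:4] != b"FORM" or blob[8:12] != b"IFZS":
        raise ValueError("not an IFZS FORM")
    end = 8 + int.from_bytes(blob[4:8], "big")
    if end > len(blob):
        raise ValueError("FORM length exceeds file size")
    chunks = {}
    pos = 12
    while pos + 8 <= end:
        chunk_id = blob[pos:pos + 4]
        length = int.from_bytes(blob[pos + 4:pos + 8], "big")
        start = pos + 8
        if start + length > end:
            raise ValueError("chunk overruns the FORM")
        if chunk_id in WANTED:
            # Keep the first occurrence only; later ones are ignored.
            chunks.setdefault(chunk_id, blob[start:start + length])
        pos = start + length + (length & 1)   # step over any pad byte
    has_memory = b"CMem" in chunks or b"UMem" in chunks
    if b"IFhd" not in chunks or b"Stks" not in chunks or not has_memory:
        raise ValueError("a compulsory chunk is missing")
    return chunks
```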

- 9 -   Resources available

9.1     A set of patches exists for the Zip and Frotz interpreters, adding
       Quetzal support. They can be obtained from:

               http://www.geocities.com/SiliconValley/Vista/6631/

9.2     A utility, 'ckifzs', is available as C source code to check the
       validity of generated save files. A small set of correct Quetzal
       files is also available. Both may be of use in debugging an
       interpreter supporting Quetzal, and both may be obtained from the
       web page mentioned in 9.1.

9.3     This document is updated whenever errors are noticed or new extension
       chunks are registered. The latest text version will always be available
       from the above web page. The latest revision designated stable
       (currently version 1.3) will be in the IF archive, ftp.gmd.de,
       in the directory /if-archive/infocom/interpreters/specification/.

9.4     This document is itself available in a number of forms. The base
       version is this text version, but there is also a PDF version
       (converted by John Holder) and an HTML version (converted by Graham
       Nelson). Links to all of these may be found on the web page.

9.5     A few interpreters support Quetzal; details will appear here as
       they become available.

- 10 -  Credits.

10.1    This standard was created by Martin Frost <[email protected]>. Comments
       and suggestions are always welcome (and any errors in this document
       are entirely my own).

10.2    The following people have contributed with ideas and criticism
       (alphabetic order):

               King Dale               <[email protected]>
               Marnix Klooster         <[email protected]>
               Graham Nelson           <[email protected]>
               Andrew Plotkin          <[email protected]>
               Matthew T. Russotto     <[email protected]>
               Bryan Scattergood       <[email protected]>
               Miron Schmidt           <[email protected]>
               Colin Turnbull          <[email protected]>
               John Wood               <[email protected]>