****************************************************************************
                  UNIF file format specifications
                (Universal NES Image) file format

       Created by Tennessee Carmel-Veilleux ([email protected])
                   REV 7b, November 28th, 2000


***********THIS IS AN OPEN STANDARD. IT IS OPEN TO SUGGESTIONS**************

Overview
--------
The UNIF format is a portable, flexible REPLACEMENT for the .NES standard
(designed by Marat Fayzullin). It is a chunked file format along the lines
of the Amiga IFF (LBM), Microsoft RIFF (WAV) and Autodesk 3D Studio mesh
files (3DS). The goal of having a chunked definition is to provide
flexibility and ease of implementation, as data is described in blocks
with type IDs referring to them and header information that allows
selective reading of the data. The format uses symmetric data conversion
for numerical compatibility between the different platforms' byte ordering.
The ordering used is the 6502 byte order (Intel), so that no more
bickering arises from people who do not appreciate Network Byte Ordering
(Motorola).

***
The extension suggested for use with this format is .UNF (.UNIF under
operating systems which support longer extensions).
***

Byte ordering
-------------
Byte ordering used throughout the file for DWORDs and WORDs is 6502 Byte
order. The 6502 byte order is LSB, least significant byte first
(Little-endian):

3 2 1 0 <- Byte order for MSB (Network Byte order, m68k, PowerPC)
0 1 2 3 <- Byte order for LSB (80x86, 6502, Z80)

Care must be taken to convert the WORDs and DWORDs if you are using a
non-LSB platform (Mac, Amiga, some others).
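
One way to avoid depending on the host's byte order at all is to assemble
each value from its individual bytes. The following is a minimal sketch,
assuming the file contents are available as a byte buffer; the helper names
unif_read_u16le and unif_read_u32le are made up for illustration and are not
part of this specification.

```c
#include <stdint.h>

/* Assemble a little-endian WORD from two bytes (hypothetical helper). */
static uint16_t unif_read_u16le(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Assemble a little-endian DWORD from four bytes (hypothetical helper). */
static uint32_t unif_read_u32le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Because the values are built with shifts rather than by casting a pointer,
the same code should behave identically on LSB and MSB platforms.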

File format
-----------
00h-1Fh : Header
20h-EOF : Chunks

I can not think of an easier format to describe :)

Header
------
The 32-byte header begins with the 4-byte string "UNIF", NOT null-terminated,
case as written. This is for identification purposes, a little like .NES'
"NES^Z". It is followed by the revision number and reserved bytes.

Format: 00h-03h: "UNIF" tag identifier
       04h-07h: DWORD -> Revision number, currently 7
       08h-1Fh: Reserved for future usage

Sample structure:

structure UNIF_header [
  char identification[4]; /* MUST be "UNIF" */
  dword revision; /* Revision number */
  byte expansion[24];
];
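
As a rough sketch of how a reader might validate the header, assuming the
first bytes of the file are already in memory: the function name
unif_parse_header and its return convention are assumptions made for this
example only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UNIF_HEADER_SIZE 32  /* 00h-1Fh */

/* Returns 0 and stores the revision on success, -1 if the buffer is too
   short or does not start with the "UNIF" tag. Hypothetical helper. */
static int unif_parse_header(const unsigned char *buf, size_t len,
                             uint32_t *revision)
{
    if (len < UNIF_HEADER_SIZE || memcmp(buf, "UNIF", 4) != 0)
        return -1;
    /* Revision DWORD at 04h-07h, least significant byte first. */
    *revision = (uint32_t)buf[4]
              | ((uint32_t)buf[5] << 8)
              | ((uint32_t)buf[6] << 16)
              | ((uint32_t)buf[7] << 24);
    /* Bytes 08h-1Fh are reserved and are ignored here. */
    return 0;
}
```

The tag is compared with memcmp rather than strcmp because the "UNIF"
string in the file carries no terminator.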

Chunks
------
Each chunk is composed of 3 distinct elements:
       00h-03h: Chunk ID string
       04h-07h: DWORD -> Block Length of Data
       08h-?? : Data

All the chunks are written sequentially in the file. If you do not understand
a chunk by its ID, simply jump over it using the data length information.
*** ALL THE CHUNKS ARE OPTIONAL ***
That means that there are NO mandatory chunks, and you support only the
ONES YOU WISH, passing over the others while you are interpreting the
file.

Sample structure:

structure UNIF_chunk [
  char chunk_ID[4]; /* Chunk identification string. Can also be considered a
                       number. ASCII format */
  dword length; /* Data length, in little-endian format */
];
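
The "jump over what you do not understand" rule can be sketched as a simple
loop over a buffer holding the whole file. This is only an illustration: the
callback type unif_chunk_fn, the function unif_walk_chunks and the choice to
reject chunks that run past the end of the buffer are assumptions of this
example, not requirements of the format.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Called once per chunk; `id` is 4 bytes and NOT null-terminated. */
typedef void (*unif_chunk_fn)(const unsigned char id[4],
                              const unsigned char *data, uint32_t length,
                              void *user);

/* Walk all chunks following the 32-byte header. Returns the number of
   chunks visited, or -1 if a chunk runs past the end of the buffer. */
static int unif_walk_chunks(const unsigned char *buf, size_t len,
                            unif_chunk_fn fn, void *user)
{
    size_t pos = 32;  /* chunks start at 20h */
    int count = 0;

    while (pos < len) {
        uint32_t clen;
        if (len - pos < 8)
            return -1;  /* truncated chunk header */
        clen = (uint32_t)buf[pos + 4]
             | ((uint32_t)buf[pos + 5] << 8)
             | ((uint32_t)buf[pos + 6] << 16)
             | ((uint32_t)buf[pos + 7] << 24);
        if (clen > len - pos - 8)
            return -1;  /* data would run past the end of the file */
        if (fn)
            fn(buf + pos, buf + pos + 8, clen, user);
        pos += 8 + (size_t)clen;  /* skip the data, understood or not */
        count++;
    }
    return count;
}
```

A callback would typically compare the 4-byte ID against the chunks it
supports and return immediately for anything else; the loop advances by the
stored length either way.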

The different chunks:
---------------------
*******************************************************************************
How chunks are described:
ID field: Contains the 4-character string identifier for the chunk

Length: Length of the block. If it is "??", then the block may have
       variable data size depending on the cartridge.

Revision: First revision in which the chunk appeared. If your reader only
         supports a lower revision, you might be unable to read the chunk;
         simply pass over it. The Revision used by the cart is written in
         the header. The number represents the Revision Number of the most
         recent chunk contained in the file. Example: If you have 5 chunks
         of revision 1 and 2 chunks of revision 4 in the file, the Revision
         number in the header will be 4.
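
For a writer, the rule above amounts to taking the maximum over the chunks
it emits. A possible sketch follows; the revision numbers are copied from
the chunk list in this document, while the function names and the "return 0
for unknown IDs" convention are assumptions of this example.

```c
#include <stddef.h>
#include <string.h>

/* Revision in which a chunk ID first appeared, taken from the chunk list
   in this document. Returns 0 for an ID not listed there. */
static unsigned unif_chunk_revision(const char id[4])
{
    static const struct { char id[5]; unsigned rev; } fixed[] = {
        {"MAPR", 1}, {"READ", 1}, {"NAME", 1}, {"DINF", 2},
        {"BATR", 5}, {"VROR", 5}, {"MIRR", 5}, {"TVCI", 6}, {"CTRL", 7}
    };
    static const struct { char prefix[4]; unsigned rev; } indexed[] = {
        {"PRG", 4}, {"CHR", 4}, {"PCK", 5}, {"CCK", 5}
    };
    size_t i;

    for (i = 0; i < sizeof fixed / sizeof fixed[0]; i++)
        if (memcmp(id, fixed[i].id, 4) == 0)
            return fixed[i].rev;
    /* Indexed families: three letters followed by one hex digit 0-F. */
    for (i = 0; i < sizeof indexed / sizeof indexed[0]; i++)
        if (memcmp(id, indexed[i].prefix, 3) == 0 &&
            ((id[3] >= '0' && id[3] <= '9') ||
             (id[3] >= 'A' && id[3] <= 'F')))
            return indexed[i].rev;
    return 0;
}

/* Header revision for a file: the highest revision among its chunks. */
static unsigned unif_file_revision(const char (*ids)[4], size_t count)
{
    unsigned max = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        unsigned r = unif_chunk_revision(ids[i]);
        if (r > max)
            max = r;
    }
    return max;
}
```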

Description: Complete description of the contents and encoding of the chunk
*******************************************************************************

ID: [MAPR]
Length: ?? (Suggested max: 32 chars)
Revision: 1
Description: This is supplemental information about the mapper. DO NOT USE
            A MAPPER NUMBER HERE ! Rather use the BOARD NAME. There is
            already a list in progress describing each NES cart and the
            board it uses. The string is NULL-TERMINATED.

examples: N,R,O,M,0 -> This is "No mapper"
         U,N,R,O,M,0 -> This is (LS161+LS32)
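
Since the board name is a NULL-terminated string of variable length, a
reader should probably not trust the terminator to be there. A minimal
sketch, assuming the chunk data and its length have already been located;
unif_board_name is a hypothetical helper, and rejecting unterminated names
is a choice made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Copy the board name from a [MAPR] chunk into `out` (capacity `cap`).
   Returns 0 on success, -1 if the data has no terminator inside the
   chunk or the name does not fit. */
static int unif_board_name(const unsigned char *data, size_t length,
                           char *out, size_t cap)
{
    const unsigned char *nul = memchr(data, 0, length);
    size_t n;

    if (nul == NULL)
        return -1;
    n = (size_t)(nul - data);
    if (n + 1 > cap)
        return -1;
    memcpy(out, data, n + 1);  /* includes the terminator */
    return 0;
}
```

With the suggested maximum of 32 characters, a 33-byte output buffer would
be enough for well-behaved files.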

NOTE: This mapper organization suggests that emulators must be rewritten
to emulate the ACTUAL CARTRIDGE HARDWARE, and not just add a case for
another mapper number. That means that for UNROM you have to make:
1- Mapper handler (74LS161+74LS32)
2- CHR-RAM Handler

Those components can be reused, since many boards only have a slight
difference in them compared to similar ones.

**IT SHOULD BE NOTED THAT**: A board name tells you EVERYTHING there is to
know about a board. You do not need other chunks to tell you how much RAM
there is, or if it is VRAM. It is all implied by the board name. A list
will soon be distributed containing board name information.

Address of board table for North American Games and Board Names description:
http://www.parodius.com/~veilleux/boardtable.txt
http://www.parodius.com/~veilleux/boardnames

ID: [READ]
Length: ??
Revision: 1
Description: Commentaries for the user of the ROM image. In the case of a
homebrew game, this can be very useful to store credit and maker
information. *** This could be an "incitement to littering". Please do not
put garbage in there. It is meant for either mapper information or
licensing information for homebrew games.***


ID: [NAME]
Length: ??
Revision: 1
Description: NULL-terminated string containing the name of the game. If not
            present, use the filename as the name.

ID: [TVCI]
Length: BYTE
Revision: 6
Description:  Television Standards Compatibility Information set to:
0- Originally NTSC cartridge
1- Originally PAL cartridge
2- Does not matter
NOTE: ALL North American carts that are dumps of the North American
version are NTSC. All licensed Famicom games are NTSC.

ID: [DINF]
Length: 204
Revision: 2
Description: Dumper information block:
structure dumper_info [

  char dumper_name[100]; /* NULL-terminated string containing the name
                             of the person who dumped the cart. */
  byte day; /* Day of the month when cartridge was dumped */
  byte month; /* Month of the year when cartridge was dumped */
  word year; /* Year during which the cartridge was dumped */
  char dumper_agent[100]; /* NULL-terminated string containing the name of
                             the ROM-dumping means used */
];
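
The 204 bytes break down as 100 + 1 + 1 + 2 + 100. Reading them field by
field, rather than copying the block straight into a compiler-laid-out
struct, avoids padding and byte order surprises. The sketch below assumes
the chunk data is in memory; the struct and function names are illustrative,
and forcing the last byte of each string to zero is a defensive choice of
this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UNIF_DINF_SIZE 204  /* 100 + 1 + 1 + 2 + 100 */

struct unif_dumper_info {
    char     dumper_name[100];
    uint8_t  day;
    uint8_t  month;
    uint16_t year;
    char     dumper_agent[100];
};

/* Decode a [DINF] payload field by field. Returns 0 on success, -1 if
   the length is not 204. */
static int unif_parse_dinf(const unsigned char *data, size_t length,
                           struct unif_dumper_info *out)
{
    if (length != UNIF_DINF_SIZE)
        return -1;
    memcpy(out->dumper_name, data, 100);
    out->dumper_name[99] = '\0';   /* defensive: force termination */
    out->day   = data[100];
    out->month = data[101];
    out->year  = (uint16_t)(data[102] | (data[103] << 8));  /* LSB first */
    memcpy(out->dumper_agent, data + 104, 100);
    out->dumper_agent[99] = '\0';  /* defensive: force termination */
    return 0;
}
```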

ID: [CTRL]
Length: BYTE
Revision: 7
Description: Bitfield containing information about the controllers used by the
cartridge.

Bit 0: Regular Joypad
Bit 1: Zapper
Bit 2: R.O.B
Bit 3: Arkanoid Controller
Bit 4: Power Pad
Bit 5: Four-Score adapter
Bit 6: Expansion (Do not touch)
Bit 7: Expansion (Do not touch)
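
The bit assignments above map naturally onto mask constants. A small sketch,
with constant and function names invented for this example:

```c
#include <stdint.h>

/* One mask per bit of the [CTRL] byte (names are illustrative). */
enum {
    UNIF_CTRL_JOYPAD     = 1u << 0,
    UNIF_CTRL_ZAPPER     = 1u << 1,
    UNIF_CTRL_ROB        = 1u << 2,
    UNIF_CTRL_ARKANOID   = 1u << 3,
    UNIF_CTRL_POWER_PAD  = 1u << 4,
    UNIF_CTRL_FOUR_SCORE = 1u << 5,
    UNIF_CTRL_RESERVED   = (1u << 6) | (1u << 7)  /* expansion bits */
};

/* Non-zero if the cartridge uses the controller identified by `mask`. */
static int unif_ctrl_uses(uint8_t ctrl, unsigned mask)
{
    return (ctrl & mask) != 0;
}

/* Drop the expansion bits so a reader never acts on them. */
static uint8_t unif_ctrl_known_bits(uint8_t ctrl)
{
    return (uint8_t)(ctrl & ~UNIF_CTRL_RESERVED);
}
```

For instance, a light-gun game that also accepts the regular joypad would
carry the value 03h (bits 0 and 1 set).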

ID: [PCK0] through [PCKF]
Length: DWORD
Revision: 5
Description: This block contains a 32-bit CRC which can be used to make
sure that the ROM content matches a checksum when burning on EPROM. This
block provides a checksum for [PRG0] through [PRGF] inclusively

ID: [CCK0] through [CCKF]
Length: DWORD
Revision: 5
Description: This block contains a 32-bit CRC which can be used to make
sure that the ROM content matches a checksum when burning on EPROM. This
block provides a checksum for [CHR0] through [CHRF] inclusively
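
This document does not state which 32-bit CRC is meant. The sketch below
ASSUMES the common CRC-32 used by zlib and PNG (reflected polynomial
EDB88320h, initial value and final XOR of FFFFFFFFh); check against known
good files before relying on it. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 as used by zlib and PNG. ASSUMPTION: this is the CRC
   variant intended by the [PCKx] and [CCKx] chunks. */
static uint32_t unif_crc32(const unsigned char *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    size_t i;
    int bit;

    for (i = 0; i < len; i++) {
        crc ^= data[i];
        for (bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Compare a ROM image against the 4-byte checksum chunk data, which is
   stored as a little-endian DWORD. Returns non-zero on a match. */
static int unif_rom_matches(const unsigned char *rom, size_t rom_len,
                            const unsigned char ck[4])
{
    uint32_t stored = (uint32_t)ck[0]
                    | ((uint32_t)ck[1] << 8)
                    | ((uint32_t)ck[2] << 16)
                    | ((uint32_t)ck[3] << 24);
    return unif_crc32(rom, rom_len) == stored;
}
```

The pairing is by index: the data of [PCK0] would be checked against the
data of [PRG0], [CCK3] against [CHR3], and so on.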

ID: [PRG0] through [PRGF]
Length: ??
Revision: 4
Description: Chunks containing the binary data of the PRG ROM. If there
is more than one PRG chip on the PRG bus, use PRG1, PRG2, PRG3, etc.
The way PRGs are handled depends on the mapper and emulator. Most
generally (99%), only PRG0 is used. (Some carts have been witnessed with 8
PRG ROMs.)

ID: [CHR0] through [CHRF]
Length: ??
Revision: 4
Description: Chunks containing the binary data of the CHR ROM. If there
is more than one CHR chip on the CHR bus, use CHR1, CHR2, CHR3, etc. The
way CHRs are handled depends on the mapper and emulator. Most generally
(99%), only CHR0 is used.
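
The last character of these IDs is a hexadecimal digit selecting one of up
to 16 chips. One possible way to turn an ID into a chip index is sketched
below; unif_indexed_id is a made-up name, and accepting only uppercase A-F
is an assumption based on how the IDs are written in this document.

```c
#include <string.h>

/* If `id` is "<prefix>0" through "<prefix>F", return the index 0-15.
   Otherwise return -1. `prefix` is a 3-letter string such as "PRG". */
static int unif_indexed_id(const unsigned char id[4], const char *prefix)
{
    unsigned char c = id[3];

    if (memcmp(id, prefix, 3) != 0)
        return -1;
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}
```

The same helper would also serve for the [PCKx] and [CCKx] checksum chunks,
since they follow the same naming pattern.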

ID: [BATR]
Length: BYTE
Revision: 5
Description: The presence of this block indicates that the board indeed
contains a battery. This is necessary because many boards have the
capability of a battery (the traces and holes are there), but they only
use RAM and don't add the battery at manufacturing time. Examples:
* SAROM: MMC1B, PRG ROM, CHR ROM, optional 8k of RAM (battery)
* SKROM: MMC1B, PRG ROM, CHR ROM, 8k optional RAM (battery)

Both these boards (SAROM and SKROM) can have a battery, but usually they
don't have it.

ID: [VROR]
Length: BYTE
Revision: 5
Description: This is a VRAM Override. If this chunk is present, then the
CHR-ROM area will be considered as RAM even if ROM is present. This
overrides board identification. This is present so that homemade carts
which use NROM or others and replace the CHR-ROM with CHR-RAM can still be
interpreted (since NROM is always CHR-ROM in commercial games).

ID: [MIRR]
Length: BYTE
Revision: 5
Description: This chunk tells you how the hardwired mirroring is set up on
the board. The board name CANNOT tell you (in most cases) what the
mirroring is, since they all have solder pads to select the mirroring at
manufacturing time. The following values are legal:

* $00 - Horizontal Mirroring (Hard Wired)
* $01 - Vertical Mirroring (Hard Wired)
* $02 - Mirror All Pages From $2000 (Hard Wired)
* $03 - Mirror All Pages From $2400 (Hard Wired)
* $04 - Four Screens of VRAM (Hard Wired)
* $05 - Mirroring Controlled By Mapper Hardware
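
The legal values can be kept in a small lookup table so that anything
outside $00-$05 is detected instead of silently accepted. This is just a
sketch; the enum and function names are not defined by this specification.

```c
#include <stddef.h>

/* Values of the [MIRR] byte, as listed above (names are illustrative). */
enum unif_mirroring {
    UNIF_MIRR_HORIZONTAL  = 0x00,
    UNIF_MIRR_VERTICAL    = 0x01,
    UNIF_MIRR_ALL_2000    = 0x02,
    UNIF_MIRR_ALL_2400    = 0x03,
    UNIF_MIRR_FOUR_SCREEN = 0x04,
    UNIF_MIRR_MAPPER      = 0x05
};

/* Human-readable name for a [MIRR] value, or NULL if it is not legal. */
static const char *unif_mirroring_name(unsigned char value)
{
    static const char *const names[] = {
        "Horizontal (hard wired)",
        "Vertical (hard wired)",
        "All pages from $2000 (hard wired)",
        "All pages from $2400 (hard wired)",
        "Four screens of VRAM (hard wired)",
        "Controlled by mapper hardware"
    };

    if (value > UNIF_MIRR_MAPPER)
        return NULL;
    return names[value];
}
```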

Conclusion
----------
This ends the specification for Revision 7 of UNIF. If you have ANY
suggestions to make regarding the UNIF file format, such as chunk ideas or
modifications, or would like to collaborate on the elaboration and design
process, e-mail me at [email protected].

A multi-platform C Code Library for UNIF support was made by Evan Teran. It
is available at http://www.pretendo.org/~proxy/.

[References]
{.NES file format specifications} by Marat Fayzullin ([email protected])
{NESDEV mailing list} by Various authors
{NES technical documentation} by Jeremy Chadwick ([email protected])

[Credits]
Neal Tew for his neat emulator and great contribution to the NESdev
community.

Jeremy Chadwick ([email protected]) for his contribution to the NESdev
community and great advice over the time.

Mark Knibbs ([email protected]) for his excellent web site as well as his
more than honorable contribution to the NES world

Matthew Conte ([email protected]) for his CajoNES and Nofrendo
programs

Michael Iwaniec ([email protected]) for his interest in UNIF and
constructive criticism.

Kevin Horton ([email protected]) for his proposals and support of
UNIF. He's also been a fantastic help to me in my learning curve of the
NES's hardware aspects.

/Firebug/ ([email protected]) for the ideas brought with NIFF

T. Alex Reed {W1k} ([email protected]) for suggestions of additions
to UNIF. He also pointed out some mistakes in the original specifications.

Evan Teran {PrOxY} ([email protected]) for making suggestions as well as
writing a .NES->UNIF converter and the LIB_UNIF library.