Document: FSC-0065
Version:  001
Date:     02-Aug-1992




                         Type 3 ASCII:  A proposal
                         =========================

                                Mark Kimes
                             FidoNet 1:380/16




Status of this document:

    This FSC suggests a proposed protocol for the FidoNet(r) community,
    and requests discussion and suggestions for improvements.
    Distribution of this document is unlimited.

    Fido and FidoNet are registered marks of Tom Jennings and Fido
    Software.




Introduction:
============

This document describes a type of mail packet called type 3 ASCII.  Type
3 ASCII was designed with how Fidonet Technology Networks (FTNs) handle
mail (netmail, echomail, groupmail) in mind.  It was also designed to
allow new distribution methods to be introduced.  For instance, it is
possible to combine the best of echomail and groupmail methods using
type 3 ASCII packets.  Finally, type 3 ASCII provides reliability, space
and speed advantages over the current mail packet type 2 (see "Type 3
ASCII vs. Type 2" section below).



Packet structure:
================

(See "Definitions" section below for the meaning of any arcane symbols)

Type 3 ASCII packets and archived bundles will ride existing transport
services (mailers) as attached files.  Type 2 mail and type 3 ASCII mail
can both be sent to a node without conflicts.  Naturally, the receiving
node should be able to process type 3 ASCII mail before it is sent.

Type 3 ASCII packets are named <fileroot><.><3KT> when sent to a remote
site. Archives containing type 3 packets are named <fileroot><.><3?A>
when sent to remote sites.  How these files are stored or named locally
is not within the scope of this document.

A type 3 ASCII packet consists of a packet header, followed by a
carriage return, followed by zero or more messages, followed by a NUL.
A type 3 ASCII message consists of a message header, followed by a
carriage return, followed by zero or more characters of message text,
followed by a NUL.

Diagramatically speaking,

       (Text in brackets [] indicates optional data)

 Type 3 ASCII packet:      header
                           <cr>
                           [messagehdr1
                            <cr>
                            [text]
                            NUL
                            messagehdr2
                            <cr>
                            [text]
                            NUL
                            ...
                            messagehdrn
                            <cr>
                            [text]
                            NUL
                           ]
                           NUL


Breakdown:
=========

       (See "Description of Fields" section below for information on
        individual fields.)

 Packet header:
 =============

   <3ASCII><cr>
   From<cr>
   [To]<cr>
   Creator<cr>
   [Password]<cr>
   [Area]<cr>
   [Tag1<sp>data1<cr>]
   [Tag2[<sp>data2]<cr>]
   ...
   [Tagn[<sp>datan]<cr>]

 Message header:
 ==============

   From<cr>
   [To]<cr>
   [Subject]<cr>
   Date<cr>
   [Area]<cr>
   ID<cr>
   [Ref]<cr>
   [Tag1<sp>data1<cr>]
   [Tag2[<sp>data2]<cr>]
   ...
   [Tagn[<sp>datan]<cr>]


 Message body:
 ============

   Free-flowing, NUL-terminated text.  May be composed of any combination
   of ASCII characters > 31 (from the space character, ASCII character
   32, onward) and may include <cr> as a "paragraph terminator."  Systems
   which display message text should wrap long lines to suit their
   application.

   To be in compliance with this document, implementations must be able
   to forward messages with at least 131,072 (128K) characters of text
   (including the terminating NUL).  Network politics may outlaw
   messages of lesser size, but that is beyond the scope of this
   document.  If a compliant implementation encounters a message longer
   than the 128K limit, it may truncate the message text before
   forwarding.  However, since it is easy to support messages of a
   length limited only by available disk space, it is encouraged that
   you do so and not impose artificial restrictions.  The purpose of
   this limit is to guarantee a minimum size that will be passed, _not_
   to restrict implementations to the "minimum."

   Line feeds (ASCII character 10) are reserved and should not normally
   appear in message text.  Future plans call for their use as "escape
   codes."  So called "soft carriage returns" (ASCII character 141)
   should not be contained in transmitted message text unless the actual
   character itself is desired.

   Tabs (ASCII character 9) should not be used in message text as their
   use often leads to unreadable messages.  How many spaces should be
   used at a remote site to represent them?




Description of Fields:
=====================

 Note:  the maximum length of any field line (excluding, of course,
        message text) is 255 characters including the terminating <cr>.
        In practice, a bit of restraint should be practiced to keep
        fields as small as possible.  The maximum length of any header
        is 32767 bytes, including terminating <cr>.  In practice, this
        limit should never be approached.

 Date:
 ====

   YYYYMMDDhhmmss<optional time zone>

   where
     YYYY = year with century, as in 1991 or 2001
     MM   = month, as in 01 to 12
     DD   = day of month, as in 01 or 28
     hh   = hour of day, as in 00 to 23
     mm   = minute of hour, as in 00 to 59
     ss   = second of minute, as in 00 to 59
     <optional time zone> = offset from GMT in 15 min. increments
                            (i.e. "+4" (sans quotes) for GMT + one
                            hour)

   All numbers are represented in decimal.

   Samples:  19990419143200
               (April 19, 1999 at 2:32:00 pm)
             19921223020303+8
               (December 23, 1922 at 02:03:03 GMT + 2 hours)

   The Date field is required.


 From and To:
 ===========

   The From field contains the writer's name followed by a valid FTN
   network address. For the purposes of this document and current
   implementations of type 3 ASCII packets, the format of a valid FTN
   network address is:

     Domain<#>Zone<:>Net</>Node[<.>Point]

     where

       Domain is a text string from 1 to 8 characters in length
       containing only alphabetical [A-Za-z] and/or numerical [0-9]
       characters.

       Zone is a decimal number from 1 to 65533.

       Net is a decimal number from 1 to 65533.

       Node is a decimal number from 0 to 65533.

       Point is a decimal number from 0 to 65535 (may be omitted if 0).

     The FTSC or whatever body guards tech specs may change this
     definition in the future as it sees fit.

   The full format of a type 3 ASCII From or To field is:

   [User Name<@>]Domain<#>Zone<:>Net</>Node[<.>Point]

   If User Name<@> is missing, assume user name is Sysop.  User Name
   may be composed of any combination of ASCII characters > 31 (from
   the space character, ASCII character 32, onward) excluding <@>.

   If <.>point is missing, assume point 0.

   The To field contains the recepient's name and address as above.
   The To field is optional.  If it is missing, message/packet is
   broadcast mail (no definite, single recipient).  In this case there
   must be an area (if the To field is omitted in the packet header,
   there must be an area in the packet header and all messages must be
   broadcast mail for that area.  If omitted in the message header, the
   message or the packet must have an area and message may be displayed
   as being addressed to "All@Anywhere").  A <cr> must still be present
   as a "space holder."  In broadcast mail, it is permissible to give
   only the name of the user (without following address) in a message
   header; however, the name must end with <@> (to distinguish it from
   an address with no User Name).  Note this means a single broadcast
   mail packet can be sent to many nodes.

   The From field is required.

   In the case of From and To fields in the packet header,
   [user name<@>] is probably unimportant.

   In the interests of saving space, domains such as "Fidonet.org"
   should be replaced with just "Fidonet," as the ".org" modifier has
   no meaning to an FTN site.  Domains should be treated case
   insensitively.

   Sample:   John Doe@Fidonet#1:380/16
               (User "John Doe" in domain "Fidonet" zone 1 net 380
                node 16, implied point 0)


 Creator:
 =======

   Name of the product that produced the packet.  This field is
   required.


 Password:
 ========

   A password to use for security.  This field is optional.  If
   omitted, a <cr> must still be present as a "space holder."  How this
   field is used is implementation-defined.


 Subject:
 =======

   The subject field should contain text hinting at the subject of the
   message text.  It may be composed of any combination of ASCII
   characters > 31 (from the space character, ASCII character 32,
   onward).  The subject field is optional.  If omitted, a <cr> must
   still be present as a "space holder."


 Area:
 ====

   Area fields consist of a string of alphanumeric characters plus
   space, "-" and "_" (ASCII characters 32, 45 and 95 respectively).
   Area fields are optional with the following consequences:

     If the area field in a packet header is missing, the messages in
     the packet will have area fields present for broadcast mail,
     omitted for personal mail.

     If the area field in a packet header is present, all the messages
     in the packet will be broadcast mail for the area specified in the
     packet header.  The message area fields will not be present.

   When an area field is omitted, a <cr> must still be present as a
   "space holder."


 ID:
 ==

   An ID consists of the originating address of the message plus a
   serial number, in the form:

     origaddr<sp>serialno

    The originating address should be specified in a form that
    constitutes a valid return address for the originating network. If
    the originating address is enclosed in double-quotes,  the entire
    string between the beginning and ending double-quotes is considered
    to be the orginating address.  A double-quote character within a
    quoted address is represented by by two consecutive double-quote
    characters.  The serial number may be any eight character
    hexadecimal number,  as long as it is unique - no two messages from
    a given system may have the same serial number within one year. The
    manner in which this serial number is generated is left to the
    implementor.

      Notes:  The "old" format of
                Zone<:>Net</>Node[<.>Point][<@>Domain]
              for FTN addresses is allowed in this field.

              The address portion of the ID may be omitted if it is
              exactly the same as the From address (less User Name<@>).
              In this case, the ID field should begin with a space
              followed immediately by the serial number.

              In the case of foreign network addresses, this address
              gives you the "true" origin, and the From address gives
              you the gateway at which the message entered FTN
              territory.  This allows you to gate replies to "foreign"
              sites.


        Samples:  some.other.net.addr ABCDEF12
                   12345ABC

        (Assume From field of second sample contained
         "Joe Blow@Fidonet#1:380/16",so complete constructed ID would be
         Fidonet#1:380/16 12345ABC
         Note address would be copied exactly from the From field.)

   The ID field is required.


 Ref:
 ===

   A Ref consists of the ID of the original message to which this message
   refers (usually as a reply).

        Sample:   Fidonet#1:380/16 12345ABC
                  (would reference the second ID sample above)

   The reference field is optional.  If omitted, a <cr> must still be
   present as a "place holder."


 Tag<sp>Data:
 ===========

   The tag+data lines are type 3 ASCII's method of automatically
   expanding its headers.  A tag consists of a sequence of uppercase
   alphabetic (A-Z inclusive) and/or numeric sequence of characters and
   possibly a hyphen (ASCII character 45) and/or underline (ASCII
   character 95), up to 12 characters in length (a name).  A tag name
   can be followed optionally by a space (ASCII character 32) and data.
   Data may be composed of any combination of ASCII characters > 31
   (from the space character, ASCII character 32, onward).

   To aid in developer experimentation with tags in type 3 ASCII, it is
   guaranteed that the FTSC or whatever body guards tech specs will
   never "canonize" a tag beginning with the two characters "X-" (ASCII
   character 88 followed immediately by ASCII character 45).  Thus,
   tags may use this combination before tag names to guarantee
   uniqueness.

   Experimental tags may be stripped by conforming implementations
   during message passthrough.  This helps prevent experimental tags
   from escaping from test sites.

     Samples (tag names are invented):

       FOLLOW AFILEN.AME
       X-TAG SOMEDATA
       LONETAG

   Tag<sp>data fields are optional and may be completely omitted when
   creating a packet.  Exception:  all tag<sp>data fields except,
   possibly, experimental fields, should be passed through with a
   message being forwarded.

   Predefined tags:
   ===============

   Tag         Where         Data        Meaning
   ---         -----         ----        -------
   PRIV        Msg Hdr       None        Message is private
   FOROK       Pkt Hdr       None        Packet may be forwarded without
                                         unpacking -- all messages are
                                         to the To: address in the packet
                                         header



Type 3 ASCII vs. Type 2:
=======================

Type 3 ASCII saves between 6% to 11% in raw packet size over type 2
(using Tiny Seenbys with the type 2 packets to make the test as fair as
possible), depending on how area tags for echos are used in the type 3
ASCII packet (in packet header vs. message headers).  7% smaller would
be the norm for the way we do echomail business now.  The tests
conducted were most unscientific but should be close to everyday
echomail-oriented reality.

Compressed packets are a slightly different story.  Type 3 ASCII
compresses the same as type 2 when using area tags for echos in the
message headers.  Type 2 compresses approximately 2.5% better when area
tags are used in the type 3 ASCII packet headers instead.  Either way,
compressed type 3 ASCII packets are smaller than comparable type 2
packets due to the smaller raw packet size.  Even compression ratios
would be the norm for the way we do echomail business now.

Type 3 ASCII imports between 2% to 5% faster (depending on algorithms
used).  There is no discernable difference on export.  Keep in mind that
this particular test has too many variables (software, hardware, relative
efficiency of code, etc.) to be considered a real benchmark.  Most of the
speed savings is in not having to process SEEN-BY and PATH lines.  The
lack of end-of-text control information is a real boon.

Type 2 has no method for reliably obtaining the full 5-D origin address
of a message.  Type 3 ASCII provides a reliable method of obtaining
full origin address information for both the true origin (in whatever
network) and the gateway which brought the message into FTN territory
(if from a foreign network).  This means that even if a message
originated in a network with which your software has no idea how to
communicate, you can still send a reply to an FTN node for gating.

Type 2 has no reliable method for stopping dupes.  Type 3 ASCII has a
mandatory ID field, very similar to type 2's optional MSGID, which can
be used for reliable dupe checking.

Type 2 echomail has control information scattered throughout the message
body, including SEEN-BY and PATH information at the end of the message.
This causes problems for developers, who often opt for fixed-length
buffers and arbitrary message length limits.  All control information
for Type 3 ASCII is in the extensible message header.  Moreover, type 3
ASCII has generous set limits to which programmers can work, and which
users can therefore rely on.



Definitions:
===========

 Except where noted otherwise, numbers are in decimal.

 Although the ASCII character set is normally defined as being limited
 to characters from 0 to 127, this document acknowledges the existence
 of an eighth bit in most bytes and uses the term (loosely) to mean
 characters from 0-255.  Network politics may or may not "outlaw" the
 use of some of those bytes; that is outside the scope of this document.

 Note:  text in brackets [] indicates an optional field.  See
        "Definitions" section below for meaning of text in <>.  See
        "Description of Fields" section below for information on
        individual fields.


 Alphabetic:
 ==========

 A-Z and a-z, ASCII characters 65 to 90 and 97 to 122 inclusive.


 Numeric:
 =======

 0-9, ASCII characters 48 to 57 inclusive.


 Alphanumeric:
 ============

 All characters alphabetic and numeric.


 Hexadecimal:
 ===========

 0-9 and A-F (or a-f), ASCII characters 48 to 57 and 65 to 70 (or 97 to
 102) inclusive.


 NUL:
 ===

   ASCII character 0.


 <cr>:
 ====

   Carriage return, ASCII character 13.

 <lf>:
 ====

   Line feed, ASCII character 10.

 <sp>:
 ====

   Space, ASCII character 32.

 <@>:
 ===

   @, ASCII character 64.


 <#>:
 ===

   #, ASCII character 35.

 <:>:
 ===

   :, ASCII character 58.

 </>:
 ===

   /, ASCII character 47.

 <.>:
 ===

   ., ASCII character 46.


 <3ASCII>:
 ========

   The literal string "3ASCII" (not including quotation marks).  This
   text, followed by a <cr>, identifies a type 3 ASCII packet.
   Implementations should *not* processes a file unless this identifier
   is found on the first line, but should probably log the occurrence.


 <fileroot>:
 ==========

   Eight alphanumeric characters that serve as the "root" of a
   filename.


 <3KT>:
 =====

   The literal string "3KT" (not including quotation marks).


 <3?A>:
 =====

   The literal string "3?A" (not including quotation marks) with the
   question mark (?) being replaced by a decimal integer from 0 to
   9 (ASCII 48 to 57 inclusive).




Miscellaneous notes:
===================

jim nutt invented MSGIDs and REPLYids (ref. FTS-0009), which were lifted
very nearly whole to become IDs and Refs in this document.  Tom Jennings
invented Fido and Fidonet <tm and stuff> from whole cloth and RAM chips.
NET_DEV's continual foolishness inspired me to do instead of whine.
Let's see if this cuts down on the whining...


                               -end-



Mark Kimes
1:380/16.0@Fidonet
(318)222-3455 data
542 Merrick
Shreveport, LA, USA  71104