General overview of the program.

(1) We read in the conversion directives and store them.
(2) We read in the font information in the ChiWriter file and match it
up against the information in the conversion directives. A translation
table mapping
ChiWriter file font numbers <---> Conversion file font numbers
is built.
(3) We read in the ChiWriter document, one line at a time, into a buffer.
A ChiWriter line may contain several rows of super/subscripts. The rows
are put into a rectangular buffer (or rather two parallel ones, one for
the font and one for the character code).

+---------------------------------+
|  2    2               2πi/n     |
| x  + y  = 1  if  � = e      and | <-baseRow
|  i    i                         | <-lastRow
+---------------------------------+

(4) Looking only at the buffer rectangle, we convert and write one line
at a time.
First, we apply the block substitutions.

After that we search for the index boundaries. The indices (super- and
subscripts) of the current level must lie in a rectangle. Its left side is
the leftmost column that has a non-empty symbol above or below the current
base row. At the top and bottom it is bounded by the rectangle of the
previous level and by the current base row. The right boundary either lies
just before the first symbol that follows spaces in the current base row,
or is determined by the attributes of the letters in the current base row.
If the leftmost column of this rectangle contains only one symbol, the row
of that symbol becomes the base row of the new rectangle. Otherwise the
rectangle is split into narrower ones along its empty rows, if this option
is specified and a split is possible. Stacked symbols are searched for at
the same time as the index boundaries.
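
The left-boundary rule above can be sketched as follows. This is only an
illustration in Python, not the program's own code: the names make_buffer,
leftmost_index_column and single_symbol_row are hypothetical, only the
character buffer is modelled (the parallel font buffer is left out), and a
blank is assumed to mark an empty cell.

```python
EMPTY = " "  # assumption: an empty buffer cell is stored as a blank


def make_buffer(rows):
    """Pad the rows to a common width, giving a rectangular buffer."""
    width = max(len(r) for r in rows)
    return [list(r.ljust(width, EMPTY)) for r in rows]


def leftmost_index_column(chars, top, bottom, base_row, start_col=0):
    """Return the leftmost column >= start_col holding a non-empty symbol
    in rows top..bottom other than base_row, or None if there is none."""
    for col in range(start_col, len(chars[0])):
        for row in range(top, bottom + 1):
            if row != base_row and chars[row][col] != EMPTY:
                return col
    return None


def single_symbol_row(chars, col, top, bottom):
    """If column col holds exactly one symbol in rows top..bottom, return
    its row (the candidate new base row); otherwise return None."""
    rows = [r for r in range(top, bottom + 1) if chars[r][col] != EMPTY]
    return rows[0] if len(rows) == 1 else None


# A cut-down version of the example rectangle: row 1 is the base row.
buf = make_buffer([" 2    2",
                   "x  + y  = 1",
                   " i    i"])
first = leftmost_index_column(buf, 0, 2, 1)       # column of the first indices
superscript_base = single_symbol_row(buf, first, 0, 0)
```

Splitting along empty rows, the right boundary and the stacked-symbol
search are not covered by this sketch.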

Syntax of ChiWriter Files

\+          superscript row follows
\-          subscript row follows
\=          end of text block

\Hx, \Fx    header/footer, x=D for default, E for even, 1..9 for page 1..9
\S          separator
\Nx         footnote text
\Uxname     x=1..0, !..*, name = font name
            using font# x for the font with the given name

\0 ... \9   font change (1-10)
\! ... \*   font change (11-20)
\           soft space
\,          hard return
\/          no/soft page break
\G          graphics
\A          after soft hyphen
\\          backslash
\@          page number
\^          expanding marker (for centering)
\[          tab
\]x         reverse tab
\Fx         footnote
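
As an illustration of the font-change codes, here is a small Python sketch
that maps the character after the backslash to a font number. The function
name font_from_code is hypothetical. Two points are assumptions rather than
facts from the table: that \1..\9 select fonts 1..9 and \0 selects font 10
(following the "1..0" ordering given for \U), and that "!..*" means the ten
consecutive ASCII characters from ! (33) to * (42).

```python
def font_from_code(ch):
    """Return the font number (1-20) selected by the character that
    follows the backslash, or None if it is not a font-change code."""
    if len(ch) != 1:
        return None
    if ch in "123456789":
        return int(ch)                    # \1 .. \9 -> fonts 1 .. 9 (assumed)
    if ch == "0":
        return 10                         # \0 -> font 10 (assumed)
    if "!" <= ch <= "*":
        return 11 + ord(ch) - ord("!")    # \! .. \* -> fonts 11 .. 20
    return None                           # some other escape, e.g. \+ or \@
```

Under this reading the font codes do not collide with the other escapes in
the table, since +, @ and ^ all lie outside the range ! to *.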

-------------------------------------------------------------------------*/

Syntax of Conversion Map file

Each line of the map file contains a command, starting with a 2 letter
command code. The code is followed by arguments, depending on the command.
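
A minimal sketch of how such a line could be split into its command code
and arguments, written in Python for illustration only. The function name
parse_map_line is hypothetical, and the sketch assumes that arguments are
separated by whitespace and need no quoting, which the description here
does not confirm. Lines starting with ; are treated as comments.

```python
def parse_map_line(line):
    """Split one map-file line into (command code, list of arguments).

    Returns None for blank lines and for ';' comment lines.
    Assumption: arguments are whitespace-separated and unquoted.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith(";"):
        return None
    parts = stripped.split()
    code, args = parts[0], parts[1:]
    if len(code) != 2:
        raise ValueError(f"expected a 2-letter command code, got {code!r}")
    return code, args
```

For example, parse_map_line("FT 3 \\it \\rm") yields the code "FT" with the
arguments "3", "\it" and "\rm".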

CH ch out-string [a] [b] [c] [M] [m] [C nnn]
[P before-string after-string] [B before-string after-string] [N]
The a, b and c options govern the recognition of index boundaries.
a - search after
b - search before
c - don't search between two letters with this option
(even if the a or b option is present)
The P and B options govern the search for stacked symbols.
P - if this symbol is over another one (which lies on the
temporary baseline of the index search) and inside
the current rectangle of this search, then insert
before-string and after-string around the symbol
on the baseline.
B - the same, with "under" instead of "over".
N - don't search for stacked symbols above and under this symbol.
C nnn - denotes the category of this
character for the word-searching algorithm:
0 | l | L means letter
1 | , means punctuation (non-.)
2 | ? means unknown
3 | + means math (default <=>+)
4 | . means dot
5 | ) means )
6 | n | N means nonmath (default ")
7 | ! means non-math letter (as �)
8 | ' means ' (affects single letters;
compare I'm, I'd)
9 | - means -
M(ultiline), m(ultiline) - mark the beginning and end
of a multiline search
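
The category table can be read as a lookup from either the digit or the
symbol to one category number. A Python sketch of that reading follows; the
names CATEGORY_NAMES, CATEGORY_SYMBOLS and parse_category are hypothetical,
and it is an assumption that the argument after C may be given in either
form.

```python
# Category numbers and their meanings, as listed in the table above.
CATEGORY_NAMES = {
    0: "letter", 1: "punctuation (non-.)", 2: "unknown", 3: "math",
    4: "dot", 5: ")", 6: "nonmath", 7: "non-math letter",
    8: "'", 9: "-",
}

# Symbolic spellings of the same categories.
CATEGORY_SYMBOLS = {
    "l": 0, "L": 0, ",": 1, "?": 2, "+": 3, ".": 4,
    ")": 5, "n": 6, "N": 6, "!": 7, "'": 8, "-": 9,
}


def parse_category(arg):
    """Return the category number for a 'C' argument (digit or symbol)."""
    if arg in CATEGORY_SYMBOLS:
        return CATEGORY_SYMBOLS[arg]
    if arg.isdigit() and int(arg) in CATEGORY_NAMES:
        return int(arg)
    raise ValueError(f"unknown character category: {arg!r}")
```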

Block substitution is allowed. It is done BEFORE the sub- and superscript
search, so it affects that search. A block is matched by its upper-left
corner (or by an explicitly specified ^ point), so the search proceeds
recursively from left to right and from top to bottom, without
backtracking. The font of the key letter (the one in the upper-left
corner) must be specified exactly.

The syntax (for a 3x3 block) is

BB [a] [b] [nnnn]
B1 from11 from12 from13
B1 from21 from22 from23
B1 from31 from32 from33
B2 to11 to12 to13
B2 to21 to22 to23
B2 to31 to32 to33
BE

where each fromIJ/toIJ is one of: SP (for space), * (for anything),
CH ft ch, CH * ch, FT ft#

In a block descriptor it is possible to use
^ (as a modifier in the first part
and as a field in the second).
In the first part of the descriptor it marks
the letter from which the search must begin;
in the second it stands for the substitution of the letter
marked by the ^ modifier in the first part.

After BB you can specify the options [a] [b] [nnnn]
a denotes that the baseline must be after the block
b denotes that the baseline must be before the block
nnnn denotes that the baseline must be in the nnnn'th line of the block
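
To make the "from" half of a block descriptor concrete, here is a hedged
Python sketch of matching a pattern against the two parallel buffers. All
names (cell_matches, block_matches, find_block) are hypothetical. Pattern
cells are written as tuples: ("SP",), ("*",), ("CH", ft, ch) with ft
possibly "*", and ("FT", ft). Reading FT ft# as "any non-empty character in
font ft#" is an assumption, as is the row-by-row scan order. The ^ point,
the a/b/nnnn options and the replacement step are left out.

```python
EMPTY = " "  # assumption: an empty buffer cell is stored as a blank


def cell_matches(cell, font, ch):
    """Test one pattern cell against one buffer cell."""
    kind = cell[0]
    if kind == "*":
        return True                       # matches anything
    if kind == "SP":
        return ch == EMPTY                # matches a space
    if kind == "CH":
        _, want_font, want_ch = cell
        return ch == want_ch and want_font in ("*", font)
    if kind == "FT":
        return ch != EMPTY and cell[1] == font   # assumed meaning
    raise ValueError(f"unknown pattern cell: {cell!r}")


def block_matches(chars, fonts, pattern, row, col):
    """True if pattern matches with its upper-left corner at (row, col)."""
    for i, pattern_row in enumerate(pattern):
        for j, cell in enumerate(pattern_row):
            r, c = row + i, col + j
            if r >= len(chars) or c >= len(chars[r]):
                return False              # block would leave the buffer
            if not cell_matches(cell, fonts[r][c], chars[r][c]):
                return False
    return True


def find_block(chars, fonts, pattern):
    """Scan top to bottom, left to right; return the first match or None."""
    for row in range(len(chars)):
        for col in range(len(chars[row])):
            if block_matches(chars, fonts, pattern, row, col):
                return row, col
    return None


# Two dashes stacked on top of each other, in any font.
chars = [list("a-b"), list(" - ")]
fonts = [[1, 1, 1], [1, 1, 1]]
pattern = [[("CH", "*", "-")], [("CH", "*", "-")]]
hit = find_block(chars, fonts, pattern)
```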

RC ch - character to denote Fakereturn

TC ch - character to denote FakeTAB

RT CH ft# ch - character to replace HardReturn (will be decoded later)

TA CH ft# ch - character to replace HardTAB (will be decoded later)

CT CH ft# ch - character to replace page counter (will be decoded later)

CE CH ft# ch - character to replace centering sign (will be decoded later)

SE str1 str2 - strings to surround the separator block

FR str1 str2
str3 - strings to surround the footer block (str2 is inserted after
the footer number)

HE str1 str2
str3 - strings to surround the header block (str2 is inserted after
the header number)

FN str1 str2 str3 ft#
- the same strings as for header and footer, plus the
font that replaces the reference (will be decoded later)

TX str1 str2 - strings to surround a block in a text-type font inside math

MU str1 str2 str3
- strings to begin, to separate the rows of, and to end a multi-row
deciphering session (which begins and ends
at characters with the 'M' and 'm' attributes)

MB str1 str2 str3
- strings to begin, to separate the rows of, and to end a multi-row
deciphering session in subscripts

MP str1 str2 str3
- strings to begin, to separate the rows of, and to end a multi-row
deciphering session in superscripts

MC nnn - maximal length of block substitution cycle at one location

FT xx in-string out-string [ t | p ]
't' denotes that this pattern is a text pattern;
inside math it must be surrounded by the additional
TX strings
'p' is for a phantom font, which can later be translated
into math or text

VE ... - this line will be printed

; ... - comment line - not inside the block!

SP ch - for fakeSPACE symbol

EM ch - for fakeEmpty symbol, that denotes empty string

ES ch - for Escape symbol, that denotes begin and end of special fonts

FO ... multiple references to patterns are allowed in FO strings

AS ft1 ft2 - makes the character substitution table of ft1 the same as that of ft2
(no changes afterwards, please: both fonts refer to the same table)

LL xx - line length

MA xx - number of the font with plain in- and out-strings for math

OR xx - number of the font without in- and out-strings for ends of paragraphs

RN ftfrom ftto - renames font ftfrom to ftto in math words (i.e., in short
words outside the nonmath list)

NW - to input NEW nonmath list - for new font

NM word - nonmath list (in alphabetical order)

EN - end of nonmath list

TI ft1 ch1 ft2 ch2 ... - list of tie characters; affects the deciphering
of punctuation marks and short words

Notes.

FO fontname font# ...
to declare a font. The numbers should be between 1 and 20.
They are used in other directives to avoid constantly writing out the full font
name. If the numbers differ from the order of fonts in the ChiWriter
file, they are automatically rearranged.
e.g. FO ITALIC 3
FO GREEK 7
FO LINEDRAW 8
FO MATHII 10
STANDARD should always be 1.
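
The rearrangement can be pictured as a lookup by font name. The Python
sketch below is an illustration under assumptions, not the program's code:
the function name build_font_translation is hypothetical, font names are
assumed to match exactly, and raising an error for a font that has no FO
line is a choice made for the sketch only.

```python
def build_font_translation(file_fonts, map_fonts):
    """Build the table: ChiWriter file font number -> map file font number.

    file_fonts: {file font number: font name}, as declared by the \\U lines.
    map_fonts:  {font name: map font number}, as declared by FO directives.
    """
    table = {}
    for file_no, name in file_fonts.items():
        if name not in map_fonts:
            # Assumption: an undeclared font is treated as an error here.
            raise KeyError(f"font {name!r} has no FO directive")
        table[file_no] = map_fonts[name]
    return table


# The file lists GREEK second, while the map file numbers it 7.
file_fonts = {1: "STANDARD", 2: "GREEK", 3: "ITALIC"}
map_fonts = {"STANDARD": 1, "ITALIC": 3, "GREEK": 7}
translation = build_font_translation(file_fonts, map_fonts)
```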

CH font# char-replacement ...
e.g. CH 7 a \alpha
to replace all Greek "a" by "\alpha"
If the font number is 0, the replacement is done for all fonts.
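
A sketch of the lookup this implies, in Python. The name replacement_for
and the table layout are hypothetical, the entry for "%" is an invented
example, and giving a font-specific entry precedence over a font 0 entry is
an assumption.

```python
def replacement_for(table, font, ch):
    """Return the out-string for character ch in the given font.

    table maps (font number, character) to an out-string; entries under
    font 0 apply to every font. Unlisted characters are passed through.
    """
    if (font, ch) in table:
        return table[(font, ch)]      # assumed to win over a font 0 entry
    if (0, ch) in table:
        return table[(0, ch)]
    return ch


table = {
    (7, "a"): "\\alpha",   # CH 7 a \alpha
    (0, "%"): "\\%",       # hypothetical entry that applies to all fonts
}
greek_a = replacement_for(table, 7, "a")
plain_a = replacement_for(table, 1, "a")
```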

FT font# On-command Off-command ...
to toggle a font on/off.
e.g. FT 3 \it \rm
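
The toggling can be sketched as follows, assuming the text arrives as runs
of (font number, text). The name apply_font_toggles is hypothetical, and
emitting the off-command of the old font before the on-command of the new
one is an assumption about the order. In the example a trailing blank is
added to each command so that the TeX control words stay separated from the
text.

```python
def apply_font_toggles(runs, toggles, default_font=1):
    """Join text runs, inserting the on/off commands at font changes.

    runs:    list of (font number, text) pairs in document order.
    toggles: {font number: (on-command, off-command)} from FT directives.
    """
    out = []
    current = default_font
    for font, text in runs:
        if font != current:
            if current in toggles:
                out.append(toggles[current][1])   # leave the old font
            if font in toggles:
                out.append(toggles[font][0])      # enter the new font
            current = font
        out.append(text)
    if current in toggles:
        out.append(toggles[current][1])           # close the last font
    return "".join(out)


toggles = {3: ("\\it ", "\\rm ")}                 # FT 3 \it \rm
line = apply_font_toggles([(1, "a "), (3, "b "), (1, "c")], toggles)
```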

To embed a blank into a TeX code word, use the character with code 254
(shown as ■ in IBM code page 437), or the character specified by the SP
directive. Fascinating trivia fact: On some European keyboards, this
code is produced when hitting the "umlaut" key and [Space].