LET'S BUILD A COMPILER!

                               By

                    Jack W. Crenshaw, Ph.D.

                          3 June 1989


                Part XI: LEXICAL SCAN REVISITED


*****************************************************************
*                                                               *
*                        COPYRIGHT NOTICE                       *
*                                                               *
*   Copyright (C) 1989 Jack W. Crenshaw. All rights reserved.   *
*                                                               *
*****************************************************************


INTRODUCTION

I've got some  good news and some bad news.  The bad news is that
this installment is  not  the  one  I promised last time.  What's
more, the one after this one won't be, either.

The good news is the reason for this installment:  I've  found  a
way  to simplify and improve the lexical  scanning  part  of  the
compiler.  Let me explain.


BACKGROUND

If  you'll remember, we talked at length  about  the  subject  of
lexical  scanners in Part VII, and I left you with a design for a
distributed scanner that I felt was about as simple as I could
make it ... certainly simpler than most I've seen elsewhere.  We used
that idea in Part X.  The compiler structure  that  resulted  was
simple, and it got the job done.

Recently, though, I've begun  to  have  problems, and they're the
kind that send a message that you might be doing something wrong.

The  whole thing came to a head when I tried to address the issue
of  semicolons.  Several people have asked  me  about  them,  and
whether or not KISS will have them separating the statements.  My
intention has been NOT to  use semicolons, simply because I don't
like them and, as you can see, they have not proved necessary.

But I know that many of you, like me, have  gotten  used to them,
and so  I  set  out  to write a short installment to show you how
they could easily be added, if you were so inclined.

Well, it  turned  out  that  they weren't easy to add at all.  In
fact it was darned difficult.

I guess I should have  realized that something was wrong, because
of the issue  of  newlines.    In the last couple of installments
we've addressed that issue,  and  I've shown you how to deal with
newlines with a  procedure called, appropriately enough, NewLine.
In  TINY  Version  1.0,  I  sprinkled calls to this procedure  in
strategic spots in the code.

It  seems  that  every time I've addressed the issue of newlines,
though,  I've found it to be tricky,  and  the  resulting  parser
turned out to be quite fragile ... one addition or  deletion here
or  there and things tended to go to pot.  Looking back on it,  I
realize that  there  was  a  message  in  this that I just wasn't
paying attention to.

When I tried to add semicolons  on  top of the newlines, that was
the last straw.   I ended up with much too complex a solution.  I
began to realize that something fundamental had to change.

So,  in  a  way this installment will cause us to backtrack a bit
and revisit the issue of scanning all over again.    Sorry  about
that.  That's the price you pay for watching me  do  this in real
time.  But the new version is definitely an improvement, and will
serve us well for what is to come.

As  I said, the scanner we used in Part X was about as simple  as
one can get.  But anything can be improved.   The  new scanner is
more like the classical  scanner,  and  not  as simple as before.
But the overall  compiler  structure is even simpler than before.
It's also more robust, and easier to add  to  and/or  modify.   I
think that's worth the time spent in this digression.  So in this
installment, I'll be showing  you  the  new  structure.  No doubt
you'll  be  happy  to  know  that, while the changes affect  many
procedures, they aren't very profound  and so we lose very little
of what's been done so far.

Ironically, the new scanner  is  much  more conventional than the
old one, and is very much like the more generic scanner  I showed
you  earlier  in  Part VII.  Then I started trying to get clever,
and I almost clevered myself clean out of business.   You'd think
one day I'd learn: K-I-S-S!


THE PROBLEM

The problem begins to show  itself in procedure Block, which I've
reproduced below:


{--------------------------------------------------------------}
{ Parse and Translate a Block of Statements }

procedure Block;
begin
  Scan;
  while not(Token in ['e', 'l']) do begin
     case Token of
      'i': DoIf;
      'w': DoWhile;
      'R': DoRead;
      'W': DoWrite;
     else Assignment;
     end;
     Scan;
  end;
end;
{--------------------------------------------------------------}


As  you   can  see,  Block  is  oriented  to  individual  program
statements.  At each pass through  the  loop, we know that we are
at  the beginning of a statement.  We exit the block when we have
scanned an END or an ELSE.

But suppose that we see a semicolon instead.   The  procedure  as
it's shown above can't handle that, because procedure Scan expects,
and can only accept, tokens that begin with a letter.

I tinkered around for quite a while to come up with a fix.  I
found many possible approaches, but none were very satisfying.  I
finally figured out the reason.

Recall that when we started with our single-character parsers, we
adopted a convention that the lookahead character would always be
prefetched.    That   is,   we  would  have  the  character  that
corresponds to our  current  position in the input stream fetched
into the global character Look, so that we could  examine  it  as
many  times  as  needed.    The  rule  we  adopted was that EVERY
recognizer, if it found its target token, would  advance  Look to
the next character in the input stream.
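
The character-level Match from the Cradle is the archetype of that
rule: if the current character is the one we want, eat it and fetch
the next one.  As a reminder, it looked something like this:


{--------------------------------------------------------------}
{ Match a Specific Input Character }

procedure Match(x: char);
begin
  if Look = x then GetChar          { found it ... advance Look }
  else Expected('''' + x + '''');
end;
{--------------------------------------------------------------}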

That simple and fixed convention served us very well when  we had
single-character tokens, and it still does.  It would make  a lot
of sense to apply the same rule to multi-character tokens.

But when we got into lexical scanning, I began  to  violate  that
simple rule.  The scanner of Part X  did  indeed  advance  to the
next token if it found an identifier or keyword, but it DIDN'T do
that if it found a carriage return, a whitespace character, or an
operator.

Now, that sort of mixed-mode  operation gets us into deep trouble
in procedure Block, because whether or not the  input  stream has
been advanced depends upon the kind of token we  encounter.    If
it's  a keyword or the target of  an  assignment  statement,  the
"cursor," as defined by the contents of Look,  has  been advanced
to  the next token OR to the beginning of whitespace.  If, on the
other  hand,  the  token  is  a  semicolon,  or if we have hit  a
carriage return, the cursor has NOT advanced.

Needless to say, we can add enough logic  to  keep  us  on track.
But it's tricky, and makes the whole parser very fragile.

There's a much better way: just adopt the same rule that's worked
so well before, and apply it to TOKENS as well as to single
characters.  In other words, we'll prefetch tokens just as we've
always done for characters.  It seems so obvious once you think
about it that way.

Interestingly enough, if we do things this way the problem that
we've had with newline characters goes away.  We can just lump
them in as whitespace characters, which makes the handling of
newlines trivial, and MUCH less error-prone than what we've had
to deal with in the past.


THE SOLUTION

Let's begin to fix the problem by re-introducing the two
token-fetching procedures, GetName and GetNum:

{--------------------------------------------------------------}
{ Get an Identifier }

procedure GetName;
begin
  SkipWhite;
  if Not IsAlpha(Look) then Expected('Identifier');
  Token := 'x';
  Value := '';
  repeat
     Value := Value + UpCase(Look);
     GetChar;
  until not IsAlNum(Look);
end;


{--------------------------------------------------------------}
{ Get a Number }

procedure GetNum;
begin
  SkipWhite;
  if not IsDigit(Look) then Expected('Number');
  Token := '#';
  Value := '';
  repeat
     Value := Value + Look;
     GetChar;
  until not IsDigit(Look);
end;
{--------------------------------------------------------------}


These two procedures are  functionally  almost  identical  to the
ones  I  showed  you in Part VII.  They each  fetch  the  current
token, either an identifier or a number, into  the  global string
Value.    They  also  set  the  encoded  version, Token,  to  the
appropriate code.  The input  stream is left with Look containing
the first character NOT part of the token.
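
Just to pin the convention down, here's the state after each call,
for a couple of made-up inputs (hypothetical, for illustration
only):

     Input "count = 12":  after GetName, Token = 'x', Value =
                          'COUNT', and Look holds the blank.

     Input "12+x":        after GetNum, Token = '#', Value = '12',
                          and Look holds the '+'.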

We  can do the same thing  for  operators,  even  multi-character
operators, with a procedure such as:


{--------------------------------------------------------------}
{ Get an Operator }

procedure GetOp;
begin
  Token := Look;
  Value := '';
  repeat
     Value := Value + Look;
     GetChar;
  until IsAlpha(Look) or IsDigit(Look) or IsWhite(Look);
end;
{--------------------------------------------------------------}

Note  that  GetOp  returns,  as  its  encoded  token,  the  FIRST
character of the operator.  This is important,  because  it means
that we can now use that single character to  drive  the  parser,
instead of the lookahead character.

We need to tie these  procedures together into a single procedure
that can handle all three  cases.  The  following  procedure will
read any one of the token types and always leave the input stream
advanced beyond it:


{--------------------------------------------------------------}
{ Get the Next Input Token }

procedure Next;
begin
  SkipWhite;
  if IsAlpha(Look) then GetName
  else if IsDigit(Look) then GetNum
  else GetOp;
end;
{--------------------------------------------------------------}


***NOTE  that  here  I have put SkipWhite BEFORE the calls rather
than after.  This means that, in general, the variable  Look will
NOT have a meaningful value in it, and therefore  we  should  NOT
use it as a test value for parsing, as we have been doing so far.
That's the big departure from our normal approach.

Now, remember that before I was careful not to treat the carriage
return (CR) and line  feed  (LF) characters as white space.  This
was  because,  with  SkipWhite  called  as the last thing in  the
scanner, the encounter with  LF  would  trigger a read statement.
If we were on the last line of the program,  we  couldn't get out
until we input another line with a non-white  character.   That's
why I needed the second procedure, NewLine, to handle the CRLF's.
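
In case you don't have TINY Version 1.0 in front of you, NewLine
looked something like this:


{--------------------------------------------------------------}
{ Skip Over an End-of-Line }

procedure NewLine;
begin
  while Look = CR do begin
     GetChar;
     if Look = LF then GetChar;
     SkipWhite;
  end;
end;
{--------------------------------------------------------------}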

But now, with the call  to SkipWhite coming first, that's exactly
the behavior we want.    The  compiler  must know there's another
token coming or it wouldn't be calling Next.  In other words,  it
hasn't found the terminating  END  yet.  So we're going to insist
on more data until we find something.

All this means that we can greatly simplify both the  program and
the concepts, by treating CR and LF as whitespace characters, and
eliminating NewLine.  You  can  do  that  simply by modifying the
function IsWhite:


{--------------------------------------------------------------}
{ Recognize White Space }

function IsWhite(c: char): boolean;
begin
  IsWhite := c in [' ', TAB, CR, LF];
end;
{--------------------------------------------------------------}


We've already tried similar routines in Part VII,  but  you might
as well try these new ones out.  Add them to a copy of the Cradle
and call Next with the following main program:


{--------------------------------------------------------------}
{ Main Program }

begin
  Init;
  repeat
     Next;
     WriteLn(Token, ' ', Value);
  until Token = '.';
end.
{--------------------------------------------------------------}


Compile  it and verify that you can separate  a  program  into  a
series of tokens, and that you get the right  encoding  for  each
token.

This ALMOST works,  but  not  quite.    There  are  two potential
problems:    First,  in KISS/TINY almost all of our operators are
single-character operators.  The only exceptions  are  the relops
>=, <=, and <>.  It seems a shame to treat all operators as
strings and do a string compare, when a single-character compare
will almost always suffice.  Second, and much more
important, the  thing  doesn't  WORK  when  two  operators appear
together, as in (a+b)*(c+d).  Here the string following 'b' would
be interpreted as a single operator ")*(."

It's possible to fix that problem.  For example,  we  could  just
give GetOp a  list  of  legal  characters, and we could treat the
parentheses as different operator types  than  the  others.   But
this begins to get messy.

Fortunately, there's a  better  way that solves all the problems.
Since almost  all the operators are single characters, let's just
treat  them  that  way, and let GetOp get only one character at a
time.  This not only simplifies GetOp, but also speeds  things up
quite a  bit.    We  still have the problem of the relops, but we
were treating them as special cases anyway.

So here's the final version of GetOp:


{--------------------------------------------------------------}
{ Get an Operator }

procedure GetOp;
begin
  SkipWhite;
  Token := Look;
  Value := Look;
  GetChar;
end;
{--------------------------------------------------------------}


Note that I still give the string Value a value.  If you're truly
concerned about efficiency, you could leave this out.  When we're
expecting an operator, we will only be testing  Token  anyhow, so
the  value of the string won't matter.  But to me it seems to  be
good practice to give the thing a value just in case.

Try  this  new  version with some realistic-looking  code.    You
should  be  able  to  separate  any program into  its  individual
tokens, with the  caveat  that the two-character relops will scan
into two separate tokens.  That's OK ... we'll  parse  them  that
way.
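
For example, fed the fragment  a <= b , the test program above
should print something like:

     x A
     < <
     = =
     x B

The relational procedures Less and Greater in the listing below
pick up the trailing '=' (or '>') and handle it as a special case.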

Now, in Part VII the function of Next was combined with procedure
Scan,  which  also  checked every identifier against  a  list  of
keywords and encoded each one that was found.  As I  mentioned at
the time, the last thing we would want  to  do  is  to use such a
procedure in places where keywords  should not appear, such as in
expressions.  If we  did  that, the keyword list would be scanned
for every identifier appearing in the code.  Not good.

The  right  way  to  deal  with  that  is  to simply separate the
functions  of  fetching  tokens and looking for  keywords.    The
version of Scan shown below  does NOTHING but check for keywords.
Notice that it operates on the current token and does NOT advance
the input stream.


{--------------------------------------------------------------}
{ Scan the Current Identifier for Keywords }

procedure Scan;
begin
  if Token = 'x' then
     Token := KWcode[Lookup(Addr(KWlist), Value, NKW) + 1];
end;
{--------------------------------------------------------------}


There is one last detail.  In the compiler there are a few places
where we must actually check the string value of the token.
Mainly, this  is done to distinguish between the different END's,
but there are a couple  of  other  places.    (I  should  note in
passing that we could always  eliminate the need for matching END
characters by encoding each one  to a different character.  Right
now we are definitely taking the lazy man's route.)
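
Just to make that alternative concrete ... we are NOT going to use
it ... it would amount to little more than giving ENDIF and
ENDWHILE their own letters in KWcode, plus matching changes to the
exit test in Block and to DoIf and DoWhile.  The letters 'f' and
'E' below are arbitrary choices, not anything from the real
compiler:


{--------------------------------------------------------------}
{ NOT USED: distinct codes for ENDIF and ENDWHILE               }
{ IF ELSE ENDIF WHILE ENDWHILE READ WRITE VAR END               }

const KWcode: string[NKW1] = 'xilfwERWve';
{--------------------------------------------------------------}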

The  following  version  of MatchString takes the  place  of  the
character-oriented Match.  Note that, like Match, it DOES advance
the input stream.


{--------------------------------------------------------------}
{ Match a Specific Input String }

procedure MatchString(x: string);
begin
  if Value <> x then Expected('''' + x + '''');
  Next;
end;
{--------------------------------------------------------------}


FIXING UP THE COMPILER

Armed with these new scanner procedures, we can now begin  to fix
the compiler to use them properly.  The changes are all minor,
but they're needed in quite a few places.  Rather than showing
you each place, I will give you
the general idea and then just give the finished product.


First of all, the code for procedure Block doesn't change, though
its function does:


{--------------------------------------------------------------}
{ Parse and Translate a Block of Statements }

procedure Block;
begin
  Scan;
  while not(Token in ['e', 'l']) do begin
     case Token of
      'i': DoIf;
      'w': DoWhile;
      'R': DoRead;
      'W': DoWrite;
     else Assignment;
     end;
     Scan;
  end;
end;
{--------------------------------------------------------------}


Remember that the new version of Scan doesn't  advance  the input
stream, it only  scans  for  keywords.   The input stream must be
advanced by each procedure that Block calls.

In general, we have to replace every test on Look with  a similar
test on Token.  For example:


{---------------------------------------------------------------}
{ Parse and Translate a Boolean Expression }

procedure BoolExpression;
begin
  BoolTerm;
  while IsOrOp(Token) do begin
     Push;
     case Token of
      '|': BoolOr;
      '~': BoolXor;
     end;
  end;
end;
{--------------------------------------------------------------}


In procedures like Add, we don't  have  to use Match anymore.  We
need only call Next to advance the input stream:


{--------------------------------------------------------------}
{ Recognize and Translate an Add }

procedure Add;
begin
  Next;
  Term;
  PopAdd;
end;
{-------------------------------------------------------------}


Control  structures  are  actually simpler.  We just call Next to
advance over the control keywords:


{---------------------------------------------------------------}
{ Recognize and Translate an IF Construct }

procedure Block; Forward;

procedure DoIf;
var L1, L2: string;
begin
  Next;
  BoolExpression;
  L1 := NewLabel;
  L2 := L1;
  BranchFalse(L1);
  Block;
  if Token = 'l' then begin
     Next;
     L2 := NewLabel;
     Branch(L2);
     PostLabel(L1);
     Block;
  end;
  PostLabel(L2);
  MatchString('ENDIF');
end;
{--------------------------------------------------------------}


That's about the extent of the REQUIRED changes.  In  the listing
of TINY  Version  1.1  below,  I've  also  made a number of other
"improvements" that  aren't really required.  Let me explain them
briefly:

(1)  I've deleted the two procedures Prog and Main, and combined
     their functions into the main program.  They didn't seem to
     add  to program clarity ... in fact  they  seemed  to  just
     muddy things up a little.

(2)  I've  deleted  the  keywords  PROGRAM  and  BEGIN  from the
     keyword list.  Each  one  only occurs in one place, so it's
     not necessary to search for it.

(3)  Having been  bitten  by  an  overdose  of  cleverness, I've
     reminded myself that TINY  is  supposed  to be a minimalist
     program.  Therefore I've  replaced  the  fancy  handling of
     unary minus with the dumbest one I could think of.  A giant
     step backwards in code quality, but a  great simplification
     of the compiler.  KISS is the right place to use  the other
     version.

(4)  I've added some  error-checking routines such as CheckTable
     and CheckDup, and  replaced  in-line code by calls to them.
     This cleans up a number of routines.

(5)  I've  taken  the  error  checking  out  of  code generation
     routines  like Store, and put it in  the  parser  where  it
     belongs.  See Assignment, for example.

(6)  There was an error in InTable and Locate  that  caused them
     to search all locations  instead  of  only those with valid
     data  in them.  They now search only  valid  cells.    This
     allows us to eliminate  the  initialization  of  the symbol
     table, which was done in Init.

(7)  Procedure AddEntry now has two  arguments,  which  helps to
     make things a bit more modular.

(8)  I've cleaned up the  code  for  the relational operators by
     the addition of the  new  procedures  CompareExpression and
     NextExpression.

(9)  I fixed an error in the Read routine ... the earlier version
     did not check for a valid variable name.


CONCLUSION

The resulting compiler for  TINY  is given below.  Other than the
removal  of  the  keyword PROGRAM, it parses the same language as
before.    It's  just  a  bit cleaner, and more importantly  it's
considerably more robust.  I feel good about it.

The next installment will be another  digression:  the discussion
of  semicolons  and  such that got me into this mess in the first
place.  THEN we'll press on  into  procedures and types.  Hang in
there with me.  The addition of those features will go a long way
towards removing KISS from  the  "toy  language" category.  We're
getting very close to being able to write a serious compiler.


TINY VERSION 1.1


{--------------------------------------------------------------}
program Tiny11;

{--------------------------------------------------------------}
{ Constant Declarations }

const TAB = ^I;
     CR  = ^M;
     LF  = ^J;

     LCount: integer = 0;
     NEntry: integer = 0;


{--------------------------------------------------------------}
{ Type Declarations }

type Symbol = string[8];

    SymTab = array[1..1000] of Symbol;

    TabPtr = ^SymTab;


{--------------------------------------------------------------}
{ Variable Declarations }

var Look : char;             { Lookahead Character }
   Token: char;             { Encoded Token       }
   Value: string[16];       { Unencoded Token     }


const MaxEntry = 100;

var ST   : array[1..MaxEntry] of Symbol;
   SType: array[1..MaxEntry] of char;


{--------------------------------------------------------------}
{ Definition of Keywords and Token Types }

const NKW =   9;
     NKW1 = 10;

const KWlist: array[1..NKW] of Symbol =
             ('IF', 'ELSE', 'ENDIF', 'WHILE', 'ENDWHILE',
              'READ', 'WRITE', 'VAR', 'END');

const KWcode: string[NKW1] = 'xileweRWve';


{--------------------------------------------------------------}
{ Read New Character From Input Stream }

procedure GetChar;
begin
  Read(Look);
end;

{--------------------------------------------------------------}
{ Report an Error }

procedure Error(s: string);
begin
  WriteLn;
  WriteLn(^G, 'Error: ', s, '.');
end;


{--------------------------------------------------------------}
{ Report Error and Halt }

procedure Abort(s: string);
begin
  Error(s);
  Halt;
end;


{--------------------------------------------------------------}
{ Report What Was Expected }

procedure Expected(s: string);
begin
  Abort(s + ' Expected');
end;

{--------------------------------------------------------------}
{ Report an Undefined Identifier }

procedure Undefined(n: string);
begin
  Abort('Undefined Identifier ' + n);
end;


{--------------------------------------------------------------}
{ Report a Duplicate Identifier }

procedure Duplicate(n: string);
begin
  Abort('Duplicate Identifier ' + n);
end;


{--------------------------------------------------------------}
{ Check to Make Sure the Current Token is an Identifier }

procedure CheckIdent;
begin
  if Token <> 'x' then Expected('Identifier');
end;


{--------------------------------------------------------------}
{ Recognize an Alpha Character }

function IsAlpha(c: char): boolean;
begin
  IsAlpha := UpCase(c) in ['A'..'Z'];
end;


{--------------------------------------------------------------}
{ Recognize a Decimal Digit }

function IsDigit(c: char): boolean;
begin
  IsDigit := c in ['0'..'9'];
end;


{--------------------------------------------------------------}
{ Recognize an AlphaNumeric Character }

function IsAlNum(c: char): boolean;
begin
  IsAlNum := IsAlpha(c) or IsDigit(c);
end;


{--------------------------------------------------------------}
{ Recognize an Addop }

function IsAddop(c: char): boolean;
begin
  IsAddop := c in ['+', '-'];
end;


{--------------------------------------------------------------}
{ Recognize a Mulop }

function IsMulop(c: char): boolean;
begin
  IsMulop := c in ['*', '/'];
end;


{--------------------------------------------------------------}
{ Recognize a Boolean Orop }

function IsOrop(c: char): boolean;
begin
  IsOrop := c in ['|', '~'];
end;


{--------------------------------------------------------------}
{ Recognize a Relop }

function IsRelop(c: char): boolean;
begin
  IsRelop := c in ['=', '#', '<', '>'];
end;


{--------------------------------------------------------------}
{ Recognize White Space }

function IsWhite(c: char): boolean;
begin
  IsWhite := c in [' ', TAB, CR, LF];
end;


{--------------------------------------------------------------}
{ Skip Over Leading White Space }

procedure SkipWhite;
begin
  while IsWhite(Look) do
     GetChar;
end;


{--------------------------------------------------------------}
{ Table Lookup }

function Lookup(T: TabPtr; s: string; n: integer): integer;
var i: integer;
   found: Boolean;
begin
  found := false;
  i := n;
  while (i > 0) and not found do
     if s = T^[i] then
        found := true
     else
        dec(i);
  Lookup := i;
end;


{--------------------------------------------------------------}
{ Locate a Symbol in Table }
{ Returns the index of the entry.  Zero if not present. }

function Locate(N: Symbol): integer;
begin
  Locate := Lookup(@ST, n, NEntry);
end;


{--------------------------------------------------------------}
{ Look for Symbol in Table }

function InTable(n: Symbol): Boolean;
begin
  InTable := Lookup(@ST, n, NEntry) <> 0;
end;


{--------------------------------------------------------------}
{ Check to See if an Identifier is in the Symbol Table         }
{ Report an error if it's not. }


procedure CheckTable(N: Symbol);
begin
  if not InTable(N) then Undefined(N);
end;


{--------------------------------------------------------------}
{ Check the Symbol Table for a Duplicate Identifier }
{ Report an error if identifier is already in table. }


procedure CheckDup(N: Symbol);
begin
  if InTable(N) then Duplicate(N);
end;


{--------------------------------------------------------------}
{ Add a New Entry to Symbol Table }

procedure AddEntry(N: Symbol; T: char);
begin
  CheckDup(N);
  if NEntry = MaxEntry then Abort('Symbol Table Full');
  Inc(NEntry);
  ST[NEntry] := N;
  SType[NEntry] := T;
end;


{--------------------------------------------------------------}
{ Get an Identifier }

procedure GetName;
begin
  SkipWhite;
  if Not IsAlpha(Look) then Expected('Identifier');
  Token := 'x';
  Value := '';
  repeat
     Value := Value + UpCase(Look);
     GetChar;
  until not IsAlNum(Look);
end;


{--------------------------------------------------------------}
{ Get a Number }

procedure GetNum;
begin
  SkipWhite;
  if not IsDigit(Look) then Expected('Number');
  Token := '#';
  Value := '';
  repeat
     Value := Value + Look;
     GetChar;
  until not IsDigit(Look);
end;


{--------------------------------------------------------------}
{ Get an Operator }

procedure GetOp;
begin
  SkipWhite;
  Token := Look;
  Value := Look;
  GetChar;
end;


{--------------------------------------------------------------}
{ Get the Next Input Token }

procedure Next;
begin
  SkipWhite;
  if IsAlpha(Look) then GetName
  else if IsDigit(Look) then GetNum
  else GetOp;
end;


{--------------------------------------------------------------}
{ Scan the Current Identifier for Keywords }

procedure Scan;
begin
  if Token = 'x' then
     Token := KWcode[Lookup(Addr(KWlist), Value, NKW) + 1];
end;


{--------------------------------------------------------------}
{ Match a Specific Input String }

procedure MatchString(x: string);
begin
  if Value <> x then Expected('''' + x + '''');
  Next;
end;


{--------------------------------------------------------------}
{ Output a String with Tab }

procedure Emit(s: string);
begin
  Write(TAB, s);
end;


{--------------------------------------------------------------}
{ Output a String with Tab and CRLF }

procedure EmitLn(s: string);
begin
  Emit(s);
  WriteLn;
end;


{--------------------------------------------------------------}
{ Generate a Unique Label }

function NewLabel: string;
var S: string;
begin
  Str(LCount, S);
  NewLabel := 'L' + S;
  Inc(LCount);
end;


{--------------------------------------------------------------}
{ Post a Label To Output }

procedure PostLabel(L: string);
begin
  WriteLn(L, ':');
end;


{---------------------------------------------------------------}
{ Clear the Primary Register }

procedure Clear;
begin
  EmitLn('CLR D0');
end;


{---------------------------------------------------------------}
{ Negate the Primary Register }

procedure Negate;
begin
  EmitLn('NEG D0');
end;


{---------------------------------------------------------------}
{ Complement the Primary Register }

procedure NotIt;
begin
  EmitLn('NOT D0');
end;


{---------------------------------------------------------------}
{ Load a Constant Value to Primary Register }

procedure LoadConst(n: string);
begin
  Emit('MOVE #');
  WriteLn(n, ',D0');
end;


{---------------------------------------------------------------}
{ Load a Variable to Primary Register }

procedure LoadVar(Name: string);
begin
  if not InTable(Name) then Undefined(Name);
  EmitLn('MOVE ' + Name + '(PC),D0');
end;


{---------------------------------------------------------------}
{ Push Primary onto Stack }

procedure Push;
begin
  EmitLn('MOVE D0,-(SP)');
end;


{---------------------------------------------------------------}
{ Add Top of Stack to Primary }

procedure PopAdd;
begin
  EmitLn('ADD (SP)+,D0');
end;


{---------------------------------------------------------------}
{ Subtract Primary from Top of Stack }

procedure PopSub;
begin
  EmitLn('SUB (SP)+,D0');
  EmitLn('NEG D0');
end;


{---------------------------------------------------------------}
{ Multiply Top of Stack by Primary }

procedure PopMul;
begin
  EmitLn('MULS (SP)+,D0');
end;


{---------------------------------------------------------------}
{ Divide Top of Stack by Primary }

procedure PopDiv;
begin
  EmitLn('MOVE (SP)+,D7');
  EmitLn('EXT.L D7');
  EmitLn('DIVS D0,D7');
  EmitLn('MOVE D7,D0');
end;


{---------------------------------------------------------------}
{ AND Top of Stack with Primary }

procedure PopAnd;
begin
  EmitLn('AND (SP)+,D0');
end;


{---------------------------------------------------------------}
{ OR Top of Stack with Primary }

procedure PopOr;
begin
  EmitLn('OR (SP)+,D0');
end;


{---------------------------------------------------------------}
{ XOR Top of Stack with Primary }

procedure PopXor;
begin
  EmitLn('EOR (SP)+,D0');
end;


{---------------------------------------------------------------}
{ Compare Top of Stack with Primary }

procedure PopCompare;
begin
  EmitLn('CMP (SP)+,D0');
end;


{---------------------------------------------------------------}
{ Set D0 If Compare was = }

procedure SetEqual;
begin
  EmitLn('SEQ D0');
  EmitLn('EXT D0');
end;


{---------------------------------------------------------------}
{ Set D0 If Compare was != }

procedure SetNEqual;
begin
  EmitLn('SNE D0');
  EmitLn('EXT D0');
end;


{---------------------------------------------------------------}
{ Set D0 If Compare was > }

procedure SetGreater;
begin
  { This and the next three "Set" routines look reversed because  }
  { PopCompare's CMP computed the SECOND operand minus the FIRST. }
  EmitLn('SLT D0');
  EmitLn('EXT D0');
end;


{---------------------------------------------------------------}
{ Set D0 If Compare was < }

procedure SetLess;
begin
  EmitLn('SGT D0');
  EmitLn('EXT D0');
end;


{---------------------------------------------------------------}
{ Set D0 If Compare was <= }

procedure SetLessOrEqual;
begin
  EmitLn('SGE D0');
  EmitLn('EXT D0');
end;


{---------------------------------------------------------------}
{ Set D0 If Compare was >= }

procedure SetGreaterOrEqual;
begin
  EmitLn('SLE D0');
  EmitLn('EXT D0');
end;


{---------------------------------------------------------------}
{ Store Primary to Variable }

procedure Store(Name: string);
begin
  EmitLn('LEA ' + Name + '(PC),A0');
  EmitLn('MOVE D0,(A0)')
end;


{---------------------------------------------------------------}
{ Branch Unconditional  }

procedure Branch(L: string);
begin
  EmitLn('BRA ' + L);
end;


{---------------------------------------------------------------}
{ Branch False }

procedure BranchFalse(L: string);
begin
  EmitLn('TST D0');
  EmitLn('BEQ ' + L);
end;


{---------------------------------------------------------------}
{ Read Variable to Primary Register }

procedure ReadIt(Name: string);
begin
  EmitLn('BSR READ');
  Store(Name);
end;


{--------------------------------------------------------------}
{ Write from Primary Register }

procedure WriteIt;
begin
  EmitLn('BSR WRITE');
end;


{--------------------------------------------------------------}
{ Write Header Info }

procedure Header;
begin
  WriteLn('WARMST', TAB, 'EQU $A01E');
end;


{--------------------------------------------------------------}
{ Write the Prolog }

procedure Prolog;
begin
  PostLabel('MAIN');
end;


{--------------------------------------------------------------}
{ Write the Epilog }

procedure Epilog;
begin
  EmitLn('DC WARMST');
  EmitLn('END MAIN');
end;


{---------------------------------------------------------------}
{ Allocate Storage for a Static Variable }

procedure Allocate(Name, Val: string);
begin
  WriteLn(Name, ':', TAB, 'DC ', Val);
end;


{---------------------------------------------------------------}
{ Parse and Translate a Math Factor }

procedure BoolExpression; Forward;

procedure Factor;
begin
  if Token = '(' then begin
     Next;
     BoolExpression;
     MatchString(')');
     end
  else begin
     if Token = 'x' then
        LoadVar(Value)
     else if Token = '#' then
        LoadConst(Value)
     else Expected('Math Factor');
     Next;
  end;
end;


{--------------------------------------------------------------}
{ Recognize and Translate a Multiply }

procedure Multiply;
begin
  Next;
  Factor;
  PopMul;
end;


{-------------------------------------------------------------}
{ Recognize and Translate a Divide }

procedure Divide;
begin
  Next;
  Factor;
  PopDiv;
end;


{---------------------------------------------------------------}
{ Parse and Translate a Math Term }

procedure Term;
begin
  Factor;
  while IsMulop(Token) do begin
     Push;
     case Token of
      '*': Multiply;
      '/': Divide;
     end;
  end;
end;


{--------------------------------------------------------------}
{ Recognize and Translate an Add }

procedure Add;
begin
  Next;
  Term;
  PopAdd;
end;


{-------------------------------------------------------------}
{ Recognize and Translate a Subtract }

procedure Subtract;
begin
  Next;
  Term;
  PopSub;
end;


{---------------------------------------------------------------}
{ Parse and Translate an Expression }

procedure Expression;
begin
  if IsAddop(Token) then
     Clear
  else
     Term;
  while IsAddop(Token) do begin
     Push;
     case Token of
      '+': Add;
      '-': Subtract;
     end;
  end;
end;


{---------------------------------------------------------------}
{ Get Another Expression and Compare }

procedure CompareExpression;
begin
  Expression;
  PopCompare;
end;


{---------------------------------------------------------------}
{ Get The Next Expression and Compare }

procedure NextExpression;
begin
  Next;
  CompareExpression;
end;


{---------------------------------------------------------------}
{ Recognize and Translate a Relational "Equals" }

procedure Equal;
begin
  NextExpression;
  SetEqual;
end;


{---------------------------------------------------------------}
{ Recognize and Translate a Relational "Less Than or Equal" }

procedure LessOrEqual;
begin
  NextExpression;
  SetLessOrEqual;
end;


{---------------------------------------------------------------}
{ Recognize and Translate a Relational "Not Equals" }

procedure NotEqual;
begin
  NextExpression;
  SetNEqual;
end;


{---------------------------------------------------------------}
{ Recognize and Translate a Relational "Less Than" }

procedure Less;
begin
  Next;
  case Token of
    '=': LessOrEqual;
    '>': NotEqual;
  else begin
          CompareExpression;
          SetLess;
       end;
  end;
end;


{---------------------------------------------------------------}
{ Recognize and Translate a Relational "Greater Than" }

procedure Greater;
begin
  Next;
  if Token = '=' then begin
     NextExpression;
     SetGreaterOrEqual;
     end
  else begin
     CompareExpression;
     SetGreater;
  end;
end;


{---------------------------------------------------------------}
{ Parse and Translate a Relation }


procedure Relation;
begin
  Expression;
  if IsRelop(Token) then begin
     Push;
     case Token of
      '=': Equal;
      '<': Less;
      '>': Greater;
     end;
  end;
end;


{---------------------------------------------------------------}
{ Parse and Translate a Boolean Factor with Leading NOT }

procedure NotFactor;
begin
  if Token = '!' then begin
     Next;
     Relation;
     NotIt;
     end
  else
     Relation;
end;


{---------------------------------------------------------------}
{ Parse and Translate a Boolean Term }

procedure BoolTerm;
begin
  NotFactor;
  while Token = '&' do begin
     Push;
     Next;
     NotFactor;
     PopAnd;
  end;
end;


{--------------------------------------------------------------}
{ Recognize and Translate a Boolean OR }

procedure BoolOr;
begin
  Next;
  BoolTerm;
  PopOr;
end;


{--------------------------------------------------------------}
{ Recognize and Translate an Exclusive Or }

procedure BoolXor;
begin
  Next;
  BoolTerm;
  PopXor;
end;


{---------------------------------------------------------------}
{ Parse and Translate a Boolean Expression }

procedure BoolExpression;
begin
  BoolTerm;
  while IsOrOp(Token) do begin
     Push;
     case Token of
      '|': BoolOr;
      '~': BoolXor;
     end;
  end;
end;


{--------------------------------------------------------------}
{ Parse and Translate an Assignment Statement }

procedure Assignment;
var Name: string;
begin
  CheckTable(Value);
  Name := Value;
  Next;
  MatchString('=');
  BoolExpression;
  Store(Name);
end;


{---------------------------------------------------------------}
{ Recognize and Translate an IF Construct }

procedure Block; Forward;

procedure DoIf;
var L1, L2: string;
begin
  Next;
  BoolExpression;
  L1 := NewLabel;
  L2 := L1;
  BranchFalse(L1);
  Block;
  if Token = 'l' then begin
     Next;
     L2 := NewLabel;
     Branch(L2);
     PostLabel(L1);
     Block;
  end;
  PostLabel(L2);
  MatchString('ENDIF');
end;


{--------------------------------------------------------------}
{ Parse and Translate a WHILE Statement }

procedure DoWhile;
var L1, L2: string;
begin
  Next;
  L1 := NewLabel;
  L2 := NewLabel;
  PostLabel(L1);
  BoolExpression;
  BranchFalse(L2);
  Block;
  MatchString('ENDWHILE');
  Branch(L1);
  PostLabel(L2);
end;


{--------------------------------------------------------------}
{ Read a Single Variable }

procedure ReadVar;
begin
  CheckIdent;
  CheckTable(Value);
  ReadIt(Value);
  Next;
end;


{--------------------------------------------------------------}
{ Process a Read Statement }

procedure DoRead;
begin
  Next;
  MatchString('(');
  ReadVar;
  while Token = ',' do begin
     Next;
     ReadVar;
  end;
  MatchString(')');
end;


{--------------------------------------------------------------}
{ Process a Write Statement }

procedure DoWrite;
begin
  Next;
  MatchString('(');
  Expression;
  WriteIt;
  while Token = ',' do begin
     Next;
     Expression;
     WriteIt;
  end;
  MatchString(')');
end;


{--------------------------------------------------------------}
{ Parse and Translate a Block of Statements }

procedure Block;
begin
  Scan;
  while not(Token in ['e', 'l']) do begin
     case Token of
      'i': DoIf;
      'w': DoWhile;
      'R': DoRead;
      'W': DoWrite;
     else Assignment;
     end;
     Scan;
  end;
end;


{--------------------------------------------------------------}
{ Allocate Storage for a Variable }

procedure Alloc;
begin
  Next;
  if Token <> 'x' then Expected('Variable Name');
  CheckDup(Value);
  AddEntry(Value, 'v');
  Allocate(Value, '0');
  Next;
end;


{--------------------------------------------------------------}
{ Parse and Translate Global Declarations }

procedure TopDecls;
begin
  Scan;
  while Token = 'v' do begin
     Alloc;
     while Token = ',' do
        Alloc;
     Scan;
  end;
end;


{--------------------------------------------------------------}
{ Initialize }

procedure Init;
begin
  GetChar;
  Next;
end;


{--------------------------------------------------------------}
{ Main Program }

begin
  Init;
  MatchString('PROGRAM');
  Header;
  TopDecls;
  MatchString('BEGIN');
  Prolog;
  Block;
  MatchString('END');
  Epilog;
end.
{--------------------------------------------------------------}
*****************************************************************
*                                                               *
*                        COPYRIGHT NOTICE                       *
*                                                               *
*   Copyright (C) 1989 Jack W. Crenshaw. All rights reserved.   *
*                                                               *
*****************************************************************