Lab 7 - Doing Math in the Correct Order

 Introduction
 ============
 You have certainly come a long way!  That AVL tree was pretty tough.
 This time around, we are going to build a parse tree.  Really, parse
 trees aren't too difficult to build, so long as you have a good set
 of pseudocode (from your class notes) and a good understanding of
 what trees look like in memory.

 In this lab you will be building a parse tree using our btree.h
 file.  You will not use the AVL tree for this, as this is a
 different type of tree.  The final product should be able to take in
 a mathematical expression in standard notation and compute the
 correct answer according to the standard order of operations.

 In order to do this, you will need a lexer.  The lexer is actually
 the hard part of this, so I have provided that for you.  Also,
 you'll need a healthy dose of tree iterators, which we wrote in
 class.  I've included those in the guided part as well.

 Once you are done with this, you will have the beginnings of a small
 compiler!  Add in a little code generation, and you're set to write
 your own programming language.

 So now, with that bit of a pep talk, let's dig in.


 The Lexer and Tree Iterators
 ============================
 The first order of business is to create a lexer.  A lexer is just a
 function or program which decomposes a string into lexical elements.
 That way you can reason about things like the order of operations in
 the tree, and not worry about how you handle multiple characters in
 a number, or decimal points, etc.  I have written a very simple
 lexer which I want you to copy and try out.  This one can decompose
 a string into double values and into operators.  It also sets the
 precedence level according to the order of operations, where the
 higher the level, the deeper it goes in the tree.

 Let's start with the header file:

 --begin lexer.h--
    1  /* A simple lexer for mathematical statements */
    2
    3  #ifndef LEXER_H
    4  #define LEXER_H
    5  #include <string>
    6  #include <iostream>
    7
    8  struct LexicalElement {
    9    enum {NUM, OP, UNKNOWN} type;       //element type
   10    std::string             text;       //element text
   11    double                  val;        //element numeric value
   12    enum {MUL, DIV, MOD, ADD, SUB} op;  //operator value
   13    int                     p;          //precedence level
   14  };
   15
   16
   17  LexicalElement nextElement(std::istream &is);
   18  #endif
 --end lexer.h--

 As you can see, the interface is pretty straightforward.  The
 LexicalElement structure contains all the information we need to
 process the statement one element at a time.  The nextElement
 function returns the next lexical element from an istream.  In our
 case, we will be using cin as the istream.  The implementation file
 for this function follows:


 --begin lexer.cpp--
    1  #include "lexer.h"
    2  #include <cctype>
    3  #include <cstdlib>
    4  #include <iostream>
    5
    6  using namespace std;
    7
    8  //static helper functions
    9  static bool isoperator(char c);
   10  static void handleOperator(LexicalElement &l);
   11  static void handleNumber(LexicalElement &l);
   12
   13  /* returns the next lexical element in is */
   14  LexicalElement nextElement(istream &is) {
   15    LexicalElement result;  //the resulting element
   16    bool done = false;      //indicates if we are done
   17    bool decimal;           //indicates if we have hit a decimal
   18    char c;                 //the current character
   19
   20    result.type = LexicalElement::UNKNOWN;
   21
   22    //scan until we've encountered all the characters
   23    while(!done) {
   24      //peek at the character
   25      c = cin.peek();
   26
   27      //start a type if needed
   28      if(result.type == LexicalElement::UNKNOWN) {
   29        if(isdigit(c) || c == '.') {
   30          result.type = LexicalElement::NUM;
   31          decimal = false;  //no decimal seen yet!
   32        } else if (isoperator(c)) {
   33          result.type = LexicalElement::OP;
   34        } else if(isspace(c)) {
   35          cin.get();  //consume spaces
   36        } else {
   37          done = true;
   38        }
   39      }
   40
   41      //consume characters (if needed)
   42      if(result.type == LexicalElement::OP) {
   43        if(isoperator(c)) {
   44          c = cin.get();
   45          result.text += c;
   46        } else {
   47          done = true;
   48        }
   49      } else if(result.type == LexicalElement::NUM) {
   50        if(c == '.' && !decimal) {
   51          //we get just one decimal point!
   52          decimal = true;
   53          c = cin.get();
   54          result.text += c;
   55        } else if(isdigit(c)) {
   56          c = cin.get();
   57          result.text += c;
   58        } else {
   59          done = true;
   60        }
   61      }
   62    }
   63
   64    if(result.type==LexicalElement::OP)
   65      handleOperator(result);
   66    else if(result.type==LexicalElement::NUM)
   67      handleNumber(result);
   68
   69    return result;
   70  }
   71
   72
   73  //static helper functions
   74  static bool
   75  isoperator(char c) {
   76    const char *operators = "*/%+-";
   77
   78    while(*operators) {
   79      if(*operators == c)
   80        return true;
   81        operators++;
   82      }
   83
   84    return false;
   85  }
   86
   87
   88  static void
   89  handleOperator(LexicalElement &l) {
   90    //assign appropriate symbol and precedence
   91    switch(l.text[0]) {
   92    case '*':
   93      l.op = LexicalElement::MUL;
   94      l.p  = 2;
   95      break;
   96    case '/':
   97      l.op = LexicalElement::DIV;
   98      l.p = 2;
   99      break;
  100    case '%':
  101      l.op = LexicalElement::MOD;
  102      l.p = 2;
  103      break;
  104    case '+':
  105      l.op = LexicalElement::ADD;
  106      l.p = 1;
  107      break;
  108    case '-':
  109      l.p = 1;
  110      break;
  111    }
  112  }
  113
  114  static void
  115  handleNumber(LexicalElement &l) {
  116    l.p = 3;
  117    l.val = atof(l.text.c_str());
  118  }
 --end lexer.cpp--

 Study this set of functions to get a feel for how the lexer works.
 Really it's all about state and consuming the right kinds of
 characters.  We won't go into a detailed explanation of how it works
 here, largely because there should be enough comments in the code to
 show you how it does what it does.  The final step is to build a
 little test application:

 --begin lexerTest.cpp--
    1  /*
    2   * A small test program for the lexer.
    3   */
    4
    5  #include <iostream>
    6  #include "lexer.h"
    7
    8
    9  using namespace std;
   10
   11  int main(void) {
   12    LexicalElement l;
   13
   14    //run until the end of the line
   15    while(cin.peek() != '\n') {
   16      l = nextElement(cin);
   17
   18      //print according to the type
   19      switch(l.type) {
   20      case LexicalElement::NUM:
   21        cout << "NUM " << l.val << " P: " << l.p << endl;
   22        break;
   23      case LexicalElement::OP:
   24        cout << "OP  " << l.text << ' ';
   25
   26        //render the op enum
   27        switch(l.op) {
   28        case LexicalElement::MUL:
   29          cout << "MUL";
   30          break;
   31        case LexicalElement::DIV:
   32          cout << "DIV";
   33          break;
   34        case LexicalElement::MOD:
   35          cout << "MOD";
   36          break;
   37        case LexicalElement::ADD:
   38          cout << "ADD";
   39          break;
   40        case LexicalElement::SUB:
   41          cout << "SUB";
   42          break;
   43        }
   44
   45        //display p value
   46        cout << " P: " << l.p << endl;
   47        break;
   48      case LexicalElement::UNKNOWN:
   49        cout << "UNKOWN" << endl;
   50        break;
   51      }
   52    }
   53
   54    return 0;
   55  }
 --end lexerTest.cpp--

 This test driver reads in a line of text, and prints out a rendering
 of the lexical elements.  A couple of sample runs follows:
   $ ./lexerTest
   2+2
   NUM 2 P: 3
   OP  + ADD P: 1
   NUM 2 P: 3

   $ ./lexerTest
   3+.5 - 2000 + 6 /2.1
   NUM 3 P: 3
   OP  + ADD P: 1
   NUM 0.5 P: 3
   OP  - ADD P: 1
   NUM 2000 P: 3
   OP  + ADD P: 1
   NUM 6 P: 3
   OP  / DIV P: 2
   NUM 2.1 P: 3

 Study the test driver to make sure you understand how to use the
 lexer.

 Now all that remains is to do the tree iterators which go with
 btree.h.  (You should have the btree.h file from lab6.  Go ahead and
 copy that into your current directory).  Here, we just use the
 treeItr.h file and the btreeTest.cpp file that we wrote in class.
 Here they are, just in case you didn't manage to type them all in.


 --begin treeItr.h--
    1  #include "btree.h"
    2
    3
    4  //helper functions
    5  template<typename T> BinaryTree<T>*
    6  leftMost(BinaryTree<T> *t) {
    7    //protect against null trees
    8    if(!t)
    9      return NULL;
   10
   11    BinaryTree<T> *left = t->getLeft();
   12
   13    //go left iteratively
   14    while(left) {
   15      t = left;
   16      left = t->getLeft();
   17    }
   18
   19    return t;
   20  }
   21
   22  //and iterator which visits a tree in in-order traversal order.
   23  template <typename T>
   24  class InOrderIterator {
   25    public:
   26    //constructor
   27    InOrderIterator(BinaryTree<T> *t) {
   28      this->t = leftMost(t);
   29    }
   30
   31
   32    //prefix iterator
   33    InOrderIterator &operator++() {
   34      advance();  //go to the next node
   35      return *this;
   36    }
   37
   38
   39    //postfix iterator
   40    InOrderIterator operator++(int) {
   41      InOrderIterator itr(*this);
   42      advance();
   43      return itr;
   44    }
   45
   46
   47    //dereference operator
   48    T operator*() {
   49      return t->getData();
   50    }
   51
   52
   53    //return true if we are past the end
   54    bool eot() {
   55      return !t;
   56    }
   57
   58    private:
   59    BinaryTree<T> *t;   //current node
   60
   61    void advance() {
   62      //protect against the end
   63      if(!t) return;
   64
   65      if(!t->getRight()) {
   66        //go up
   67        BinaryTree<T> *p=t;
   68        do {
   69          t = p;
   70          p  = t->getParent();
   71
   72          //if this was the root, we are done
   73          if(!p) {
   74            t = p;
   75            return;
   76          }
   77        } while(p->getLeft() != t);
   78
   79        //move up
   80        t  = p;
   81      } else {
   82        //go the left most child of the right tree
   83        t = leftMost(t->getRight());
   84      }
   85    }
   86  };
   87
   88
   89
   90  //and iterator which visits a tree in pre-order traversal order.
   91  template <typename T>
   92  class PreOrderIterator {
   93    public:
   94    //constructor
   95    PreOrderIterator(BinaryTree<T> *t) {
   96      this->t = t;
   97    }
   98
   99
  100    //prefix iterator
  101    PreOrderIterator &operator++() {
  102      advance();  //go to the next node
  103      return *this;
  104    }
  105
  106
  107    //postfix iterator
  108    PreOrderIterator operator++(int) {
  109      PreOrderIterator itr(*this);
  110      advance();
  111      return itr;
  112    }
  113
  114
  115    //dereference operator
  116    T operator*() {
  117      return t->getData();
  118    }
  119
  120
  121    //return true if we are past the end
  122    bool eot() {
  123      return !t;
  124    }
  125
  126    private:
  127    BinaryTree<T> *t;   //current node
  128
  129    void advance() {
  130      //protect against the end
  131      if(!t) return;
  132
  133      if(t->getLeft()) {
  134        t = t->getLeft();
  135      } else {
  136        //go up until we approach from left and can go right
  137        BinaryTree<T> *p = t;
  138        do {
  139          t = p;
  140          p = t->getParent();
  141        } while(p && (p->getLeft() != t || !p->getRight()));
  142
  143        //done! (no more tree)
  144        if(!p) {
  145          t = p;
  146          return;
  147        }
  148        t  = p->getRight();
  149      }
  150    }
  151  };
 -- end treeItr.h --


 And now, write out this test driver, and give it all a spin to make
 sure your lexer and your tree iterators are working.


 -- begin btreeTest.cpp --
    1  /*
    2   * This is a small test driver for our btree.  It builds a naive
    3   * BST.  Behold the glories of degenerative cases!
    4   */
    5
    6  #include <iostream>
    7  #include "btree.h"
    8  #include "treeItr.h"
    9
   10  using namespace std;
   11
   12  //an implementation of the bstInsert algorithm
   13  void bstInsert(BinaryTree<int> *t, int data) {
   14    if(data <= t->getData()) {
   15      //insert left
   16      if(!t->getLeft())
   17        t->linkLeft(new BinaryTree<int>(data));
   18      else
   19        bstInsert(t->getLeft(), data);
   20    } else {
   21      //insert right
   22      if(!t->getRight())
   23        t->linkRight(new BinaryTree<int>(data));
   24      else
   25        bstInsert(t->getRight(), data);
   26    }
   27  }
   28
   29
   30  int main(void) {
   31    BinaryTree<int> *t=NULL;
   32    int data;
   33
   34    do {
   35      //prompt the user
   36      cout << "Enter Number (-1 to exit): ";
   37
   38      //get the data
   39      cin >> data;
   40
   41      if(data == -1)
   42        continue;
   43
   44      if(!t)
   45        t=new BinaryTree<int>(data);
   46      else
   47        bstInsert(t, data);
   48    } while(data != -1);
   49
   50    //print the tree
   51    t->printTree();
   52
   53    //do an inorder traversal
   54    cout << "In order traversal: ";
   55    for(InOrderIterator<int> itr(t); !itr.eot(); itr++) {
   56      cout << *itr << " ";
   57    }
   58    cout << endl;
   59
   60
   61    //do an preorder traversal
   62    cout << "preorder traversal: ";
   63    for(PreOrderIterator<int> itr(t); !itr.eot(); itr++) {
   64      cout << *itr << " ";
   65    }
   66    cout << endl;
   67
   68    return 0;
   69  }
 -- end btreeTest.cpp --


 Parsing Expressions (Challenge Lab)
 ===================================
 Ok!  Enough hand holding.  Now is the time for you to write a parse
 tree application.  You will put together the pieces of the app using
 my lexer and the pseudocode you have from class.  Your application
 should allow the user to enter an expression, and then it will
 display the answer with correct order of operations.  You don't have
 to worry about checking the correctness of what they type.  For the
 purpose of the lab, you can assume the user will always give you a
 correct instruction.

 Sample output follows:

   ./infixCalc
   Expression:  2.5 + 2.5
   Answer: 5

   ./infixCalc
   Expression: 2.5*1.2 - 5*2
   Answer: -7

 Hints:
   1.)  Tree traversal is the key to evaluating the expressions.
        That and good function decomposition.  For instance, I have a
        function called "apply" which takes an operator and 2 numbers
        and applies the correct operation.  You may want to look at
        previous examples and what you get when you do a post order
        traversal.

   2.)  Try printing your parse tree out using the print function
        already present in the BinaryTree template.  Make sure you
        are creating proper trees before you evaluate them.

   3.)  It is possible to evaluate the trees using only 3 variables.
        You will need to figure out which direction you rise from a
        child if you want to try to do this.

   4.)  Just relax.  You have 95% of this spelled out in your notes.
        You can do this!


 Extra Credit
 ============
 Using the lexer and the parsing notes we have from class, it
 wouldn't be too hard to take the next step and do error checking.
 If your program can spot syntax errors (invalid sequences of
 lexemes) then I will give you extra credit.  The better it is at
 expressing where the error is, the higher your extra credit points
 will be!

 Enjoy!