Text file INDEX generator (c) T.Jennings 7/21/81    Page 1



                 You  can do anything you want with this program  except
            sell  it.  Give  it to anyone who wants  it.  Address  bugs,
            suggestions, etc. to:

                 Tom Jennings
                 221 W. Springfield St.
                 Boston MA 02118

                 Leave me a message at NECS CBBS.

                 INDEX is a utility for use with WordStar, and generates
            an alphabetically sorted index for a file.  Words or phrases
            to be put in the index are marked with control characters
            not used elsewhere within WordStar (at least as of version
            1.01).  If a file is later edited, invoking INDEX again will
            remove the old index, produce a new one, and add it to the
            end of the file.

                 INDEX can also be used with any non-WordStar text editor
            that can insert control characters into the text.  No other
            assumptions are made about the contents of the file, except
            that the file is terminated by a control-Z character (the
            correct way) or by end of file.

                 INDEX  scans  the text file for certain  WordStar  "dot
            commands",  such as page breaks,  etc., in order to maintain
            proper page numbers. If no page "dot" commands are found, as
            with other editors, pages are counted internally.


































        Text file INDEX generator (c) T.Jennings 7/21/81    Page 2



                 There are two different kinds of index entries: WORDS
            and PHRASES.  WORDS are what are normally thought of as
            words: groups of characters separated by spaces, commas,
            carriage returns (called CR from now on) or linefeeds (LF).
            PHRASES are groups of words, including the spaces that
            separate the words.

                 Since  words are easy to find,  only a single marker is
            necessary  to  identify them.  This marker  is  a  control-K
            character,  ^K.  Phrases  must  have both ends  marked,  and
            control-P is used, ^P. Below are some examples:

            The sixth word in this ^Ksentence will be put in the index.

            ^PThis entire phrase will be there^P, also.

                 Since this is page 2 of the manual, the index for these
            should look like:

            Sentence...................................... 2
            This entire phrase will be there.............. 2

                 These two examples are actually in the index at the end
            of this manual.
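
            The marking scheme above can be sketched in a few lines of
            Python.  This is only an illustration of the rules as the
            manual states them, not the program's actual code; the name
            find_marked_entries is made up for the example, and treating
            an unclosed ^P phrase as running to the end of the text is
            an assumption.  The sketch returns the raw marked text and
            does not capitalize or otherwise clean it up.

```python
CTRL_K = "\x0b"  # ^K marks the start of a single word
CTRL_P = "\x10"  # ^P marks both ends of a phrase
WORD_ENDS = " \r\n\t,;:!"  # characters the manual lists as ending a word


def find_marked_entries(text):
    """Return the marked words and phrases in the order they appear."""
    entries = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == CTRL_K:
            # Collect characters up to the next word-ending character.
            j = i + 1
            while j < len(text) and text[j] not in WORD_ENDS:
                j += 1
            entries.append(text[i + 1:j])
            i = j
        elif ch == CTRL_P:
            # Collect everything up to the closing ^P.
            j = text.find(CTRL_P, i + 1)
            if j == -1:
                j = len(text)  # assumption: unclosed phrase runs to end of text
            entries.append(text[i + 1:j])
            i = j + 1
        else:
            i += 1
    return entries
```

            Applied to the two example lines above, the sketch picks
            out "sentence" and "This entire phrase will be there".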

            WordStar dot commands

                 INDEX is optimized for use with WordStar.  By  default,
            it  scans  the  file for "dot  commands";  notably  .pa  and
            "..index". .PA is used to count pages, and must be the first
            word on the line to be counted as a dot command.

                 The "..index" is created and used by INDEX.  As defined
            in  the  WordStar manual,  any line beginning with two  dots
            (..) will be ignored when printed.  INDEX uses this to  mark
            the beginning of the index.  When INDEX is run,  if it finds
            the  "..index" line,  it will remove all text following that
            line. This allows creating an index for an updated file that
            already has an index. If no "..index" line is found, one is
            added.

                 CAUTION: NEVER start a line of your own with the ".."
            WordStar dot command followed immediately by "index", since
            INDEX treats that line as the marker described above and
            deletes all text following it from the file.  A single space
            after the .. will suffice, or use .IG instead.
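
            As a rough sketch of the two dot-command rules above, the
            Python below cuts a file off at an existing "..index" line
            and assigns a page number to each line by counting .PA
            lines.  The function names are invented for the example,
            and matching the dot commands without regard to case is an
            assumption (the manual writes both .pa and .PA).  The
            internal page counting used when no .PA is present is not
            shown.

```python
def strip_old_index(lines):
    """Return the lines that precede an existing "..index" marker line."""
    kept = []
    for line in lines:
        if line.lower().startswith("..index"):
            break  # the marker and everything after it is the old index
        kept.append(line)
    return kept


def page_of_each_line(lines):
    """Return the page number of each line, advancing after every .PA line."""
    pages = []
    page = 1
    for line in lines:
        pages.append(page)
        # .PA only counts when it is the first thing on the line.
        if line[:3].lower() == ".pa":
            page += 1
    return pages
```

            A caller would then write the kept lines, a fresh "..index"
            line, and the new index.  Note that a line such as
            ".. index notes" (with a space) is left alone, which is why
            the single space mentioned in the caution is enough.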


















        Text file INDEX generator (c) T.Jennings 7/21/81    Page 3


            Sorting

                 As  stated  before,   the  index  generated  is  sorted
            alphabetically.  The  entire  phrase  or  word  is  used  in
            sorting,  except  that  case is ignored.

                 If identical entries are found, they are listed on a
            single line, followed by all the page numbers they were
            found on.  Unfortunately, a page number is listed once for
            each occurrence, so the same page number can appear more
            than once.  For clarity, some examples of how things work
            follow.

            The following two phrases are equivalent, as case is
            ignored, and will be listed on one line. The first occurrence
            will be the entry on the left side of the page.

            This is the first phrase
            THIS IS THE FIRST PHrAsE

            Since length counts, these next are all in proper order.

            This
            This is
            This is what
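
            The sorting and merging rules can be illustrated with a
            short Python sketch.  It is not the program's own routine:
            build_index is a made-up name, and Python's built-in sort
            stands in for whatever sort INDEX uses.  Entries are given
            as (text, page) pairs in the order they occur in the file.

```python
def build_index(entries):
    """Merge (text, page) pairs ignoring case and sort them alphabetically.

    Returns a list of (text, pages) tuples, where text is the spelling of
    the first occurrence and pages lists every page it was found on.
    """
    merged = {}
    for text, page in entries:
        key = text.lower()  # case is ignored when comparing entries
        if key not in merged:
            merged[key] = (text, [])  # first occurrence supplies the text
        merged[key][1].append(page)   # repeated page numbers are kept
    return [merged[key] for key in sorted(merged)]
```

            Because the whole entry is compared, "This" sorts before
            "This is", which sorts before "This is what", and the two
            spellings of "This is the first phrase" share one line.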







































        Text file INDEX generator (c) T.Jennings 7/21/81    Page 4


            Side effects and cautions

                 This is a list of implementation peculiarities, etc.

            -In general, any group of one or more white-space characters
            (see below) is converted into a single space character.
            Phrases with embedded spaces will have all extra spaces
            (more than one) removed.  A phrase may start and end on
            different lines (or even pages) and will work properly.
            Leading spaces will be removed from the index entry.

            -The  following characters are converted to and treated as a
            single ASCII space character.  These also mark the end of  a
            word:

                         CR LF tab comma (,) semicolon (;)
                             colon (:) surprise-mark (!)

            -BUG  NOTICE Periods are removed from the character  stream.
            This was a cheap way out, since the period is a sentence
            terminator.  The only time this is a problem is when putting
            things in the index such as filenames (e.g., FILENAME.TYP).
            If someone complains, it will probably get fixed.

            -BUG  NOTICE     The indexed words are held in a buffer
            array in memory.  Like most of my kludges, there is minimal
            error checking done.  There is currently a limit of 1000
            (decimal) words/phrases per index, and a 32768 byte buffer
            is allocated for them. If you only have 40K of memory....

            -ANNOYANCE      WordStar control characters, such as ^B,
            count as legal characters, but are not printed in the index.
            So, if you indexed two words, ^K^Bfoo and ^Kfoo, they will
            get separate entries.

            -GOOD  THING     INDEX assumes you do not want to lose  your
            source  file,  and does all work in  temporary  files.  When
            invoked,  it generates a file name.IDX, and copies the input
            file  to it as it looks for words.  (see note on ..index and
            EOF) Then,  the index is put in it,  and the file is closed.
            Then  if  all  is OK,  any file  name.BAK  is  deleted,  the
            original name.ext renamed to name.BAK,  and name.IDX renamed
            to name.ext.

            -Words and phrases will have any leading spaces removed. The
            first character of any word or phrase will be converted to
            upper case.  Note that if a phrase consists of a single
            blank, it will NOT be removed from the index. This does not
            apply to words, of course, as the next word that comes
            along will be indexed.

            -Because of wonderful CP/M, and the fact that some of its
            utilities use end-of-file instead of a control-Z character
            to terminate text, INDEX cannot detect the following read
            errors: unwritten random record, zero length.








        Text file INDEX generator (c) T.Jennings 7/21/81    Page 5



            -INDEX sorts in ASCII order.  Digits, quotes, parentheses,
            etc. come before letters.

            -The  sort routine used is horrible.  It uses a bubble sort,
            with  extra  unnecessary  exchanges.   Didn't  require  much
            thought, though.
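
            Several of the character-handling rules in this list can be
            summed up in one small Python sketch: separators become
            spaces, runs of spaces collapse to one, periods are
            dropped, leading spaces are removed, and the first
            character is upper-cased.  The name normalize_entry is
            invented for the example, and the order in which the rules
            are applied is an assumption.  The special case of a phrase
            consisting of a single blank is not modeled.

```python
SEPARATORS = "\r\n\t,;:!"  # characters the manual says are treated as a space


def normalize_entry(raw):
    """Apply the cleanup rules from the list above to one marked entry."""
    chars = []
    for ch in raw:
        if ch == ".":
            continue  # periods are removed from the character stream
        if ch in SEPARATORS:
            ch = " "
        if ch == " " and chars and chars[-1] == " ":
            continue  # collapse a run of white space into one space
        chars.append(ch)
    text = "".join(chars).lstrip(" ")     # drop leading spaces
    return text[:1].upper() + text[1:]    # upper-case the first character
```

            For instance, a phrase broken across two lines comes out as
            a single line of text, and FILENAME.TYP loses its period,
            which is the bug noted above.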
























































        Text file INDEX generator (c) T.Jennings 7/21/81    Page 6


               Colon...................................  4
               Comma...................................  4
               Control-Z...............................  4
               CP/M....................................  4
               CR......................................  4
               Embedded spaces.........................  4
               End-of-file.............................  4
               Examples................................  2
               Filenames...............................  4
               INDEX...................................  1
               Leading spaces..........................  4, 4
               LF......................................  4
               Non-WordStar text editor................  1
               Periods.................................  4
               PHRASES.................................  2
               Semicolon...............................  4
               Sentence................................  2
               Side effects and cautions...............  4
               Surprise-mark...........................  4
               Tab.....................................  4
               This entire phrase will be there........  2
               White-space characters..................  4
               WORDS...................................  2
               WordStar................................  1
               WordStar "dot commands".................  1
               WordStar dot commands...................  2
               ^B......................................  4
               ^K......................................  2
               ^P......................................  2