* REPORT

/*
* REPORT
*
* Author: Luis Herrera
* Creation Date: Tuesday, March 24 2015, 22:52
* Last Modified: Sunday, March 29 2015, 23:24
*
* Algorithms and Data Structures.
*
*/

DESCRIPTION

The program is composed of two classes:

* WordMatch: The class that carries out most of the processing, and

* WordMatchLexicon: The class that contains the getter and the setter for the
* TreeSet that holds the lexicon. I chose a TreeSet as the data structure
* because, by design, it keeps its elements in lexicographic order and stores
* only one copy of each element (no duplicates), which removes the need for
* separate sorting or de-duplication passes in the other steps of the
* program.

* This class could be extended later, for example with a filteredLexicon or
* alternate lexicons, should it be decided to save a lexicon either as a
* backup or restricted to the words that meet the end user's requirements.
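As a rough illustration only, the lexicon holder described above could look
like the sketch below. The class name WordMatchLexiconSketch, the field name
lexicon and the accessor names are assumptions made for this example; the
actual source may differ.

```java
import java.util.TreeSet;

// Hypothetical sketch of the lexicon holder: one TreeSet plus its accessors.
class WordMatchLexiconSketch {
    // The TreeSet keeps the words sorted and free of duplicates.
    private TreeSet<String> lexicon = new TreeSet<>();

    TreeSet<String> getLexicon() {
        return lexicon;
    }

    void setLexicon(TreeSet<String> lexicon) {
        this.lexicon = lexicon;
    }
}
```

Keeping the set behind a getter and a setter is what would let a
filteredLexicon or a backup lexicon be added later as further fields of the
same class.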

About the Chosen Data Structure

A few sorting approaches and data structures were evaluated for efficiency,
ease of use and maintenance, and I decided that the best fit for the purposes
of this program was the Java TreeSet, which has the following
characteristics:

1. No duplicates. A Java Set does not allow duplicate elements: adding an
element that is already present leaves the set unchanged.
This was one of the major considerations for choosing it over a classical
List approach, such as ArrayList or LinkedList.

2. Element sorting. A HashSet is generally faster than a TreeSet when adding
or removing elements, but it stores the data in no particular order, so extra
steps would be needed to obtain an ordered set.
A TreeSet, on the other hand, keeps String elements sorted in lexicographic
order because it is backed by a red-black tree, which is rebalanced on every
change made to it. As a reference, its add and remove methods have a time
complexity of O(log n).
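The two characteristics above can be observed with a few lines of Java. This
is a minimal sketch written for this report, not code taken from the program;
the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class TreeSetPropertiesDemo {
    // Returns the distinct words in ascending (lexicographic) order.
    static List<String> sortedDistinct(List<String> words) {
        TreeSet<String> set = new TreeSet<>();
        for (String word : words) {
            // add() returns false and leaves the set unchanged for a duplicate.
            set.add(word);
        }
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("pear", "apple", "pear", "fig");
        System.out.println(sortedDistinct(input)); // prints [apple, fig, pear]
    }
}
```

Note that the natural ordering of String compares character codes, so upper
case letters sort before lower case ones. Converting every word to lower case
before insertion keeps the order consistent.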

The program works by creating an instance of WordMatchLexicon when WordMatch
is invoked (java WordMatch) and then executing the mainMenu method, which is
the entry point for user interaction. Note that Java exception handling is
implemented at this stage, in the main method.

Besides the main and mainMenu methods, the WordMatch class contains some
additional methods that keep the processes executed by the program modular. I
have also tried to give self-explanatory names to the variables and methods so
as to keep the program as clear as possible.

These additional methods are:
a) readTextFile()
This method asks the user for the name of the file to be read to populate the
lexicon. To add further files to the lexicon, the user can invoke this method
multiple times. Because of the red-black tree, the Set stays sorted
lexicographically and free of duplicates after every insertion.

b) searchWord()
As the name implies, this method asks the user for a word to look for. It
accepts the <?> symbol as a wild-card character.

c) WriteLexiconToFile()
This method asks the user for a file name and writes the lexicon currently in
memory to that file. Although it would be faster to output the data in binary,
for the purposes of this exercise we want to be able to inspect the
information even when the program is not running, so the output file is
written in a plain-text format.

d) getStringOfLettersOnly()
This is a helper method that strips any characters that are NOT alphabetic
from every word before it is written to the lexicon.
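One possible shape for the cleaning helper is sketched below. The method name
getStringOfLettersOnly comes from the report, but its body here is a guess.
The addToken helper, the use of Character.isLetter as the test for
"alphabetic", and the point at which the word is lower-cased are all
assumptions of this example.

```java
import java.util.Locale;
import java.util.TreeSet;

class CleanWordSketch {
    // Keeps only the characters accepted by Character.isLetter;
    // digits, punctuation and whitespace are dropped.
    static String getStringOfLettersOnly(String raw) {
        StringBuilder letters = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (Character.isLetter(c)) {
                letters.append(c);
            }
        }
        return letters.toString();
    }

    // Hypothetical helper: cleans and lower-cases one raw token, then stores it.
    // Returns true only if a new word was actually added to the lexicon.
    static boolean addToken(TreeSet<String> lexicon, String rawToken) {
        String word = getStringOfLettersOnly(rawToken).toLowerCase(Locale.ROOT);
        return !word.isEmpty() && lexicon.add(word);
    }
}
```

With this sketch, the tokens "ApPle" and "apple!" both reduce to "apple", so
only the first one is added, and a token such as "1234" is skipped because
nothing is left after cleaning.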

TESTS APPLIED

Use Case 1:
Request to add to the lexicon a non-existent file.

Provided a non-existent file name. Program successfully handled the exception
and fell back to the mainMenu.
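The behaviour in this use case could be reproduced along the following lines.
This is a hedged sketch: the report states that exception handling lives in
the main method, whereas here the exception is caught inside a hypothetical
helper (loadOrReport) only so that the example is self-contained. The message
text is also invented.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.TreeSet;

class MissingFileSketch {
    // Returns true if the file was read. On an I/O error (for example a
    // missing file) it prints a message and returns false, so the caller
    // can simply show the menu again.
    static boolean loadOrReport(Path file, TreeSet<String> lexicon) {
        try {
            for (String line : Files.readAllLines(file)) {
                for (String token : line.trim().split("\\s+")) {
                    if (!token.isEmpty()) {
                        lexicon.add(token);
                    }
                }
            }
            return true;
        } catch (IOException e) {
            System.out.println("Could not read " + file + "; returning to the main menu.");
            return false;
        }
    }
}
```

A missing file makes Files.readAllLines throw a NoSuchFileException, which is
a subclass of IOException, so the lexicon is left untouched and the program
can carry on.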

Use Case 2:
Request to add to the lexicon a file containing words with non-alphabetic
characters.

Program successfully filtered numbers, special characters, tabs, newline
characters and any other character that is *not* alphabetical.

Use Case 3:
Request to add to the lexicon a file with duplicated words.

Program successfully pruned duplicates.

Use Case 4:
Request to add to the lexicon a file with capital letters at random positions in
the words.

Program successfully converted upper case letters to lower case letters and
inserted the words into the lexicon.

Use Case 5:
Request to search for a word in upper case letters.

Program changed user input to lowercase and successfully proceeded with a
search.

Use Case 6:
Request to search for a word using the special character <?>.

Program successfully detected the symbol and proceeded to search for words that
satisfied the search conditions requested.
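A wild-card search of this kind could be sketched as follows. The report does
not say exactly what <?> stands for, so the example ASSUMES that each ?
matches exactly one letter. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

class WildcardSearchSketch {
    // True if word has the same length as pattern and agrees with it at
    // every position that is not a '?' (assumed: '?' = any single letter).
    static boolean matches(String pattern, String word) {
        if (pattern.length() != word.length()) {
            return false;
        }
        for (int i = 0; i < pattern.length(); i++) {
            char p = pattern.charAt(i);
            if (p != '?' && p != word.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    // Returns the matching words in the lexicon's own (sorted) order.
    static List<String> search(TreeSet<String> lexicon, String query) {
        String pattern = query.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        if (pattern.indexOf('?') < 0) {
            // No wild-card: a single O(log n) lookup is enough.
            if (lexicon.contains(pattern)) {
                hits.add(pattern);
            }
            return hits;
        }
        for (String word : lexicon) {
            if (matches(pattern, word)) {
                hits.add(word);
            }
        }
        return hits;
    }
}
```

For a lexicon holding cat, coat, cot, cut and dog, the query "C?T" would
return cat, cot and cut under this assumption, while "dog" takes the direct
lookup path.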

Use Case 7:
Request to save lexicon to another file.

Program asked user for name of new file and saved the lexicon successfully in
a plain text file, one word per line.
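Writing the lexicon as plain text, one word per line, can be done with a
single standard library call. The sketch below is an illustration and not the
program's own method; the class name and the writeLexicon signature are
assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.TreeSet;

class WriteLexiconSketch {
    // Writes every word on its own line, in the TreeSet's sorted order.
    // Files.write encodes the text as UTF-8 and replaces an existing file.
    static void writeLexicon(TreeSet<String> lexicon, Path target) throws IOException {
        Files.write(target, lexicon);
    }
}
```

Because the set is already sorted, the output file comes out in lexicographic
order with no extra sorting step.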

Use Case 8:
Request to add multiple files to lexicon.

Program successfully added the files, pruned the lexicon of any duplicate
words, and corrected the input words for any special characters or upper-case
letters.