NAME
   Makefile::DOM - Simple DOM parser for Makefiles

VERSION
   This document describes Makefile::DOM 0.007 released on 18 November
   2014.

DESCRIPTION
   This library can serve as an advanced lexer for (GNU) makefiles. It
   parses makefiles as "documents" and the parsing is lossless. The results
   are data structures similar to DOM trees. The DOM trees hold every
   single bit of the information in the original input files, including
   whitespace, blank lines and makefile comments. That means it's
   possible to reproduce the original makefiles from the DOM trees. In
   addition, each node of the DOM trees is modifiable and so is the whole
   tree, just like the PPI module used for Perl source parsing and the
   HTML::TreeBuilder module used for parsing HTML source.

   If you're looking for a true GNU make parser that generates an AST,
   please see Makefile::Parser::GmakeDB instead.

   The interface of "Makefile::DOM" mimics the API design of PPI. In fact,
   I've directly stolen the source code and POD documentation of PPI::Node,
   PPI::Element, and PPI::Dumper, with the full permission from the author
   of PPI, Adam Kennedy.

   "Makefile::DOM" tries to be independent of any specific makefile syntax.
   The same set of DOM node types is meant to be shared by different
   makefile DOM generators. For example, MDOM::Document::Gmake parses GNU
   makefiles and returns an instance of MDOM::Document, i.e., the root of
   the DOM tree. A future NMAKE makefile lexer, "MDOM::Document::Nmake",
   would likewise return instances of the MDOM::Document class. Later,
   I'll also consider adding support for dmake and bsdmake.

Structure of the DOM
   Makefile DOM (MDOM) is a structured set of data types that provides a
   flexible document model conforming to the makefile syntax. Below is a
   complete list of the 19 MDOM classes in the current implementation,
   where the indentation indicates the class inheritance relationships.

       MDOM::Element
           MDOM::Node
               MDOM::Unknown
               MDOM::Assignment
               MDOM::Command
               MDOM::Directive
               MDOM::Document
                   MDOM::Document::Gmake
               MDOM::Rule
                   MDOM::Rule::Simple
                   MDOM::Rule::StaticPattern
           MDOM::Token
               MDOM::Token::Bare
               MDOM::Token::Comment
               MDOM::Token::Continuation
               MDOM::Token::Interpolation
               MDOM::Token::Modifier
               MDOM::Token::Separator
               MDOM::Token::Whitespace

   All of the MDOM classes inherit from the MDOM::Element class.
   MDOM::Token and MDOM::Node are its direct children. The former
   represents a string token that is atomic from the perspective of the
   lexer, while the latter represents a structured node, which usually
   has one or more children and serves as the container for other
   MDOM::Element objects.

   Next we'll show a few examples to demonstrate how particular makefiles
   map to DOM trees.

   Case 1
       Consider the following simple "hello, world" makefile:

           all : ; echo "hello, world"

       We can use the MDOM::Dumper class provided by Makefile::DOM to dump
       out the internal structure of its corresponding MDOM tree:

           MDOM::Document::Gmake
             MDOM::Rule::Simple
               MDOM::Token::Bare         'all'
               MDOM::Token::Whitespace   ' '
               MDOM::Token::Separator    ':'
               MDOM::Token::Whitespace   ' '
               MDOM::Command
                 MDOM::Token::Separator    ';'
                 MDOM::Token::Whitespace   ' '
                 MDOM::Token::Bare         'echo "hello, world"'
                 MDOM::Token::Whitespace   '\n'

       In this example, the separators ":" and ";" are both instances of
       the MDOM::Token::Separator class, while spaces and newline
       characters are all represented as MDOM::Token::Whitespace. The
       other two leaf nodes, "all" and "echo "hello, world"", both belong
       to MDOM::Token::Bare.

       It's worth mentioning that the space characters in the rule command
       "echo "hello, world"" are not represented as
       MDOM::Token::Whitespace. That's because the spaces inside commands
       have no syntactic meaning to "make"; they are usually passed to the
       shell verbatim. Therefore, the DOM parser does not try to recognize
       those spaces specifically, which reduces memory use and the number
       of nodes. However, leading spaces and trailing newlines are still
       recognized as MDOM::Token::Whitespace.

       One level up, an MDOM::Rule::Simple instance holds several tokens
       and one MDOM::Command. At the top level is the root node of the
       whole DOM tree, i.e., an instance of MDOM::Document::Gmake.

   Case 2
       Below is a relatively complex example:

           a: foo.c  bar.h $(baz) # hello!
               @echo ...

       Its corresponding DOM structure is

         MDOM::Document::Gmake
           MDOM::Rule::Simple
             MDOM::Token::Bare         'a'
             MDOM::Token::Separator    ':'
             MDOM::Token::Whitespace   ' '
             MDOM::Token::Bare         'foo.c'
             MDOM::Token::Whitespace   '  '
             MDOM::Token::Bare         'bar.h'
             MDOM::Token::Whitespace   '\t'
             MDOM::Token::Interpolation   '$(baz)'
             MDOM::Token::Whitespace      ' '
             MDOM::Token::Comment         '# hello!'
             MDOM::Token::Whitespace      '\n'
           MDOM::Command
             MDOM::Token::Separator    '\t'
             MDOM::Token::Modifier     '@'
             MDOM::Token::Bare         'echo ...'
             MDOM::Token::Whitespace   '\n'

       Compared to the previous example, several new node types appear
       here.

       The variable interpolation "$(baz)" on the first line of the
       original makefile corresponds to an MDOM::Token::Interpolation node
       in its MDOM tree. Similarly, the comment "# hello!" corresponds to
       an MDOM::Token::Comment node.

       On the second line, the rule command indented by a tab character is
       still represented by an MDOM::Command object. Its first child node
       (or its first element) is an MDOM::Token::Separator instance
       corresponding to that tab. The command modifier "@", which is of
       type MDOM::Token::Modifier, immediately follows the separator.

   Case 3
       Now let's study a sample makefile with various global structures:

         a: b
         foo = bar
             # hello!

       Here on the top level, there are three language structures: one rule
       "a: b", one assignment statement "foo = bar", and one comment "#
       hello!".

       Its MDOM tree is shown below:

         MDOM::Document::Gmake
           MDOM::Rule::Simple
             MDOM::Token::Bare                  'a'
             MDOM::Token::Separator            ':'
             MDOM::Token::Whitespace           ' '
             MDOM::Token::Bare                   'b'
             MDOM::Token::Whitespace           '\n'
           MDOM::Assignment
             MDOM::Token::Bare                  'foo'
             MDOM::Token::Whitespace           ' '
             MDOM::Token::Separator            '='
             MDOM::Token::Whitespace           ' '
             MDOM::Token::Bare                  'bar'
             MDOM::Token::Whitespace           '\n'
           MDOM::Token::Whitespace            '\t'
           MDOM::Token::Comment               '# hello!'
           MDOM::Token::Whitespace            '\n'

       We can see that below the root node MDOM::Document::Gmake there are
       three elements, MDOM::Rule::Simple, MDOM::Assignment, and
       MDOM::Token::Comment, as well as two MDOM::Token::Whitespace objects.

   It can be observed from the examples above that the MDOM representation
   of a makefile's lexical elements is rather loose. It deliberately
   provides only a very limited structural representation rather than
   making a bad guess.
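
   The dumps in the cases above can be reproduced with a short script. The
   following is a minimal sketch for Case 1. It assumes that MDOM::Dumper
   keeps the "new" and "string" methods of PPI::Dumper, from which it was
   derived; that interface is not spelled out in this document, so please
   check the MDOM::Dumper documentation before relying on it.

```perl
use strict;
use warnings;
use MDOM::Document::Gmake;
use MDOM::Dumper;

# The "hello, world" makefile from Case 1, held in a scalar.
my $source = qq{all : ; echo "hello, world"\n};

# Passing a scalar reference parses the string instead of a file.
my $dom = MDOM::Document::Gmake->new(\$source);

# Assumption: MDOM::Dumper mirrors PPI::Dumper (new + string).
my $dumper = MDOM::Dumper->new($dom);
my $dump   = $dumper->string;

print $dump;
```

   Dumping small snippets this way is a convenient method for finding out
   which node types the lexer produces for a given construct.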

OPERATIONS FOR MDOM TREES
   Generating an MDOM tree from a GNU makefile only requires two lines of
   Perl code:

       use MDOM::Document::Gmake;
       my $dom = MDOM::Document::Gmake->new('Makefile');

   If the makefile source code being parsed is already stored in a Perl
   variable, say, $var, then we can construct an MDOM tree from a
   reference to that variable:

       my $dom = MDOM::Document::Gmake->new(\$var);

   Now $dom holds a reference to the root of the MDOM tree. It is an
   MDOM::Document::Gmake object, which is also an instance of the
   MDOM::Node class.
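
   Because the parsing is lossless, the text regenerated from $dom should
   be identical to the input. The sketch below checks this for a small
   makefile held in a string. The makefile text and the variable names are
   made up for illustration; the only MDOM calls used are "new" and
   "content".

```perl
use strict;
use warnings;
use MDOM::Document::Gmake;

# A small illustrative makefile (one rule and one assignment).
my $var = "a: b\nfoo = bar\n";

# Parse from a scalar reference rather than from a file name.
my $dom = MDOM::Document::Gmake->new(\$var);

# "content" regenerates the makefile source for the whole tree.
my $round_trip = $dom->content;

print $round_trip eq $var
    ? "round trip reproduced the input\n"
    : "round trip differs from the input\n";
```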

   As mentioned above, "MDOM::Node" is the container for other
   MDOM::Element instances. So we can retrieve one of its child elements
   by index via the "child" method:

       $node = $dom->child(3);
       # or: $node = ($dom->elements)[3];

   And we may also use the "elements" method to obtain all of its child
   elements:

       @elems = $dom->elements;

   For every MDOM node, its corresponding makefile source can be generated
   by invoking its "content" method.
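
   Putting these methods together, the following sketch lists the
   top-level elements of a small makefile along with the source text each
   one stands for. The input string and the variable names are
   illustrative only; the MDOM calls are limited to "new", "elements",
   "child", and "content" as described above.

```perl
use strict;
use warnings;
use MDOM::Document::Gmake;

# Illustrative input: one rule followed by one assignment.
my $src = "a: b\nfoo = bar\n";
my $dom = MDOM::Document::Gmake->new(\$src);

# All direct children of the document node.
my @elems = $dom->elements;

for my $elem (@elems) {
    # Make newlines visible so that each element prints on one line.
    (my $text = $elem->content) =~ s/\n/\\n/g;
    printf "%-24s '%s'\n", ref($elem), $text;
}

# Index-based access to the first child.
my $first = $dom->child(0);
print "first child is a ", ref($first), "\n";
```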

BUGS AND TODO
   The current implementation of the MDOM::Document::Gmake lexer is based
   on a hand-written state machine. Although the efficiency of the engine
   is not bad, the code is rather complicated and messy, which hurts both
   extensibility and maintainability. The plan is to rewrite the parser
   using a grammar tool such as the Perl 6 regex engine
   Pugs::Compiler::Rule or a yacc-style one like Parse::Yapp.

SOURCE REPOSITORY
   You can always get the latest source code of this module from its GitHub
   repository:

   <http://github.com/agentzh/makefile-dom-pm>

   If you want a commit bit, please let me know.

AUTHOR
   Yichun "agentzh" Zhang (章亦春)

COPYRIGHT
   Copyright 2006-2014 by Yichun "agentzh" Zhang (章亦春).

   This library is free software; you can redistribute it and/or modify it
   under the same terms as Perl itself.

SEE ALSO
   MDOM::Document, MDOM::Document::Gmake, PPI, Makefile::Parser::GmakeDB,
   makesimple.