NAME
   RPSL::Parser - Router Policy Specification Language (RFC2622) Parser

SYNOPSIS
       # The new interface doesn't requires the creation of
       # a parser object anymore:
       use RPSL::Parser;
       my $data_structure = RPSL::Parser->parse( $data );

       ###########

       # Alternativelly, use the old and deprecated interface:
       use RPSL::Parser;
       # Create a parser
       my $parser = new RPSL::Parser;
       # Use it
       my $data_structure = $parser->parse($data);

DESCRIPTION
   This is a rather simplistic lexer and tokenizer for the RPSL language.

   It currently does not validate the object in any way, it just tries
   (rather hard) to grab the biggest ammount of information it can from the
   text presented and place it in a Parse Tree (that can be passed to other
   objects from the *RPSL* namespace for validation and more RFC2622
   related functionality).

PUBLIC METHODS
 "new()" deprecated
   Constructor. Handles the accessor creation and returns a new
   RPSL::Parser object.

   This method is deprecated, under request of some users. The RPSL::Parser
   interface will change, and there will be no need to create a "parser"
   object anymore.

 "parse( [ $rpsl_source | IO::Handle | GLOB ] )"
   Parses one RPSL object for each call, uses the parser internal fields to
   store the data gathered. This is the method you need to call to
   transform your RPSL text into a Perl data structure.

   It accepts a list or a scalar containing the strings representing the
   RPSL source code you want to parse, and can read it directly from any
   IO::Handle or "GLOB" representing an open file handle.

   This is a mixture between a class and a object method at this moment,
   due to the deprecation of the "new()" method. It can detect whenever it
   was called with a class as the first parameter, and will try to
   instantiate and use that class as the parser implementation.

ACCESSOR METHODS
 "comment()"
   Stores an array reference containing all the inline comments found in
   the RPSL text.

 "object"
   Stores a hash reference containing all the RPSL attributes found in the
   RPSL text.

 "omit_key"
   Stores an array reference containing all the position of the keys we
   must omit from the original RPSL text.

 "order"
   Stores an array reference containing an ordered list of RPSL attribute
   names, to enable the RPSL to be rebuilt from the parsed data version.

 "key"
   Stores the value found in the first RPSL attribute parsed. This is
   sometimes refered as the RPSL object key.

 "text"
   Stores an scalar containing the RPSL text to be parsed.

 "tokens"
   Stores an array reference containing an ordered list of tokens and token
   values produced by the tokenize method.

 "type"
   Stores a string representing the name of the first RPSL attribute found
   in the RPSL text parsed. The RFC 2622 requires that the first attribute
   declares the "data type" of the RPSL object declared.

Private Interface
 "_read_text( @input )"
   Checks if the first element from @input is a IO::Handle or a "GLOB", and
   reads from it. If the first element is not any type of file handle,
   assumes it's an array of scalars containing the text for the RPSL object
   to be parsed, "join()" it all toghether and feed it to the parser.

 "_tokenize()"
   This method breaks down the RPSL source code read by "read_text()" into
   tokens, and store them internally. For commodity, it returs a reference
   to the object itself, so you can chain up method calls.

 "_cleanup_attribute( $value )"
   Returns a cleaned-up version of the attribute passed in: no trailling or
   leading whitespace or newlines.

 "_store_attribute( $attribute_name, $attribute_value )"
   Auxiliary method. It clean up the value and store the attribute in the
   data structure being built, and does the necessary storage upkeep.

 "_store_comment( $comment_position_index, $attribute_and_comment_text )"
   This method extracts inline comments from the inline part of an object
   and store those comments into the parse tree being built. It returns the
   attribute passed in with the comments stripped, so it can be stored into
   the appropriated place afterwards.

 "_build_parse_tree()"
   This method consumes the tokens produced by "_tokenize()" and builds a
   data structure containing all the information needed to re-build the
   RPSL object back.

   It returns a reference to the parser object itself, making easy to chain
   method calls again.

 "_parse_tree()"
   This method assembles all the information gathered during the RPSL
   source code tokenization and parsing into a hash reference containing
   the following keys:

 data
   Holds a hash reference whose keys are the RPSL attributes found, and the
   values are the string passed in as values to the respective attributes
   in the RPSL text. Multi-valued attributes are represented by array
   references. As this parser doesn't enforces all the RPSL business rules,
   you must take care when fiddling with this structure, as any value could
   be an array reference.

 order
   Holds an array reference containing the key names from the data hash, in
   the order they where found in the RPSL text. This is stored here because
   the RFC 2622 commands that the order of the attributes in a RPSL object
   is important.

 type
   Holds a string containing the name of the first RPSL attribute found in
   the RPSL text. RFC 2622 commands that the first attribute must be the
   type of the object declared. Knowing the type of object can allow proper
   manipulation of the different RPSL object types by other RPSL namespace
   modules.

 key
   Holds the value contained by the first attribute of an RPSL object. This
   is sometimes the "primary key" of a RPSL object, but not always.

 comment
   Comment is a hash structure where the keys are index positions in the
   order array, and values are the inline comments extracted during the
   parsing stage. Preserving inline comments is not a requirement from RFC
   2622, just a nice thing to have.

 omit_key
   RFC 2622 allows some attribute names to contain multiple values. For
   every new value, a new line must be inserted into the RPSL object. For
   brevity, and to allow humans to read and write RPSL, the RFC 2622 allows
   the attribute name to be omited and replaced by whitespace. It also
   dictates that lines begining with a "+" sign must be considered as being
   part of a multi-line RPSL attribute.

   This array reference stores integers representing index positions in the
   order array signaling attribute positions that must be omited when
   generating RPSL text back from this parse tree. As RFC 2622 doesn't
   request that attributes omited by starting a line with whitespace or "+"
   must preserve this characteristic, this is only a nice-to-have feature.
   =back

EXAMPLES
 Example #1: retrieving information from the Whois Database
   Suppose you retrieve a RPSL Person object from the RIPE NCC WHOIS
   Database, like the one below. It's a simple query and I will not explain
   how you could do it here, because it's a bit out of scope for this
   module. Anyway, you have the following text as a result:

       person:       I. M. A. Fool
       address:      F.A.K.E Corporation
       address:      226 Nowhere st
       address:      10DD10 Nevercity
                     Neverland
       phone:        +99-99-999-9999
       fax-no:       +99-99-999-9999
       e-mail:       [email protected]
       nic-hdl:      XXX007-RIPE # Look, ma, I'm 007! ;)
       mnt-by:       NICE-GUY-MNT
       changed:      [email protected] 20001016
       source:       RIPE

   Let's assume you need to send an email (for example, to report routing
   problems) to Mr Fool. It means that you need to retrieve the *e-mail*
   field value from this RPSL object.

   Let's assume you have the previous text in the $text variable. In order
   to parse the contents of the text into a nice Perl data structure, all
   you need to do is instanciate a parser, with

       use RPSL::Parser;
       my $parser = new RPSL::Parser;

   And then pass it the contents of the $text variable, and collect the
   resulting data structure back:

       my $data_structure = $parser->parse( $text );

   It will give you something that will look more or less like this (dumped
   by Data::Dumper):

       $data_strucutre = {
           __META => {
               'omit_key' => [4],
               'comment'  => { 8 => q{Look, ma, I'm 007! ;)} },
               'order'    => [
                   'person',  'address', 'address', 'address',
                   'address', 'phone',   'fax-no',  'e-mail',
                   'nic-hdl', 'mnt-by',  'changed', 'source'
               ],
               'type' => 'person',
               'data' => {
                   'source'  => 'RIPE',
                   'mnt-by'  => 'NICE-GUY-MNT',
                   'phone'   => '+99-99-999-9999',
                   'nic-hdl' => 'XXX007-RIPE',
                   'fax-no'  => '+99-99-999-9999',
                   'e-mail'  => '[email protected]',
                   'changed' => '[email protected] 20001016',
                   'person'  => 'I. M. A. Fool',
                   'address' => [
                       'F.A.K.E Corporation',
                       '226 Nowhere st',
                       '10DD10 Nevercity',
                       'Neverland'
                   ]
               },
               'key' => 'I. M. A. Fool'
           },
       };

   In a near future, there will be other objects that will know how to
   interpret this as the specific RPSL object declared, and to write the
   corresponding RPSL representation of a given data structure, simmilar to
   this one.

SEE ALSO
   RFC2622 <http://www.ietf.org/rfc/rfc2622.txt>, for the full RPSL
   specification.

AUTHOR
   Luis Motta Campos, <[email protected]>

COPYRIGHT AND LICENSE
   Copyright (C) 2008 by Luis Motta Campos

   This library is free software; you can redistribute it and/or modify it
   under the same terms as Perl itself, either Perl version 5.8.8 or, at
   your option, any later version of Perl 5 you may have available.