NAME
   YAPE::Regex - Yet Another Parser/Extractor for Regular
   Expressions

SYNOPSIS
     use YAPE::Regex;
     use strict;

     my $regex = qr/reg(ular\s+)?exp?(ression)?/i;
     my $parser = YAPE::Regex->new($regex);

     # here is the tokenizing part
     while (my $chunk = $parser->next) {
       # ...
     }

`YAPE' MODULES
   The `YAPE' hierarchy of modules is an attempt at a unified means
   of parsing and extracting content. It aims to maintain a generic
   interface that promotes simplicity and reusability. The API is
   powerful, yet simple. The modules tokenize their input (a
   process that can be intercepted) and build trees, so that
   specific nodes can be extracted.

DESCRIPTION
   This module is yet another (?) parser and tree-builder for Perl
   regular expressions. It builds a tree out of a regex, but at the
   moment, the extent of the extraction tool for the tree is quite
   limited (see the section on "Extracting Sections"). However, the
   tree can be useful to extension modules.

USAGE
   In addition to the base class, `YAPE::Regex', there is the
   auxiliary class `YAPE::Regex::Element' (common to all `YAPE'
   base classes) that holds the individual nodes' classes. There is
   documentation for the node classes in that module's
   documentation.

 Methods for `YAPE::Regex'

   * `use YAPE::Regex;'
   * `use YAPE::Regex qw( MyExt::Mod );'
       If no arguments are supplied, the module is loaded normally,
       and the node classes are given the proper inheritance (from
       `YAPE::Regex::Element'). If you supply a module (or list of
       modules), `import' will automatically include them (if
       needed) and set up *their* node classes with the proper
       inheritance -- that is, it will append `YAPE::Regex' to
       `@MyExt::Mod::ISA', and `YAPE::Regex::xxx' to each node
       class's `@ISA' (where `xxx' is the name of the specific node
       class).

         package MyExt::Mod;
         use YAPE::Regex 'MyExt::Mod';

         # does the work of:
         # @MyExt::Mod::ISA = 'YAPE::Regex'
         # @MyExt::Mod::text::ISA = 'YAPE::Regex::text'
         # ...

   * `my $p = YAPE::Regex->new($REx);'
       Creates a `YAPE::Regex' object, using the contents of `$REx'
       as a regular expression. The `new' method will *attempt* to
       convert `$REx' to a compiled regex (using `qr//') if `$REx'
       isn't already one. If the regex contains an error, that
       conversion fails, but the parser carries on as if it had
       succeeded and reports the bad token when it reaches it
       during parsing.

   * `my $text = $p->chunk($len);'
       Returns the next `$len' characters in the input string;
       `$len' defaults to 30 characters. This is useful for
       figuring out why a parsing error occurs.

   * `my $done = $p->done;'
       Returns true if the parser is done with the input string,
       and false otherwise.

   * `my $errstr = $p->error;'
       Returns the parser error message.

   * `my $extor = $p->extract;'
       Returns a code reference that, each time it is called,
       returns the next back-reference in the regex. For more
       information on enhancements planned for upcoming versions of
       this module, check the section on "Extracting Sections".

   * `my $str = $p->display(...);'
       Returns a string representation of the entire content. It
       calls the `parse' method in case there is more data that has
       not yet been parsed. This calls the `fullstring' method on
       the root nodes. Check the `YAPE::Regex::Element' docs on the
       arguments to `fullstring'.

   * `my $node = $p->next;'
       Returns the next token, or `undef' if there is no valid
       token. There will be an error message (accessible with the
       `error' method) if there was a problem in the parsing.

   * `my $node = $p->parse;'
       Calls `next' until all the data has been parsed.

   * `my $node = $p->root;'
       Returns the root node of the tree structure.

   * `my $state = $p->state;'
       Returns the current state of the parser. It is one of the
       following values: `alt', `anchor', `any', `backref',
       `capture(N)', `Cchar', `class', `close', `code', `comment',
       `cond(TYPE)', `ctrl', `cut', `done', `error', `flags',
       `group', `hex', `later', `lookahead(neg|pos)',
       `lookbehind(neg|pos)', `macro', `named', `oct', `slash',
       `text', and `utf8hex'.

       For `capture(N)', *N* will be the number the captured
       pattern represents.

       For `cond(TYPE)', *TYPE* will either be a number
       representing the back-reference that the conditional depends
       on, or the string `assert'.

       For `lookahead' and `lookbehind', one of `neg' and `pos'
       will be there, depending on the type of assertion.

   * `my $node = $p->top;'
       Synonymous with `root'.
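
   As a rough sketch of how the methods above fit together, the
   loop below pulls tokens with `next', prints the parser state for
   each one, and then uses `done', `error', and `chunk' to report a
   failure. It is an illustration only: it assumes that `done'
   returns true after a clean parse, as described above, and it
   reuses the regex from the synopsis.

     use strict;
     use warnings;
     use YAPE::Regex;

     my $parser = YAPE::Regex->new(qr/reg(ular\s+)?exp?(ression)?/i);

     # pull tokens one at a time; next() returns undef when no
     # valid token remains
     while (defined(my $node = $parser->next)) {
       # state() names the kind of token just read; ref() shows
       # which node class the token was blessed into
       printf "%-14s %s\n", $parser->state, ref $node;
     }

     # undef from next() means either "finished" or "parse error";
     # done() tells the two cases apart
     unless ($parser->done) {
       warn "parse error: ", $parser->error, "\n";
       warn "near: ", $parser->chunk(20), "\n";
     }

   Checking `done' after the loop, rather than relying on the loop
   ending, avoids mistaking a parse error for a completed parse.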

 Extracting Sections

   While extraction of nodes is the goal of the `YAPE' modules, the
   author is at a loss as to what needs to be extracted from a
   regex. Currently, all the `extract' method does is give you
   access to the regex's set of back-references:

     my $extor = $parser->extract;
     while (my $backref = $extor->()) {
       # ...
     }

   `japhy' is very open to suggestions on the approach to node
   extraction (both how the API should look and what it should
   offer). Preliminary ideas include extraction keywords like
   those in the output of Perl's -Dr switch (or the `re' module's
   `debug' option).

EXTENSIONS
   * `YAPE::Regex::Explain' 3.00
       Presents an explanation of a regular expression, node by
       node.

   * `YAPE::Regex::Reverse' (Not released)
       Reverses the nodes of a regular expression.

TO DO
   This is a listing of things to add to future versions of this
   module.

 API

   * Create a robust `extract' method
       Open to suggestions.

BUGS
   The following is a list of known or reported bugs.

 Pending

   * `use charnames ':full''
       To understand `\N{...}' properly, you must be using Perl
       5.6.0 or higher. However, the parser only knows how to
       resolve full names (those made using `use charnames
       ':full''). There might be an option in the future to specify
       a class name.

SUPPORT
   Visit `YAPE''s web site at http://www.pobox.com/~japhy/YAPE/.

SEE ALSO
   The `YAPE::Regex::Element' documentation, for information on the
   node classes. Also, `Text::Balanced', Damian Conway's excellent
   module, used for the matching of `(?{ ... })' and `(??{ ... })'
   blocks.

AUTHOR
     Jeff "japhy" Pinyan
     CPAN ID: PINYAN
     http://www.pobox.com/~japhy/