NAME
Text::FixedLengthMultiline - Parse text data formatted in space
separated columns optionnaly on multiple lines
SYNOPSIS
use Text::FixedLengthMultiline;
#234567890 12345678901234567890 12
my $text = <<EOT;
Alice Pretty girl!
Bob Good old uncle Bob,
very old. 92
Charlie Best known as Waldo 14
or Wally. Where's
he?
EOT
my $fmt = Text::FixedLengthMultiline->new(format => ['!name' => 10, 1, 'comment~' => 20, 1, 'age' => -2 ]);
# Compute the RegExp that matches the first line
my $first_line_re = $fmt->get_first_line_re();
# Compute the RegExp that matches a continuation line
my $continue_line_re = $fmt->get_continue_line_re();
my @data;
my $err;
while ($text =~ /^([^\n]+)$/gm) {
my $line = $1;
push @data, {} if $line =~ $first_line_re;
if (($err = $fmt->parse_line($line, $data[$#data])) > 0) {
warn "Parse error at column $err";
}
}
DESCRIPTION
A row of data can be splitted on multiple lines of text with cell
content flowing in the same column space.
FORMAT SPECIFICATION
The format is given at the contruction time as an array ref. Modifying
the array content after the construction call is done at your own risks.
The array contains the ordered sequence of columns. Each colmun can
either be:
* a positive integer representing the size of a separating column
which is expected to always be filled with spaces.
* a string that matches this regexp:
/^(?#mandatory)!?(?#name)[:alnum:]\w*(?:(?#multi)~(?#cont).?)?$/
* `!' means the column is mandatory
* `name' is the column name. This will be the key for the hash
after parsing.
* `~' means the column data can be on multiple lines.
METHODS
new()
Arguments:
* `format': an array reference following the FORMAT SPECIFICATION.
* `debug'
Example:
my $format = Text::FixedLengthMultiline->new(format => [ 2, col1 => 4, 1, '!col2' => 4 ]);
`parse_table($text)'
Parse a table.
my @table = $fmt->parse_table($text);
Returns an array of hashes. Each hash is a row of data.
`parse_line($line, $hashref)'
Parse a line of text and add parsed data to the hash.
my $error = $fmt->parse_line($line, \%row_data);
Multiple calls to `parse_line()' with the same hashref may be needed to
fully read a "logical line" in case some columns are multiline.
Returns:
* `-col': Parse error. The value is a negative integer indicating the
character position in the line where the parse error occured.
* `0': OK
* `col': Missing data: need to feed next line to fill remining
columns. The value is the character position of the column where
data is expected.
`get_first_line_re()'
Returns a regular expression that matches the first line of a "logical
line" of data.
my $re = $fmt->get_first_line_re();
`get_continue_line_re()'
Returns a regular expression that matches the 2nd line and the following
lines of a "logical line".
my $re = $fmt->get_continue_line_re();
Returns undef if the format specification does not contains any column
that can be splitted on multiples lines.
TODO
* `format()'
* `to_sprintf()'
* See TODO sections in tests bundled with the distribution.
BUGS
* This module should have been named Text::FixedLengthMultilineFormat,
but the current name is already long enough!
SUPPORT
You can look for information at:
* RT: CPAN's request tracker
http://rt.cpan.org/NoAuth/Bugs.html?Dist=Text-FixedLengthMultiline:
post bug report there.
* CPAN Ratings
http://cpanratings.perl.org/p/Text-FixedLengthMultline: if you use
this distibution, please add comments on your experience for other
users.
* Search CPAN
http://search.cpan.org/dist/Text-FixedLengthMultiline/
* AnnoCPAN: Annotated CPAN documentation
http://annocpan.org/dist/Text-FixedLengthMultiline
LICENSE
Copyright (c) 2005-2010 Olivier Mengu�. All rights reserved.
This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
AUTHOR
Olivier Mengu�, <
[email protected]>
SEE ALSO
Related modules I found on CPAN:
* Text::FormatTable
* Text::Table
* Text::FixedLength
* Text::FixedLength::Extra
* Text::Column