NAME
Data::Checker - a framework for checking data validity
SYNOPSIS
use Data::Checker;
$obj = Data::Checker->new;
DESCRIPTION
A commonly performed task is to have a set of data that you want to
validate. Given a set of elements, you want to test each to make sure
that it is valid, and then break the set into two groups: the group that
is valid, and the group that is not. With the group that is not valid,
you usually want an error message associated with that element so you
can see why it is not valid.
Although this is an extremely common task, there is no convenient
framework for expressing these tests, which means that every time you
want to do this kind of testing, you have to write not only the
functions that do the tests but also the entire framework around them.
This module was written to provide the framework around the tests. A
number of common test functions are provided, or you can write your own,
and the framework will take care of the rest.
The framework includes the following commonly desired functionality:
Automatic handling of the testing
A list of elements is passed in to the framework, and it will
automatically apply the tests and split the elements into sets of
passing and failing elements.
Running tests in parallel
Testing a piece of data often takes a significant amount of time, and
running the tests in parallel can greatly speed up the process.
This framework allows you to run any number of the tests in
parallel, or you can run them serially, one at a time.
Support for warnings and information messages
Sometimes you want a failed test to produce only a warning or an
informational message for an element, while the element is still
considered to have passed.
The level for each test can be specified so that a failure produces
an informational notice, a warning, or an error. Only an error means
that the element fails the test.
BASE METHODS
new
$obj = Data::Checker->new;
This creates a new data check framework.
version
$vers = $obj->version;
This returns the version of this module.
parallel
$obj->parallel($n);
In many cases, tests can be run in parallel to speed things up. By
default, all tests will be run serially (one at a time), but that
behavior can be changed using this method. $n must be a non-negative
integer:
$n=1   All tests are run serially. This is the default.
$n>1   $n tests will run at a time. If there are more
       elements than this, one will have to finish before
       another will start.
$n=0   All of the elements will be tested at the same time.
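The rule for $n can be summarized in a small helper. This is only an
illustrative sketch of the behavior described above, not the module's
own code; the name max_concurrent is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: how many elements may be tested at once
# for a given $n and a given number of elements.
sub max_concurrent {
    my ($n, $num_elements) = @_;
    die "n must be a non-negative integer\n"
        unless defined $n && $n =~ /^\d+$/;
    return $num_elements if $n == 0;                 # 0: everything at once
    return $n < $num_elements ? $n : $num_elements;  # never more than we have
}

printf "n=%d -> %d at a time\n", $_, max_concurrent($_, 10) for 1, 4, 0;
```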
check
($pass,$fail,$warn,$info) = $obj->check($data,$check,$check_opts);
This is the function which actually performs the checks. It takes a
set of elements ($data) and verifies them using the checks specified
by $check and $check_opts. It returns a list of elements that pass
the check and a list that fail the check. In addition, informational
notes and warnings about the elements may also be returned.
The data is passed in as a single data structure ($data) as
described below in the "SPECIFYING DATA" section.
$check specifies what function to use to perform a check. It will be
used to test an individual element to determine whether it passes or
fails a check. This is described below in the "CHECK FUNCTION"
section.
$check_opts is a hashref that contains options specifying exactly
how the check is to be performed, and it will be passed to the check
function. This is described more fully below in the "CHECK OPTIONS"
section.
SPECIFYING DATA
Data is passed in to the check method in one of two forms.
The simplest form is a listref. For example:
$data = [ 'cow', 'horse', 'zebra', 'oak' ];
Many tests do not require any more than this. For these, the elements
that pass are also returned as a listref. Order is NOT preserved in the
output ($pass and $fail).
Some tests, however, rely on a description of each element. For these,
the data is passed in as a hashref where each key is one data element
and the value is a description of that element (which will typically be
a hashref, but might be a scalar, a listref, or some other type of
description, and will be documented with the function doing the check).
For example:
$data = { 'apple' => { 'type'  => 'fruit',
                       'color' => 'red' },
          'pear'  => { 'type'  => 'fruit',
                       'color' => 'yellow' },
          'bean'  => { 'type'  => 'vegetable',
                       'color' => 'green' }
        };
As mentioned, the exact form of the description will be documented with
the function that is used to do the checks.
When data is passed in as a hashref, the list of elements that pass is
also a hashref with the description fully preserved.
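The following sketch shows how the two input forms might be handled. It
is a stand-in written for illustration, not the framework itself: the
name split_elements is hypothetical, the check callback uses a
simplified signature (element and description in, list of error
messages out) rather than the real calling convention, and the shape of
the failure result (element => listref of messages) is an assumption.

```perl
use strict;
use warnings;

# Illustrative stand-in for the framework: apply $check to every element
# and split the results. The shape of the failure hash is an assumption.
sub split_elements {
    my ($data, $check) = @_;
    my $is_hash  = ref($data) eq 'HASH';
    my @elements = $is_hash ? sort keys %$data : @$data;

    my $pass = $is_hash ? {} : [];
    my %fail;
    for my $element (@elements) {
        my $description = $is_hash ? $data->{$element} : undef;
        my @errors = $check->($element, $description);
        if (@errors) {
            $fail{$element} = \@errors;
        } elsif ($is_hash) {
            $pass->{$element} = $description;   # description is preserved
        } else {
            push @$pass, $element;
        }
    }
    return ($pass, \%fail);
}

my $data = { apple => { type => 'fruit',     color => 'red'   },
             bean  => { type => 'vegetable', color => 'green' } };
my ($pass, $fail) = split_elements($data, sub {
    my ($element, $description) = @_;
    return $description->{type} eq 'fruit' ? () : ("$element is not a fruit");
});
print "passed: @{[ sort keys %$pass ]}\n";
print "failed: @{[ sort keys %$fail ]}\n";
```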
CHECK FUNCTION
All checks are performed by a function which takes a single element and
tests it to see if it passes. It may perform only a single check on an
element, or multiple checks.
All check functions take the same set of arguments, and all return the
same set of values.
The $check argument in the check method provides a pointer to where the
check function can be found.
$check can be a coderef, in which case you are passing the check
function in directly. Alternatively, $check can be a string naming the
check function, or the namespace where it is found.
If $check is a string, the Data::Checker framework will look for a check
function based on that string. As an example, if $check is Foo::Bar, the
following locations will be examined to see if they are a function:
Foo::Bar
Foo::Bar::check
CALLER::Foo::Bar
CALLER::Foo::Bar::check
Data::Checker::Foo::Bar
Data::Checker::Foo::Bar::check
where CALLER is the package of the calling routine. The first one which
refers to a function will be used. The appropriate module will be loaded
as necessary.
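The lookup order above can be sketched as follows. This is an
illustration under stated assumptions, not the module's implementation:
candidate_names and find_check are hypothetical helpers, My::App is a
made-up caller package, and the module-loading step is left out.

```perl
use strict;
use warnings;

# Build the candidate names in the documented order.
sub candidate_names {
    my ($check, $caller) = @_;
    return map { ($_, "${_}::check") }
           ($check, "${caller}::$check", "Data::Checker::$check");
}

# Return the first candidate that is a defined function, or nothing.
# (Module loading is deliberately left out of this sketch.)
sub find_check {
    my ($check, $caller) = @_;
    no strict 'refs';
    for my $name (candidate_names($check, $caller)) {
        return $name if defined &{$name};
    }
    return;
}

# Hypothetical check function living under the caller's namespace.
sub My::App::Foo::Bar::check { return ($_[1], [], [], []) }

print "$_\n" for candidate_names('Foo::Bar', 'My::App');
print "found: ", find_check('Foo::Bar', 'My::App'), "\n";
```

Here the first four candidates are examined in turn, and the search
stops at My::App::Foo::Bar::check because it is the first name that
refers to a defined function.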
A check function is always called as follows:
($element,$err,$warn,$info) =
FUNCTION($obj,$element,$description,$check_opts);
The arguments to the check function are:
$obj
$obj is the Data::Checker object that was created, and is passed in
to give the check function access to some useful methods provided by
the framework. These methods are described below in the "CHECK
FUNCTION METHODS" section.
$element, $description
$element is the element being tested, and $description is the
description of that element.
If the list of elements was specified as a listref, $element will be
one value from that listref and $description will be undef.
If the list of elements was specified as a hashref, $element will be
one of the keys from that hashref and $description will be the value
of that key.
$check_opts
$check_opts is the hashref that was passed in to the check method
and is described in the "CHECK OPTIONS" section.
The check function always returns the following values:
$element
This is the element that was passed in as an argument. It must be
returned so that when parallel testing is done, the parent can
easily determine which element was being checked by a finished
thread.
$err
This is a listref of error messages. If this is undefined or empty,
then the element passed the test.
$warn, $info
These are listrefs of warnings and informational messages about this
element. These are optional.
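A check function that follows this calling convention might look like
the sketch below. The function check_animal and its list of animals are
made up for illustration. It is called directly here, without the
framework, so undef is passed for $obj, which this check does not use.

```perl
use strict;
use warnings;

# Hypothetical check: an element passes if it is a known animal.
# Arguments and return values follow the calling convention above.
sub check_animal {
    my ($obj, $element, $description, $check_opts) = @_;
    my (@err, @warn, @info);
    my %animal = map { $_ => 1 } qw(cow horse zebra);

    push @err,  "$element is not an animal"   unless $animal{$element};
    push @info, "$element has no description" unless defined $description;

    return ($element, \@err, \@warn, \@info);
}

# Called directly; an empty $err listref means the element passed.
for my $e (qw(cow oak)) {
    my ($element, $err, $warn, $info) = check_animal(undef, $e, undef, {});
    print "$element: ", (@$err ? "FAIL (@$err)" : "pass"), "\n";
}
```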
CHECK OPTIONS
Options may be passed in to the check function as a hashref. The form of
the hashref (what keys/values are allowed) is documented with the check
function, but the general form is:
$check_opts = { GLOBAL_OPT_1 => GLOBAL_VAL_1,
                GLOBAL_OPT_2 => GLOBAL_VAL_2, ...
                CHECK_A      => { OPT_A1 => VAL_A1,
                                  OPT_A2 => VAL_A2, ... },
                CHECK_B      => { OPT_B1 => VAL_B1,
                                  OPT_B2 => VAL_B2, ... },
                ... }
There are two types of keys in $check_opts: ones which set global
options (which apply to all possible checks that could be done), and
ones which define exactly what types of checks are performed and the
options that apply only to that check.
A check-specific option overrides a global option of the same name.
The following options are standard:
level => err, warn, info
The level option (which defaults to 'err') can be set to 'warn' or
'info'. If it is, then any element which fails this check will
produce the appropriate type of message. It will only result in a
failure if it is set to 'err'.
negate => 1
The negate option can be set to negate the test (i.e. what would
have been deemed a success is actually a failure, and vice versa).
message => [ STRING, STRING, ... ]
The message to return if a check fails. The string __ELEMENT__ will
be replaced by the element being checked.
For example, when doing DNS checks, you might want to specify exactly
which server to use, check that a host is defined in DNS (and produce an
error if it is not), and warn if it does not have a unique IP. This
might be done by passing in:
$check_opts = { 'nameservers' => 'my.dns.server',
                'dns'         => undef,
                'unique'      => { 'level' => 'warn' } };
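The override rule can be sketched with a small resolver. This is an
assumption-laden illustration, not the module's code: resolve_option is
a hypothetical name, and the sketch simply treats any top-level key as
a possible global option, since how the framework tells global options
apart from check labels is not described here.

```perl
use strict;
use warnings;

# Hypothetical resolver: a check-specific option wins over a global
# option of the same name, which wins over the supplied default.
sub resolve_option {
    my ($check_opts, $label, $opt, $default) = @_;
    my $specific = $check_opts->{$label};
    return $specific->{$opt}
        if ref($specific) eq 'HASH' && exists $specific->{$opt};
    return $check_opts->{$opt} if exists $check_opts->{$opt};
    return $default;
}

my $check_opts = { nameservers => 'my.dns.server',
                   dns         => undef,
                   unique      => { level => 'warn' } };

for my $label (qw(dns unique)) {
    printf "%-6s level=%s nameservers=%s\n", $label,
        resolve_option($check_opts, $label, 'level', 'err'),
        resolve_option($check_opts, $label, 'nameservers');
}
```

With these options the 'dns' check falls back to the default level
'err', while 'unique' uses its own 'warn' level. Both see the global
'nameservers' value.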
CHECK FUNCTION METHODS
In addition to the base methods listed above, the Data::Checker object
also includes the following methods which are intended to be called
inside a check function.
check_performed
$flag = $obj->check_performed($check_opts,$label);
This checks $check_opts for the existence of a key named $label,
which indicates that the check should be performed.
check_level
$level = $obj->check_level($check_opts [,$label]);
Check to see what level ('err', 'warn', or 'info') the check is. If
a check is 'err' level, then if it fails, it produces an error. If
it is 'warn' level, it produces a warning, but the check is marked
as passing. If it is 'info', then if the check fails, it produces
an informational message, but the check is marked as passing.
check_option
$val = $obj->check_option($check_opts,$opt [,$default [,$label]]);
This returns the value of the given option ($opt) for this check
($label).
If the option is not found, $default is returned (if it is given).
check_message
$obj->check_message($check_opts,$label,$element,$default_message,
$level,$err,$warn,$info);
This produces a message indicating that the check failed and stores
it in the appropriate listref.
If the 'message' option is available, that message is used.
Otherwise, $default_message will be used.
The message can be a string or a listref (a multi-line message). The
string __ELEMENT__ will be replaced by the element being tested.
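The placeholder substitution might work along these lines. The helper
expand_message is hypothetical and covers only the string/listref
handling and the __ELEMENT__ replacement described above, not the
storing of the message in $err, $warn, or $info.

```perl
use strict;
use warnings;

# Hypothetical helper: normalize a message (string or listref) to a list
# of lines and substitute the element for the __ELEMENT__ placeholder.
# The caller's message is copied, so it is left unmodified.
sub expand_message {
    my ($message, $element) = @_;
    my @lines = ref($message) eq 'ARRAY' ? @$message : ($message);
    s/__ELEMENT__/$element/g for @lines;
    return @lines;
}

print "$_\n" for expand_message(
    [ 'Host __ELEMENT__ is not in DNS.', 'Check the spelling of __ELEMENT__.' ],
    'foo.example.com');
```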
check_value
$obj->check_value($check_opts,$label,$element,$value,
$std_fail,$negate_fail,
$err,$warn,$info);
This will test to see if a check passed or failed. It takes a value
($value) and if it evaluates to true, then by default the check
passes (unless the 'negate' option is present in which case it
fails).
The $std_fail is a message (either a string or a listref of strings)
that will be given when the check fails and the 'negate' option is
not set. $negate_fail is a similar message that will be given when
the check fails but the 'negate' option is set.
$err, $warn, and $info are listrefs containing the messages.
If $err is non-empty, an error has occurred.
If $negate_fail is empty, the 'negate' option will be
ignored. This is typically used to test an element to see if it is
the right type of data for this check. If it isn't, other types of
checks are typically not able to run.
If $label is empty, the test is always performed.
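The pass/fail rule, including the effect of 'negate' and of an empty
$negate_fail, can be laid out as a small truth table. The helper
value_outcome is hypothetical and handles only string messages. It
illustrates the rule described above rather than reproducing the method.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the rule described above: returns the
# failure message to report, or undef if the check passed.
sub value_outcome {
    my ($value, $negate, $std_fail, $negate_fail) = @_;
    # An empty $negate_fail means the 'negate' option is ignored.
    $negate = 0 unless defined $negate_fail && length $negate_fail;
    if ($negate) {
        return $value ? $negate_fail : undef;
    }
    return $value ? undef : $std_fail;
}

for my $case ([1, 0], [0, 0], [1, 1], [0, 1]) {
    my ($value, $negate) = @$case;
    my $msg = value_outcome($value, $negate, 'not found', 'unexpectedly found');
    printf "value=%d negate=%d -> %s\n", $value, $negate, $msg // 'pass';
}
```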
KNOWN BUGS AND LIMITATIONS
None known.
SEE ALSO
Data::Checker::DNS
Predefined DNS checks.
Data::Checker::Ping
Predefined checks to see if a host responds to pings.
Data::Checker::IP
Predefined checks to see if an element is a valid IP.
LICENSE
This script is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
AUTHOR
Sullivan Beck