NAME
Test::XML::Easy - test XML with XML::Easy

SYNOPSIS
use Test::More tests => 2;
use Test::XML::Easy;

is_xml $some_xml, <<'ENDOFXML', "a test";
<?xml version="1.0" encoding="latin-1"?>
<foo>
<bar/>
<baz buzz="bizz">fuzz</baz>
</foo>
ENDOFXML

is_xml $some_xml, <<'ENDOFXML', { ignore_whitespace => 1, description => "my test" };
<foo>
<bar/>
<baz buzz="bizz">fuzz</baz>
</foo>
ENDOFXML

isnt_xml $some_xml, $some_xml_it_must_not_be;

is_well_formed_xml $some_xml;

DESCRIPTION
A simple testing tool, with only pure Perl dependencies, that checks if
two XML documents are "the same". In particular this module will check
if the documents are semantically equal as defined by the XML 1.0
specification (i.e. that the two documents would construct the same DOM
model when parsed, so things like character sets and whether you've used
two tags or a self-closing tag aren't important).

This module's interface is a strict superset of Test::XML's interface,
meaning if you were using that module to check if two identical
documents were the same then this module should function as a drop-in
replacement. Be warned, however, that this module by default is a lot
stricter about how the XML documents are allowed to differ.

Functions

This module, by default, exports a number of functions into your
namespace.

is_xml($xml_to_test, $expected_xml[, $options_hashref])
Tests that the passed XML is "the same" as the expected XML.

XML can be passed into this function in one of two ways: either you
can provide a string (which the function will parse for you) or you
can pass in XML::Easy::Element objects that you've constructed
yourself somehow.

This function takes several options as the third argument. These can
be passed in as a hashref:

description
The name of the test that will be used in constructing the `ok'
/ `not ok' test output.

ignore_whitespace
Ignore many whitespace differences in text nodes. Currently this
has the same effect as turning on
`ignore_surrounding_whitespace' and
`ignore_different_whitespace'.

ignore_surrounding_whitespace
Ignore differences in leading and trailing whitespace between
elements. This means that

<p>foo bar baz</p>

Is considered the same as

<p>
foo bar baz
</p>

And even

<p>
this is my cat:<img src="http://myfo.to/KsSc.jpg" />
</p>

Is considered the same as:

<p>
this is my cat: <img src="http://myfo.to/KsSc.jpg" />
</p>

Even though, to a web browser, that extra space is significant
whitespace and the two documents would be rendered differently.

However, as comments are completely ignored (we treat them as if
they were never even in the document) the following:

<p>foo<!-- a comment -->bar</p>

would be considered different to

<p>
foo
<!-- a comment -->
bar
</p>

As it's the same as comparing the string

"foobar"

And:

"foo

bar"

The same is true for processing instructions and DTD
declarations.

ignore_leading_whitespace
The same as `ignore_surrounding_whitespace' but only ignore the
whitespace immediately after an element start or end tag, not
immediately before.

ignore_trailing_whitespace
The same as `ignore_surrounding_whitespace' but only ignore the
whitespace immediately before an element start or end tag, not
immediately after.

ignore_different_whitespace
If set to a true value ignores differences in what characters
make up whitespace in text nodes. In other words, this option
makes the comparison only care that wherever there's whitespace
in the expected XML there's any whitespace in the actual XML at
all, not what that whitespace is made up of.

It means the following

<p>
foo bar baz
</p>

Is the same as

<p>
foo
bar
baz
</p>

But not the same as

<p>
foobarbaz
</p>

This setting has no effect on attribute comparisons.

verbose
If true, print obsessive amounts of debug info while checking
things.

show_xml
This prints out in the diagnostic messages the expected and
actual XML on failure.
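
The whitespace options above can be sketched as follows. This is a minimal illustration rather than part of the module's own test suite; the documents and test names are made up for the example.

```perl
use strict;
use warnings;
use Test::More tests => 3;
use Test::XML::Easy;

# Hypothetical documents, used purely for illustration.
my $compact  = '<p>foo bar baz</p>';
my $padded   = "<p>\n  foo bar baz\n</p>";
my $newlines = "<p>foo\nbar\nbaz</p>";

# Only the whitespace next to the tags differs, so
# ignore_surrounding_whitespace is enough for this to pass.
is_xml $padded, $compact, {
    ignore_surrounding_whitespace => 1,
    description                   => 'surrounding whitespace ignored',
};

# The words are separated by newlines rather than spaces, so
# ignore_different_whitespace is needed for this to pass.
is_xml $newlines, $compact, {
    ignore_different_whitespace => 1,
    description                 => 'kind of whitespace ignored',
};

# With no options the comparison is strict, so the padded
# document counts as different.
isnt_xml $padded, $compact, 'strict by default';
```

Passing `ignore_whitespace => 1' instead would turn on both behaviours at once.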

If a third argument is passed to this function and that argument is
not a hashref then it will be assumed that this argument is the
description as described above, i.e.

is_xml $xml, $expected, "my test";

is the same as

is_xml $xml, $expected, { description => "my test" };
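
As a rough sketch of the two ways of passing XML in, the following compares an already-parsed element against an expected string. It assumes `xml10_read_document' from XML::Easy::Text, which returns the root XML::Easy::Element of the document it parses; the documents themselves are made up.

```perl
use strict;
use warnings;
use Test::More tests => 2;
use Test::XML::Easy;
use XML::Easy::Text qw(xml10_read_document);

# Hypothetical expected document, passed as a plain string.
my $expected = '<foo><bar/></foo>';

# Parse the document under test ourselves, giving an
# XML::Easy::Element object instead of a string.
my $element = xml10_read_document('<foo><bar></bar></foo>');

# The description can be given as a plain string...
is_xml $element, $expected, 'element object against string';

# ...or inside the options hashref.
is_xml $element, $expected, { description => 'same test, hashref form' };
```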

isnt_xml($xml_to_test, $not_expected_xml[, $options_hashref])
Exactly the same as `is_xml' (taking exactly the same options) but
passes if and only if what is passed is different to the not
expected XML.

By different, of course, we mean semantically different according
to the XML 1.0 specification. For example, this will fail:

isnt_xml "<foo/>", "<foo></foo>";

as those are semantically the same XML documents.

However, it's worth noting that the first argument doesn't even have
to be valid XML for the test to pass. Both these pass as they're not
semantically identical to the not expected XML:

isnt_xml undef, $not_expected_xml;
isnt_xml "<foo>", $not_expected_xml;

as invalid XML is never semantically identical to a valid XML
document.

If you want to insist what you pass in is valid XML, but just not
the same as the other xml document you pass in then you can use two
tests:

is_well_formed_xml $xml;
isnt_xml $xml, $not_expected_xml;

This function accepts the `verbose' option (just as `is_xml' does)
but turning it on doesn't actually output anything extra - there's
nothing useful this function can output that would help you diagnose
the failure case.

is_well_formed_xml($string_containing_xml[, $description])
Passes if and only if the string passed contains well formed XML.

isnt_well_formed_xml($string_not_containing_xml[, $description])
Passes if and only if the string passed does not contain well formed
XML.
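
A minimal sketch of the two well-formedness tests, using made-up input strings:

```perl
use strict;
use warnings;
use Test::More tests => 2;
use Test::XML::Easy;

# Every tag is closed, so this string is well formed.
is_well_formed_xml '<foo><bar/></foo>', 'balanced document parses';

# The <foo> element is never closed, so this string is not.
isnt_well_formed_xml '<foo>', 'unclosed tag is rejected';
```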

A note on Character Handling

If you do not pass it an XML::Easy::Element object then these functions
will happily parse XML from the characters contained in whatever scalars
you passed in. They will not (and cannot) correctly parse data from a
scalar that contains binary data (e.g. that you've sucked in from a raw
file handle) as they would have no idea what characters those octets
would represent.

As long as your XML document contains legal characters from the ASCII
range (i.e. chr(1) to chr(127)) this distinction will not matter to you.

However, if you use characters above codepoint 127 then you will
probably need to convert any bytes you have read in into characters.
This is usually done by using `Encode::decode', or by using a PerlIO
layer on the filehandle as you read the data in.
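
The two conversion routes just mentioned might look like this. It is only a sketch: the input bytes are made up, UTF-8 is assumed as the source encoding, and an in-memory filehandle stands in for a real file.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical input: the UTF-8 bytes for <p>e-acute</p>, as they
# would arrive from a raw filehandle.
my $bytes = "<p>\xC3\xA9</p>";

# Option 1: decode the bytes into characters explicitly.
my $chars = decode('UTF-8', $bytes);

# Option 2: let a PerlIO layer do the decoding as the data is read.
open my $fh, '<:encoding(UTF-8)', \$bytes or die "Cannot open: $!";
my $from_layer = do { local $/; <$fh> };
close $fh;

# Either way the result is a character string, which is what
# should be handed to is_xml rather than the raw bytes.
printf "%d bytes, %d characters\n", length $bytes, length $chars;
```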

If you don't know what any of this means I suggest you read the
Encode::encode manpage very carefully. Tom Insam's slides at
http://jerakeen.org/talks/perl-loves-utf8/ may or may not help you
understand this more (they at the very least contain a cheatsheet for
conversion.)

The author highly recommends those of you using latin-1 characters from
a utf-8 source to use Test::utf8 to check the string for common mistakes
before handing it to `is_xml'.

AUTHOR
Mark Fowler, `<[email protected]>'

Copyright 2009 PhotoBox, All Rights Reserved.

This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

BUGS
There are a few caveats when using this module:

Not a validating parser
In fact, we don't process (or compare) DTDs at all. These nodes are
completely ignored (it's as if you didn't include them in the string
at all.)

Comments and processing instructions are ignored
We totally ignore comments and processing instructions, and it's as
if you didn't include them in the string at all either.

Limited entity handling
We only support the five "core" named entities (i.e. `&amp;',
`&lt;', `&gt;', `&apos;' and `&quot;') and numerical character
references (in decimal or hex form.) It is not possible to declare
further named entities and the presence of undeclared named entities
will either cause an exception to be thrown (in the case of the
expected string) or the test to fail (in the case of the string you
are testing).

No namespace support
Currently this is only an XML 1.0 parser, and not XML Namespaces
aware (further options may be added to later versions of this module
to enable namespace support).

This means the following document:

<foo:fred xmlns:foo="http://www.twoshortplanks.com/namespaces/test/fred" />

Is considered to be different to

<bar:fred xmlns:bar="http://www.twoshortplanks.com/namespaces/test/fred" />

XML whitespace handling
This module considers "whitespace" to be what the XML specification
considers to be whitespace. This is subtly different to what Perl
considers to be whitespace.

No node reordering support
Unlike Test::XML this module considers the order of sibling nodes to
be significant, and you cannot tell it to ignore the differing
order of nodes when comparing the expected and actual output.
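
Some of the caveats above can be sketched as tests. The documents are made up for illustration (the namespace URI is the one used earlier in this section), and the expected results follow from the behaviour described above rather than from any extra guarantee.

```perl
use strict;
use warnings;
use Test::More tests => 3;
use Test::XML::Easy;

# Numerical character references are supported, so decimal and hex
# references to the same character compare equal to the literal.
is_xml '<p>&#65;&#x42;</p>', '<p>AB</p>', 'character references resolved';

# No namespace support: the same namespace URI bound to different
# prefixes still counts as a different document.
my $ns = 'http://www.twoshortplanks.com/namespaces/test/fred';
isnt_xml qq{<foo:fred xmlns:foo="$ns" />},
         qq{<bar:fred xmlns:bar="$ns" />},
         'prefixes are compared literally';

# No node reordering support: sibling order is significant.
isnt_xml '<list><a/><b/></list>', '<list><b/><a/></list>',
         'sibling order matters';
```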

Please see http://twoshortplanks.com/dev/testxmleasy for details of how
to submit bugs, access the source control for this project, and contact
the author.

SEE ALSO
Test::More (for instructions on how to test), XML::Easy (for info on the
underlying XML parser) and Test::XML (for a similar module that tests
using XML::SemanticDiff).