XML::LibXML is a validating parser. By default, parsing throws an exception if the XML is not well-formed, so you can use it to check the file.
use XML::LibXML qw( );
my $parser = XML::LibXML->new();
if (eval { $parser->parse_file($qfn) }) {
    print "ok\n";
} else {
    print "error:\n$@";
}
Automatically correcting XML is another matter. It's impossible to automatically fix bad XML without making huge assumptions. For example, there's no way to know whether
<foo>/bar<baz/</foo>
was meant to be
<foo>/bar&lt;baz/</foo>
or
<foo>/bar<baz/></foo>
or even something else.
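To make the ambiguity concrete, here is a minimal sketch that parses the broken input and two plausible repairs: one where the stray `<` is escaped as `&lt;`, and one where the `baz` tag gets its missing `>`. The `summarize` helper is a hypothetical name introduced for illustration; it only uses `parse_string`, `documentElement`, `findnodes`, and `textContent`.

```perl
use strict;
use warnings;
use XML::LibXML qw( );

# Hypothetical helper: parse a string strictly and summarize the root element.
# Returns (text content, number of <baz> children), or an empty list
# if the string is not well-formed.
sub summarize {
    my ($xml) = @_;
    my $doc = eval { XML::LibXML->new()->parse_string($xml) }
        or return;
    my $root = $doc->documentElement;
    my @baz  = $root->findnodes('baz');
    return ( $root->textContent, scalar @baz );
}

for my $xml (
    '<foo>/bar<baz/</foo>',       # the broken input
    '<foo>/bar&lt;baz/</foo>',    # repair 1: the "<" was literal text
    '<foo>/bar<baz/></foo>',      # repair 2: the tag was missing its ">"
) {
    if ( my ($text, $count) = summarize($xml) ) {
        print "$xml => text [$text], $count baz element(s)\n";
    } else {
        print "$xml => not well-formed\n";
    }
}
```

Both repairs are well-formed, but they yield different documents: the first has the text `/bar<baz/` and no child element, while the second has the text `/bar` and one `baz` child. Nothing in the broken input says which one the author intended.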
XML::LibXML does have an option, recover, to automatically fix or ignore some errors, but there is no guarantee that it makes the same assumptions you would. To use it:
use XML::LibXML qw( );
my $parser = XML::LibXML->new( recover => $recover );
my $doc = $parser->parse_file($in_qfn);
$doc->toFile($out_qfn);
Use 1 for $recover if you want the parser to warn when it fixes a problem.
Use 2 for $recover if you want the parser to fix problems silently.
No matter what you use for $recover, it will still throw an exception if it encounters an unrecoverable error.
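Since an exception is still possible, it may help to wrap the recovering parse in `eval` as well. The following is a minimal sketch, not a definitive implementation: `repair_xml` is a hypothetical helper name, and it works on a string via `parse_string` and `toString` rather than on files, purely to keep the example self-contained.

```perl
use strict;
use warnings;
use XML::LibXML qw( );

# Hypothetical helper: try to parse $xml in recover mode and return the
# re-serialized document, or undef if the parser hit an unrecoverable error.
# $recover is 1 (warn about fixes) or 2 (fix silently).
sub repair_xml {
    my ($xml, $recover) = @_;
    my $parser = XML::LibXML->new( recover => $recover );
    my $doc = eval { $parser->parse_string($xml) };
    if (!$doc) {
        warn "unrecoverable error:\n$@";
        return;
    }
    return $doc->toString();
}

# Input that is already well-formed passes through unchanged in structure.
print repair_xml('<foo><bar/></foo>', 2);
```

For malformed input, what comes back depends on libxml2's recovery heuristics, so it is worth inspecting the result before trusting it:

```perl
my $fixed = repair_xml('<foo><bar>text</foo>', 2);
print defined $fixed ? $fixed : "could not recover\n";
```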