I'm new to using Perl XML::SAX and I encountered a problem with the characters event that is triggered. I'm trying to parse a very large XML file using perl.
My goal is to get the content of each tag (I do not know the tag names - given any xml file, I should be able to crack the record pattern and return every record with its data and tag like Tag:Data).
While working with small files, everything is ok. But when running on a large file, the characters{} event does partial reading of the content. There is no specific pattern in the way it cuts down the reading. Sometimes its the starting few characters of data and sometimes its last few characters and sometimes its just one letter from the actual data.
The Sax Parser is:
$myhandler = MyFilter->new();
$parser = XML::SAX::ParserFactory->parser(Handler => $myhandler);
$parser->parse_file($filename);
And, I have written my own Handler called MyFilter and overridding the character method of the parser.
sub characters {
my ($self, $element) = @_;
$globalvar = $element->{Data};
print "content is: $globalvar \n";
}
Even this print statement, reads the values partially at times. I also tried loading the Parsesr Package before calling the $parser->parse() as:
$XML::SAX::ParserPackage = "XML::SAX::ExpatXS";
Stil doesn't work. Could anyone help me out here? Thanks in advance!