Is there an equivalent of XML::DOM's dispose method in XML::LibXML?
I have to parse many XML files and I don't want to have memory problems.
I can't speak for XML::LibXML, but XML::Twig definitely has a purge method. This is useful for large XML documents, because it lets you discard data you have already processed.
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

sub print_and_purge {
    my ( $twig, $element ) = @_;

    ## do something with this element
    print "-----\n";
    $element->print;
    print "-----\n";

    $twig->purge;
}

my $twig = XML::Twig->new(
    twig_handlers => { element => \&print_and_purge },
    pretty_print  => 'indented',
);
$twig->parse( \*DATA );
__DATA__
<root>
<element>fish</element>
<element>
<subelement>content</subelement>
</element>
<element/>
</root>
Note - this is an illustrative example - it's not a particularly useful thing to do, because you're printing the 'elements' and discarding the 'root'. But it's more useful if you're extracting information from your XML.
Also of interest is the flush method, which is probably more relevant if you're doing the above:
Flushes a twig up to (and including) the current element, then deletes all unnecessary elements from the tree that's kept in memory. flush keeps track of which elements need to be open/closed, so if you flush from handlers you don't have to worry about anything. Just keep flushing the twig every time you're done with a sub-tree and it will come out well-formed. After the whole parsing don't forget to flush one more time to print the end of the document. The doctype and entity declarations are also printed.
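As a rough sketch of how that could look in practice, the handler below calls flush instead of purge, so each processed sub-tree is printed to STDOUT and then released. The inline $xml string and the $seen counter are made up for illustration; they are not part of XML::Twig.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Illustrative input, inlined so the script is self-contained.
my $xml = '<root><element>fish</element>'
        . '<element><subelement>content</subelement></element>'
        . '<element/></root>';

my $seen = 0;    # hypothetical counter, only here to show the handler ran

my $twig = XML::Twig->new(
    twig_handlers => {
        element => sub {
            my ( $t, $element ) = @_;
            $seen++;
            # Print everything parsed so far, then drop it from memory.
            $t->flush;
        },
    },
);

$twig->parse($xml);

# One last flush to print the end of the document (the closing root tag).
$twig->flush;
print "\n";
```

Because flush prints as it goes, the output should be the complete document again, while only the current sub-tree is held in memory at any time.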
XML::LibXML has no mechanism to explicitly release memory, because release happens automatically as soon as possible: resources are freed as references to them are relinquished. Use properly scoped variables, and you'll be OK.
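A minimal sketch of what "properly scoped" could mean when parsing many documents: keep the document object in a lexical variable inside a sub or loop body, and return only plain Perl data. The count_elements helper and the inline sample documents are hypothetical; for real files you would pass location => $file to load_xml instead of string => $xml.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: parse one document and return only a plain number.
sub count_elements {
    my ($xml) = @_;

    my $doc   = XML::LibXML->load_xml( string => $xml );
    my @nodes = $doc->findnodes('//element');

    # $doc and @nodes go out of scope on return, so nothing keeps
    # the parsed tree alive after this point.
    return scalar @nodes;
}

# Stand-ins for the many XML files being processed.
my @documents = (
    '<root><element>fish</element><element/></root>',
    '<root><element><subelement>content</subelement></element></root>',
);

for my $xml (@documents) {
    print count_elements($xml), "\n";
}
```

The thing to avoid is stashing nodes or documents in a long-lived array or hash, since any surviving reference keeps the underlying tree in memory.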
If you actually do want to trim a document tree (though I don't see why), you can use the following to remove a node from the tree:
$node->parentNode->removeChild($node);
The node (and its children) will be freed when the last reference to it goes away. Usually, this is as soon as $node goes out of scope.
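For illustration, here is one way that removal might be used while walking a document: each element is handled and then detached from its parent. The sample XML is invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML->load_xml(
    string => '<root><element>fish</element>'
            . '<element><subelement>content</subelement></element>'
            . '<element/></root>'
);

for my $node ( $doc->findnodes('/root/element') ) {
    # Do something with the node first...
    print "text: '", $node->textContent, "'\n";

    # ...then detach it from the tree.
    $node->parentNode->removeChild($node);
}
# The loop variable is gone here, so the detached nodes have no
# remaining references from this script.

my @left = $doc->findnodes('/root/element');
print scalar(@left), " element(s) left in the tree\n";
```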
You can simply call $node->unbindNode and drop all references to the node and any of its children, child attributes, etc. Then the internal xmlNode will be destroyed.
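A small sketch of that approach, assuming the only reference to the node is a lexical variable in a bare block (the sample XML is made up):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML->load_xml(
    string => '<root><element>fish</element><element/></root>'
);

{
    # Take a reference to the first element only.
    my ($node) = $doc->findnodes('/root/element[1]');
    print "processing: ", $node->textContent, "\n";

    # Detach the node from its parent and siblings.
    $node->unbindNode;
}   # $node leaves scope here, dropping the last reference to it

my @remaining = $doc->findnodes('/root/element');
print scalar(@remaining), " element(s) still in the document\n";
```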