
Does there exist an equivalent for the method dispose of XML::DOM in XML::LibXML?

I have to parse many XML files and I don't want to have memory problems.

Kevin Panko
Eranos
  • There should be no need to explicitly dispose of your XML document. Have you tried it? There's no point in fixing imaginary problems – Borodin Jun 05 '15 at 10:12

3 Answers


I can't speak for XML::LibXML but XML::Twig definitely has a purge method. This is useful for large XML documents so you can discard 'processed' data that you may have already handled.

#!/usr/bin/perl

use strict;
use warnings;

use XML::Twig;

sub print_and_purge {
    my ( $twig, $element ) = @_;

    # Do something with this element; here we just print it.
    print "-----\n";
    $element->print;
    print "-----\n";

    # Discard everything parsed so far to free its memory.
    $twig->purge;
}

my $twig = XML::Twig->new(
    twig_handlers => { element => \&print_and_purge },
    pretty_print  => 'indented',
);

$twig->parse( \*DATA );

__DATA__
<root>
    <element>fish</element>
    <element>
        <subelement>content</subelement>
    </element>
    <element/>
</root>

Note: this is an illustrative example. On its own it's not a particularly useful thing to do, because you're printing the 'elements' and discarding the 'root'. It's more useful if you're extracting information from your XML.

Also of interest is the flush method, which is probably more relevant if you're doing the above:

Flushes a twig up to (and including) the current element, then deletes all unnecessary elements from the tree that's kept in memory. flush keeps track of which elements need to be open/closed, so if you flush from handlers you don't have to worry about anything. Just keep flushing the twig every time you're done with a sub-tree and it will come out well-formed. After the whole parsing don't forget to flush one more time to print the end of the document. The doctype and entity declarations are also printed.
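As a rough sketch of the flush pattern described above (not taken from the original answer): the handler below marks each element, flushes it to STDOUT, and a final flush after parsing closes the document. The `seen` attribute, the `$flushed` counter and the sample XML are made up for illustration.

```perl
use strict;
use warnings;

use XML::Twig;

# Illustrative input; in practice this would be a large document.
my $xml = '<root><element>fish</element><element>chips</element></root>';

my $flushed = 0;    # hypothetical counter, only here to show the handler runs

my $twig = XML::Twig->new(
    twig_handlers => {
        element => sub {
            my ( $t, $element ) = @_;

            # Hypothetical edit: mark the element as processed.
            $element->set_att( seen => 1 );

            # Print everything parsed so far, then drop it from memory.
            $t->flush;
            $flushed++;
        },
    },
);

$twig->parse($xml);

# One last flush prints whatever is left, including the closing root tag.
$twig->flush;
print "\n";
```

Unlike purge, which just throws the processed part away, flush writes it out first, so this suits the case where you are transforming a document rather than only extracting from it.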

Sobrique

There's no mechanism to explicitly release memory because it automatically happens as soon as possible. Resources are automatically freed as references to them are relinquished. Use properly-scoped variables, and you'll be ok.
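A minimal sketch of what "properly-scoped" could look like when processing many documents, assuming each document is handled inside its own sub. The helper name `root_name` and the inline XML strings are hypothetical; for files on disk you would pass `location => $path` to `load_xml` instead of `string`.

```perl
use strict;
use warnings;

use XML::LibXML;

# Hypothetical helper: returns the name of the document's root element.
sub root_name {
    my ($xml) = @_;

    # $doc is lexical to this sub. Only a plain string is returned, so no
    # reference to the document survives the call and the tree can be freed.
    my $doc = XML::LibXML->load_xml( string => $xml );

    return $doc->documentElement->nodeName;
}

# Stand-ins for the contents of many XML files.
my @inputs = ( '<a><b/></a>', '<c/>' );

my @names = map { root_name($_) } @inputs;
print "@names\n";
```

The point of the design is that nothing holding a node or document reference escapes the sub; storing nodes in a longer-lived array or hash would keep their document alive.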


If you actually do want to trim a document tree (though I don't see why), you can use the following to remove a node from the tree:

$node->parentNode->removeChild($node);

The node (and its children) will be freed when the last reference to it goes away. Usually, this is as soon as $node goes out of scope.
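For instance, a small sketch that removes each node once it has been processed (the element names and the `@left` variable are made up for illustration):

```perl
use strict;
use warnings;

use XML::LibXML;

my $doc = XML::LibXML->load_xml(
    string => '<root><item>1</item><item>2</item><keep/></root>',
);

for my $node ( $doc->findnodes('/root/item') ) {

    # ... process $node here ...

    # Detach the node from the tree once it is no longer needed.
    $node->parentNode->removeChild($node);
}

# Only the elements that were not removed are still children of the root.
my @left = $doc->findnodes('/root/*');
print scalar(@left), " child element(s) left\n";
```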

ikegami

You can simply call $node->unbindNode and drop all references to the node and any of its children, child attributes, etc. Then the internal xmlNode will be destroyed.
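A short sketch of that approach, with made-up element names; the inner block only exists so that the `$big` reference is dropped right after the node is unbound:

```perl
use strict;
use warnings;

use XML::LibXML;

my $doc = XML::LibXML->load_xml(
    string => '<root><big><child/></big><small/></root>',
);

{
    # Take the first (and only) <big> element.
    my ($big) = $doc->findnodes('/root/big');

    # Detach it from the document; no need to go through the parent.
    $big->unbindNode;
}    # $big goes out of scope here, so no reference to the subtree remains

my @children = $doc->findnodes('/root/*');
print scalar(@children), " child element(s) left\n";
```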

nwellnhof