
I'm trying to parse a large XML file. I read it using XML::SAX (with the Expat parser, not the pure-Perl implementation) and put every node at the second level and below into my "Node" class:

package Node;
use Moose;

has "name" =>
(
    isa    => "Str",
    reader => 'getName'
);

has "text" =>
(
    is  => "rw",
    isa => "Str"
);

has "attrs" =>
(
    is  => "rw",
    isa => "HashRef[Str]"
);

has "subNodes" =>
(
    is      => "rw",
    isa     => "ArrayRef[Node]",
    default => sub { [] }
);

# Return the single child node with the given name, or undef if there
# are zero (or more than one) matching children.
sub subNode
{
    my ($self, $name) = @_;
    my $subNodeRef = $self->subNodes;
    my @matchingSubnodes = grep { $_->getName eq $name } @$subNodeRef;

    if (scalar(@matchingSubnodes) == 1)
    {
        return $matchingSubnodes[0];
    }
    return undef;
}

1;

In the "end_element" sub, I check if this is a node I care about, and if it is, I do some further processing.

This all worked fine on my test files, but the day before yesterday I threw it at my real file, all 13 million lines of it, and it's taking forever. It's been running for over 36 hours. How do I tell if it's Moose or XML::SAX that's the bottleneck? Is Moose always this slow, or am I using it wrong?

Update: Profiling a 20,000-line subset of the data shows that Moose is the bottleneck, specifically in Class::MOP::Class::compute_all_applicable_attributes (13.9%) and other Class::MOP and Moose internals.

Paul Tomblin
  • Moose may look slow, but don't get him angry... – T.E.D. Oct 12 '10 at 13:33
  • `Class::MOP::Class::compute_all_applicable_attributes` is something that needs to be done a lot if you don't `__PACKAGE__->meta->make_immutable` your classes like I suggested in my answer. How does the profile change if you do that? – rafl Oct 12 '10 at 14:02
  • @Ether, that one was about start-up cost. Mine is about running cost, especially object creation and destruction. – Paul Tomblin Oct 12 '10 at 15:17
  • @Paul: yep you're right, my mistake. – Ether Oct 12 '10 at 16:03
  • try installing Mouse and see if s/Moose/Mouse/ helps your speed significantly. – MkV Oct 12 '10 at 17:01
  • I would love to have an example of your XML data to test this with; I do something similar in XML::Toolkit (using the bundled XML::Filter::Moose) and would be curious to profile it. – perigrin Oct 12 '10 at 18:37
  • The code you have shown is fairly short; why not rewrite it using traditional perl5 objects and see how that changes things? Write it in a very trusting way, not checking any type constraints or using any defensive programming practices. That will give you an upper bound on the speed you can hope to get out of perl's object model. – Eric Strom Oct 13 '10 at 00:36

3 Answers

22

While Moose does quite a bit of work at startup, which can make it appear a little slow, the code it generates, especially things like attribute accessors, is generally quite a bit faster than what the average perl programmer would write. So given that the runtime of your process is quite long, I doubt any overhead induced by Moose will be relevant.

However, from the code you've shown, I can't really tell what your bottleneck is, even though I firmly believe it isn't Moose. I also want to point out that doing __PACKAGE__->meta->make_immutable, to state that your class is now "finalised", allows Moose to do some further optimisations, but I still doubt this is what's causing you trouble.
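
That is, at the end of the package:

package Node;
use Moose;

# ... attribute declarations as above ...

__PACKAGE__->meta->make_immutable;

1;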

How about you take a smaller sample of your data, so your program finishes in a reasonable time, and have a look at it in a profiler such as Devel::NYTProf? That will tell you exactly where the time in your program is spent, so you can optimise specifically those parts for the greatest possible gain.
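
For example, assuming your script is called parse.pl (substitute the real name):

perl -d:NYTProf parse.pl smaller-sample.xml
nytprofhtml    # generates an HTML report under ./nytprof/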

One possibility is that the type constraints you're using slow things down quite a bit. Validating instance attributes this thoroughly on every single write access (and on class instantiation) isn't something most programmers would usually do. You could try using simpler constraints, such as ArrayRef instead of ArrayRef[Node], if you're certain enough about the validity of your data. That way, only the type of the attribute value itself will be checked, not the value of every element in that array reference.
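
For example, for the subNodes attribute:

has "subNodes" =>
(
    is      => "rw",
    isa     => "ArrayRef",   # was ArrayRef[Node]; skips the per-element Node check
    default => sub { [] }
);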

But still, profile your code. Don't guess.

rafl
  • Is NYTProf that much better than DProf? – Paul Tomblin Oct 12 '10 at 13:41
  • The `__PACKAGE__->meta->make_immutable` thing made my subset go from 11 seconds down to 6 seconds. Now the profiler says it's spending 26% of its time in `Moose::Meta::TypeConstraint::ArrayRef[Node]`, so I'm going to try your suggestion to loosen the constraint on that next. – Paul Tomblin Oct 12 '10 at 14:14
  • Loosening the constraints didn't have any appreciable difference - my 20,000 line test file went from 6.4 seconds to 6.3 seconds. – Paul Tomblin Oct 12 '10 at 15:17
  • @Paul: what does the time breakdown look like now? it should be mostly in XML processing now, not in Moose innards. – Ether Oct 12 '10 at 18:59
  • `dprofpp -I -u` shows 64% in XML::SAX::Expat::_handle_start, of which 51% is in EADHandler::start_element (where I "new" the Node class), 26% is Moose::Meta::TypeConstraint::ArrayRef[Node]. – Paul Tomblin Oct 12 '10 at 19:09
  • @Paul: how does the percentage change if you switch to using a type constraint of just `ArrayRef`? – Ether Oct 12 '10 at 19:12
  • @Ether, almost exactly the same run time, but now it's 43% in `XML::SAX::Expat::_handle_start`, 26.5% in `EADHandler::start_element` and only 16.8% in `Node::new`. – Paul Tomblin Oct 12 '10 at 19:28
  • See also: http://stackoverflow.com/questions/3162390/is-moose-really-this-slow/3162676 – Ether May 17 '11 at 22:36
7

I highly suspect that your speed problem is not in Moose so much as in memory allocation and disk swapping. Even without doing ->meta->make_immutable, based on your times for the 20K subset, your script should finish in about 2 hours ((11 * (13_000_000 / 20_000)) / 60 == ~119 min). With ->meta->make_immutable it should come down to approximately 65 minutes.

Try running your big script again and watch what your memory and swap are doing; I suspect you're giving your disk an awful thrashing.

Stevan Little
  • Munin says I was hardly swapping at all during the first 36-hour run. See http://xcski.com/munin/xcski.com/allhats2.xcski.com-cpu.html, http://xcski.com/munin/xcski.com/allhats2.xcski.com-memory.html and http://xcski.com/munin/xcski.com/allhats2.xcski.com-swap.html from the evening of 10 Oct to mid-morning on 12 Oct. You can see more swap in use during the second run (starting about 10 am on 12 Oct) after I did make_immutable, but that may be related to other things I was doing at the same time. – Paul Tomblin Oct 12 '10 at 20:10
2

I have successfully written large XML-processing apps using XML::Twig; a 745 MB file takes less than an hour to process on a reasonably sized box.
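
A rough sketch of that approach; the element name (record), the handler body (process_record), and the file name are placeholders to adapt to your data:

use XML::Twig;

my $twig = XML::Twig->new(
    twig_handlers => {
        # "record" is a placeholder: use the element you actually care about
        record => sub {
            my ($twig, $elt) = @_;
            process_record($elt);    # your per-element logic goes here
            $twig->purge;            # discard what's been parsed so far to keep memory flat
        },
    },
);
$twig->parsefile('big.xml');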

But as other users have already mentioned, you need to profile your code to figure out exactly what is causing the issue.