5

I have implemented a simple iterator in perl. I normally work with C#, and use iterators and functional programming quite frequently. So I thought it would be simple to get some basics working in perl.

Problem is, I'm getting some poor performance, I don't expect be any faster than for or foreach, but I thought someone could give me some insight in how to speed it up.

Here is the guts of my package:

package Iterator;
use strict;

#Constructor for Iterator type
sub new {
  my $type = $_[0];
  my $this = {};

  #set default values
  $this->{Array} = @_[1];
  $this->{Index} = 0;
 $this->{Sub} = sub { 
   my @array = @{$this->{Array}};
   return $#array >= $this->{Index} ? $array[$this->{Index}++] : undef;
  };

  #return self
  bless($this, $type);
  return $this;
}

#Iterates next
sub Next {
 return $_[0]->{Sub}->();
}

Allows you to do this:

my $iterator = Iterator->new(\@array);
while (defined(my $current = $iterator->Next())) {
  print $current, "\n";
}

Not flashy... yet.

Also enables some functional code like this:

my $sum = 0;
Iterator
  ->new(\@array)
  ->Where(sub { $_[0] % 2 == 0 })
  ->ForEach(sub { $sum += $_[0] });

Which would sum up all the even values of an array.

My bottleneck is the iteration code:

$this->{Sub} = sub { 
   my @array = @{$this->{Array}};
   return $#array >= $this->{Index} ? $array[$this->{Index}++] : undef;
  };

Any pointers to speed this up?

jonathanpeppers
  • 26,115
  • 21
  • 99
  • 182
  • 5
    What are the expected advantages of this solution in comparison to regular `map`, `grep`, [List::Util](http://search.cpan.org/perldoc?List::Util) and [List::MoreUtils](http://search.cpan.org/perldoc?List::MoreUtils)? – zoul Jan 28 '11 at 14:29
  • 1
    I am doing some complicated algorithms (box packing problem), and functional programming is helping greatly. Converting normal foreachs to code similar to above has already cut down 100 lines of code (33% approx) or so. Also helps with maintainability, since it forces your subs to be more pure (no global vars). – jonathanpeppers Jan 28 '11 at 14:32
  • 2
    Functional programming is great. What makes you think that `map`, `grep` and company are not functional programming? I might be wrong, but it seems to me like you are just wrapping already working solutions in OO boilerplate. – zoul Jan 28 '11 at 14:35
  • Can you reproduce my Iterator code above (that sums even numbers) with these methods? And look more elegant? Please do, I would like to see as perl is not my strongest language to develop in. – jonathanpeppers Jan 28 '11 at 14:36
  • I’ve added an answer illustrating the sum. – zoul Jan 28 '11 at 14:45
  • If cutting 100 lines is 33% of the code then it is certainly not too complicated. – tster Jan 28 '11 at 14:54
  • 2
    You're writing OO modules, to produce functional interfaces, in perl. Really? zoul has it right, map, grep et al are far better suited. Don't try to make perl look like C#, please. – ptomli Jan 28 '11 at 14:56
  • Can you post the benchmark code to see how you are measuring this? Are you using -d:DProf? – tster Jan 28 '11 at 15:14
  • use Benchmark; use Benchmark ':hireswallclock'; Then I'm just doing an array with 10000 items on the example above. – jonathanpeppers Jan 28 '11 at 16:41
  • 1
    FWIW: there are array iterators written in C available to you in [List::MoreUtils](http://search.cpan.org/perldoc?List::MoreUtils). – Ether Jan 28 '11 at 16:49
  • I think the main problem with the initial design is that in Perl Where is called grep and ForEach is called... foreach or map. As mentionned below, $sum += $_ foreach grep { ! $_ % 2 } @array; performs much better than your solution (it looks better too IMHO). As someone else said below, don't fight the language. Trying to force arrays into objects leads to the result you got: slow and cumbersome. – mirod Jan 28 '11 at 19:28

9 Answers9

5

You might find it useful to read a bit of Higher Order Perl.

Dave Cross
  • 68,119
  • 3
  • 51
  • 97
5

this line:

my @array = @{$this->{Array}};

duplicates the array into @array, and I don't think you want to. Just do $#{$this->{Array}} to find the endpoint of your array.

araqnid
  • 127,052
  • 24
  • 157
  • 134
5

A much more efficient version:

package Iterator;
use strict;

#Constructor for Iterator type
sub new {
  my $type = shift;
  my $array = shift;
  my $this = {};
  $this->{array} = $array;
  $this->{index} = 0;
  bless($this, $type);
  return $this;
}

#Iterates next
sub Next {
  my $this = shift;
  return $this->{array}->[$this->{index}++];
}
tster
  • 17,883
  • 5
  • 53
  • 72
  • Is shift faster than my ($this) = @_? shift modifies the contents of the array @_, right? Still I will try my benchmarks. – jonathanpeppers Jan 28 '11 at 15:06
  • 4
    @Jonathan.Peppers, If you are worried about the performance of shift vs. $_[0] then you are worried about the wrong thing. The function call itself is much more expensive than either of those operations. – tster Jan 28 '11 at 15:08
  • I just pointed out shift versus @_, as this is the only different between the code above. It seems your changes are actually a little slower with my benchmarks. shift modifies the size of the array, while declaring the new my variable in the Next sub create overhead as well. I agree the function is more expensive than either. – jonathanpeppers Jan 28 '11 at 16:28
  • On my computer using `my $this = shift` takes 26-27 seconds with my benchmarks. Using $_[0] both times takes 28 seconds. – tster Jan 28 '11 at 16:35
  • Just notice you removed $this->{Sub} this is required for other methods in my package. Is this your speed gain? – jonathanpeppers Jan 28 '11 at 16:50
  • @Jonathan.Pappers, That is one of the speed gains. What do you need it for? You can always fake it by generating a sub on the fly. Still, I think the other methods should be changed. – tster Jan 28 '11 at 17:16
5

A bit late to the game here, but since you are concerned about performance, one of the largest bottlenecks in iterator type code is that the fields of your hash based object need to be dereferenced on each access. One way to combat this is to use closures in which the fields are closed over variables, avoiding unneeded dereferencing.

In my module List::Gen which contains a fairly performant implementation of lazy lists, I wrote the utility function curse which makes closure based objects behave like normal Perl objects.

Here is a short example of your iterator class written with curse. In a simple benchmark summing 1000 numbers, this method is twice as fast as yours, even after fixing all of the inefficiencies noted in the other answers.

{package Iterator;
    use List::Gen 'curse';
    sub new {
        my ($class, $array) = @_;
        my $index = 0;
        curse {
            next  => sub {$$array[$index++]},
            index => sub :lvalue {$index},
        } => $class
    }
    sub reset {shift->index = 0}
} 

If you are really pushing for more speed, since the next method does not need to be passed anything, you could even write:

my $next = $iterator->can('next');

while (defined (my $x = $next->()) {...}

Which will give you a 30% to 40% speed boost over a normal method call.

You can read the source of List::Gen for more advanced usage of curse

Eric Strom
  • 39,821
  • 2
  • 80
  • 152
4

Summing even numbers is easier done using grep and List::Util:

use List::Util 'sum';
say sum grep { not $_ % 2 } (1 .. 10); // 30

It seems very likely to me that that the code suggested by your question is over-engineering. Unless you can come up with a decent example that cannot be easily solved using the traditional Perl primitives.

zoul
  • 102,279
  • 44
  • 260
  • 354
  • I think I gave you too simple of a problem on this. Although I think we are a little off topic here. We could put 10 devs in a room for an hour and come up with 12 different opinions on functional programming. I don't think this is helpful, as my question is how to optimize my Iterator class (which is very few lines, so little benefit to using an existing perl package over it). – jonathanpeppers Jan 28 '11 at 14:48
  • 7
    @Jonathan Peppers, The thing is, if you put 10 Perl developers in a room none of them would come close to what you have done. – tster Jan 28 '11 at 14:52
  • 2
    @Jonathan: I just wanted to be sure that you know the old-school Perl building blocks before writing your own, as the best way to optimize a piece of code is to get rid of it. – zoul Jan 28 '11 at 14:52
  • I think the key is 10 *Perl* developers, I suffice to say most Perl devs are behind in terms of design patterns, etc. Perl was created in 1987, correct? Surely we have made some progress with other languages since. When perl was created, such notions of OOP and functional programming were in infancy. – jonathanpeppers Jan 28 '11 at 14:59
  • 7
    @Jonathon.Peppers, working against the language is going to hurt you in terms of performance, productivity, and learning. Just because Perl was created a long time ago doesn't mean that programming in it hasn't evolved. You can continue to look down on Perl programmers if you want. But none of them would write a slow piece of crap like the thing you wrote. I don't mean to be overly harsh here. But you shouldn't criticize people you know nothing about on the same page on which you are displaying severe ignorance. – tster Jan 28 '11 at 15:03
  • *Harsh*. I was not critizing perl developers, as they exist for good reason. Merely pointing out that a group programmers keen in different languages will come up with different solutions to the same problem. Elegance and maintainability are what I value in a solution to a problem, and from my experience a lot of existing perl code I have seen has been very difficult to maintain. I am merely trying to combat this problem. None of this is related to my question. – jonathanpeppers Jan 28 '11 at 16:37
  • @Jonathan: what makes you think that the Perl language hasn't changed and incorporated new concepts since 1987? e.g. take a look at [Moose](http://search.cpan.org/dist/Moose/). And incidentally, from your last comment it *does* appear that you are criticizing Perl developers. You seem to think they exist to write inelegant and unmaintainable code on an old and unevolved platform. You still aren't understanding the points presented to you, that there are language idioms you are totally ignoring that make the problem much simpler than you make it out to be. – Ether Jan 28 '11 at 16:52
  • Is this related to my question? Can Moose be used to get better performance out of my Iterator class? – jonathanpeppers Jan 28 '11 at 17:30
  • @Jonathon.Peppers, mentioning Moose is in response to your comments `most Perl devs are behind in terms of design patterns, etc` & `When perl was created, such notions of OOP and functional programming were in infancy` – tster Jan 28 '11 at 18:07
  • @Jonathan: it's related to your claim that Perl and its developers is behind in terms of design patterns and has no notions of OOP and functional programming, and the boldness of attempting to write C# in Perl and then using that poorly-performing code to slam the entire language. Please re-read tster's comments above. – Ether Jan 28 '11 at 18:09
  • Then why are these concepts being promoted to first class citizens in Perl 6? (OOP and iterators built into the language) The Perl language is obviously behind in some way if they are just now choosing to add such features. I am merely pointing out that other development communities are more progressive than the perl community. Proof of such is the outrage shown by you guys of my simple Iterator class. – jonathanpeppers Jan 28 '11 at 19:15
  • 1
    @Jonathan.Peppers, if you don't want to consider the possibility that the people who post here trying to help you might possibly know things that you don't know, then what is the point of asking questions here? BTW, my primary language is C#, and I make heavy use of LINQ. And we aren't "outraged" by your implementation so much as your attitude at our suggestion that you use existing Perl libraries that come packaged with most installations of the language. – tster Jan 28 '11 at 19:27
  • Well, my question was how to get better performance out of a simple package I had written. Not asking which package should I use instead. You goal here seems to have been to criticize my solution of a problem rather than answer the question I asked. – jonathanpeppers Jan 28 '11 at 20:34
  • @Jonathan.Peppers, you know even the fastest implementation on here of the iterator is 550% slower on my machine than a regular Perl foreach, map or grep. – tster Jan 28 '11 at 21:07
  • Well, can you convert complicated Linq queries to using just map and grep? Seems quite difficult without an iterator. – jonathanpeppers Jan 28 '11 at 22:34
  • That would be a nice new question I guess. – zoul Jan 29 '11 at 11:26
4

Have a look at List::Util and List::MoreUtils for utilities that may help you with this. You can even use perl5i for a more modern looking syntax.

Example:

use perl5i::2;

my @nums = (0..100);
my $sumEven = @nums->grep(sub { $_ % 2 == 0 })->reduce(sub { $a+$b });

say $sumEven;
mscha
  • 6,509
  • 3
  • 24
  • 40
2

A much simpler Perl iterator:

my @array = (1, 2, 3, 4);
while (my $i = shift @array)
{
    print $i . "\n";
}
Zaid
  • 36,680
  • 16
  • 86
  • 155
tster
  • 17,883
  • 5
  • 53
  • 72
2

There is already an array iterator in CPAN, so you can look at its approach if you have not done it yet.

By the way in your code you have:

#set default values
$this->{Array} = @_[1];

I assume you want to say $_[1]. With @_[1] you are requesting an array slice of one element. At the end the result is the same but the semantics isn't. The curious thing is that I was expecting to have an array of one element if I do @_[1] or an error but tested in the debugger and you obtain the scalar (at least in perl 5.10). Perl 6 will go for this behaviour anyway and will not change sigil for accessing elements in arrays or hashes so you are coding 'advanced' Perl ;-)

Pablo Marin-Garcia
  • 4,151
  • 2
  • 32
  • 50
2

Don't unload the stored array. You're copying every element of an array from where it is pointed at by $this->{Array} to the local list @array when you do this:

my @array = @{$this->{Array}};

Also if you know that you are going to stop when you hit undef, then you don't have to even check bounds.

$this->{Sub} = sub { return $this->{Array}[++$this->{Index}]; }

Is all you need. When {Index} gets out of range, it will return undef.

In addition, you can write your expression in Perl like:

$sum += $_ foreach grep { $_ % 2 == 0 } @array;
Axeman
  • 29,660
  • 2
  • 47
  • 102