
I've searched many other Stack Overflow questions about map, but this requirement is quite specific and, try as I might, I cannot quite get the solution I am looking for, if one even exists.

This question is simply about performance.

As limited background: this code segment is used to decode incoming tokens, so it runs on every web request and its performance is therefore critical. I know "map" can be used here, so I want to use it.

Here is a trimmed-down but fully working code segment which I am currently using; it works perfectly well:

use strict;
use Data::Dumper qw (Dumper);

my $api_token = { array => [ 'user_id', 'session_id', 'expiry' ], max => 3, name => 'session' };
my $token_got = [ 9923232345812112323, 1111323232000000465, 1002323001752323232 ];

my $rt;
for (my $i=0; $i<scalar @{$api_token->{array}}; $i++) {
  $rt->{$api_token->{array}->[$i]} = $token_got->[$i];
}

$rt->{type} = $api_token->{name};
print Dumper ($rt) . "\n";
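
For reference, one possible map-based form of the same loop (an illustration of the "map" approach mentioned above, not something measured here):

# the leading ';' makes sure Perl parses the braces after map as a block, not a hash constructor
my $rt_map = { map {; $api_token->{array}[$_] => $token_got->[$_] } 0 .. $#{ $api_token->{array} } };
$rt_map->{type} = $api_token->{name};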

The question is: what is the best possible Perl code, purely in terms of performance, to replace the for loop above?


1 Answer


Looks like you only need a hash slice

my %rt;

@rt{ @{ $api_token->{array} } } = @$token_got;

Or, if the hash reference is needed

my $rt;

@{ $rt } { @{ $api_token->{array} } } = @$token_got;

or, with the newer postfix dereferencing (non-experimental since Perl 5.24) on both the array and hash slices, perhaps a bit nicer

my $rt;

$rt->@{ $api_token->{array}->@* } = @$token_got;
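
In any of these, the type key from the original loop still has to be set separately afterwards, for example

$rt->{type} = $api_token->{name};   # or $rt{type} = ... for the plain-hash version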

One can also do it using List::MoreUtils::mesh, and in one statement

use List::MoreUtils qw(mesh);

my $rt = { mesh @{ $api_token->{array} }, @$token_got };

or with pairwise from the same library

my $rt = { pairwise { $a, $b } @{ $api_token->{array} }, @$token_got };

These go through C code if the library is installed together with List::MoreUtils::XS.
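
A rough way to check which backend actually got loaded (this just inspects %INC rather than any official API):

use List::MoreUtils qw(mesh);

# if the XS backend was picked up it is loaded as a module of its own and shows up in %INC;
# otherwise the pure-Perl implementation is in use
print exists $INC{'List/MoreUtils/XS.pm'} ? "XS backend\n" : "pure-Perl backend\n";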


I benchmarked all of the above with the tiny dataset from the question (is that realistic, though?), and whatever implementation mesh/pairwise have, they are several times slower than the others.

On an old laptop with v5.26

              Rate use_pair use_mesh use_href use_post use_hash
use_pair  373639/s       --     -36%     -67%     -67%     -68%
use_mesh  580214/s      55%       --     -49%     -49%     -51%
use_href 1129422/s     202%      95%       --      -1%      -5%
use_post 1140634/s     205%      97%       1%       --      -4%
use_hash 1184835/s     217%     104%       5%       4%       --

On a server with v5.36 the margins against pairwise are around 160%-170% (with mesh being a bit faster than pairwise, similarly to the above).

Of the others, on the laptop the hash-based one is always a few percent quicker, while on a server with v5.36 they are all very close. Easy to call it a tie.
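
For anyone wanting to repeat this, here is a minimal sketch of such a comparison with the core Benchmark module (the labels mirror the table above; this is illustrative, not the exact script used here):

use strict;
use warnings;
use Benchmark qw(cmpthese);
use List::MoreUtils qw(mesh pairwise);

# data copied from the question
my $api_token = { array => [ 'user_id', 'session_id', 'expiry' ], max => 3, name => 'session' };
my $token_got = [ 9923232345812112323, 1111323232000000465, 1002323001752323232 ];

cmpthese(-2, {   # run each candidate for at least 2 CPU seconds
    use_hash => sub { my %rt; @rt{ @{ $api_token->{array} } } = @$token_got; \%rt },
    use_href => sub { my $rt; @{ $rt }{ @{ $api_token->{array} } } = @$token_got; $rt },
    use_post => sub { my $rt; $rt->@{ $api_token->{array}->@* } = @$token_got; $rt },  # perl 5.24+
    use_mesh => sub { my $rt = { mesh @{ $api_token->{array} }, @$token_got } },
    use_pair => sub { my $rt = { pairwise { $a, $b } @{ $api_token->{array} }, @$token_got } },
});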


The following is an edit by the OP, who measured a 61% speedup (see comments):

CHANGED CODE:

@rt{ @{ $api_token->{array} } } = @$token_got; ### much faster one-liner replaced the loop (credit: @zdim)
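
For illustration, the decoding routine might then look roughly like this (hypothetical sub and variable names, following the ``return \%rt;`` idea from the comments below):

sub decode_token {
    my ($api_token, $token_got) = @_;

    my %rt;
    @rt{ @{ $api_token->{array} } } = @$token_got;   # hash slice instead of the loop
    $rt{type} = $api_token->{name};

    return \%rt;   # the caller still receives a hash reference, as before
}
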
  • Now that looks cool... very easy now that I see it, but somehow it escaped me. I need to make a reference using ``my $result = \%rt;`` Is it possible to **ref** in a one-liner? However, yes, good! OK, I understand that this is now possible in a one-liner due to your postfix dereferencing. – Mark Arnold Oct 08 '22 at 06:06
  • @MarkArnold Oh right -- that wouldn't be a problem (it's quick), but one can do it directly, added. – zdim Oct 08 '22 at 06:08
  • @MarkArnold I haven't timed this, which is your ultimate purpose I guess, but I'd expect this to surely be at least somewhat faster. Looking into whether there is some other way to speed it up... – zdim Oct 08 '22 at 06:09
  • The hash reference code did not work; however, I will make some benchmarks. You are a great person. I see how it works in Perl much more clearly now, and how clever you can be. – Mark Arnold Oct 08 '22 at 06:14
  • Your original answer, I think, is perfect. I will ``return \%rt;`` in my routine and I think this works for me @zdim (I will benchmark it before accepting the answer but I think it will be faster). `pairwise` looks good... leave it with me and I will come back with benchmarks. Very great. – Mark Arnold Oct 08 '22 at 06:19
  • @MarkArnold Added another way, which runs in one statement and could well be fast. Didn't get to benchmark them yet, but will as I'm getting curious :) – zdim Oct 08 '22 at 06:20
  • @MarkArnold Ugh, I had a blooper with the second one; I just copied the first and forgot to change it to dereferencing (instead of copying directly from my tests here!). Fixed. – zdim Oct 08 '22 at 06:22
  • ` Rate _original _return ref _original 1140251/s -- -38% _return ref 1834862/s 61% -- ` On my systems, in real terms, your original code performs 61% faster. So that is a 61% saving for every HTTP request. Thank you. – Mark Arnold Oct 08 '22 at 06:42
  • @MarkArnold "_61% faster in real-terms._" -- awesome :)) Made my day :). I updated the last part about my benchmarking, corrected some description. Apparently, either hash or hashref (but not `pairwise`, which I had hopes for :(. But by all means time on your systems! – zdim Oct 08 '22 at 06:45
  • @MarkArnold I corrected the statement about using "postfix dereferencing" and added that way as well. (It behaves exactly the same as the hashref one.) I kept your edit (it asked me to approve it), with my comment. Feel free to change if you wish – zdim Oct 08 '22 at 07:42
  • @MarkArnold I still had that hashref (second one) goofed up, sorry. Fixed. Added another library function (`mesh`), which is still much slower, and a few more minor edits. – zdim Oct 08 '22 at 09:06
  • To @zdim and anyone interested... the newer postfix dereferencing code is very beautiful and surely the optimum solution. It improves performance by another 0.7% or so on average. I've actually kept the original code as it's easier to understand overall and the performance is similar. However, the question was about performance, and this **is** the best-performing version, even if only slightly ahead of the more readable code (for me at least!). – Mark Arnold Oct 10 '22 at 05:10