1

I saw an article on Perl script performance.

One of the things it mentioned is using hash references instead of accessing the hash directly each and every time.

What benefit do I gain from using a hash reference instead of accessing the hash directly?

My script reads from a list of server names that, in theory, could be as many as 100 machines if someone needed that many. So any boost I can give my script would be great.

AtomicPorkchop
  • You realize that a hash of 100 items is tiny, and any operation will be almost instantaneous on decent hardware? – Rafe Kettler Apr 17 '11 at 08:03
  • Uh, I thought that was getting big... what is considered big? Millions? Well, the thing would be a hash of hashes, and those 100 servers could have several file paths and such inside. – AtomicPorkchop Apr 17 '11 at 08:08
  • Today, probably a thousand elements. Considering that a modest laptop today has 4GB of RAM and a dual-core CPU, 100 is really nothing. – Rafe Kettler Apr 17 '11 at 08:13
  • Good to know, so I have no reason to, yet at least. – AtomicPorkchop Apr 17 '11 at 08:16
  • I recently read a somewhat related article about the [Big data buzzword](http://www.xaprb.com/blog/2011/03/31/big-data-is-how-big-exactly/). It is interesting what they consider big. – bvr Apr 17 '11 at 14:32

4 Answers

8

I don't think there's much of an advantage of $hashref->{"foo"} over $hash{"foo"}. There's probably a small advantage in passing hash refs instead of full hashes to subroutines, but that's about all I can think of. I agree with the comment by Rafe that a hash of 100 items isn't likely to give you performance problems either way. Unless you know you have a performance problem related to hash table access, don't bother with this.
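
To make the comparison concrete, here is a small sketch (with made-up data) of the two forms being discussed:

my %hash    = ( foo => 1 );   # a plain hash
my $hashref = \%hash;         # a reference to that same hash

print $hash{"foo"};           # direct access
print $hashref->{"foo"};      # access through the reference (one extra dereference)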

"It's easier to optimize a debugged program than to debug an optimized program."

Ted Hopp
2

I commented earlier that 100 is tiny for a hash. I'll qualify this with a more general statement:

Don't worry about it unless it's a problem. Is your script running slowly? If not, then don't fix what isn't broken. Premature optimization is bad for readability and can often lead to bugs. This was a bigger issue in 2004, when the article I presume you're reading was written. But today, RAM is cheap.

That said, the reason passing references performs better than passing by value is that when you pass a hash as an argument to a sub, it normally has to be copied, which uses more memory. This is only an optimization worth making if (a) you pass big hashes to functions a lot and (b) this causes you to use too much memory.
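
To illustrate the point, here is a minimal sketch (the sub names and the server data are made up for the example) of passing a hash by value versus by reference:

my %servers;
$servers{"server$_"} = { port => 22 } for 1 .. 100;

sub by_value {
    my %copy = @_;            # the hash arrives flattened as a list and is rebuilt as a copy
    return scalar keys %copy;
}

sub by_reference {
    my ($href) = @_;          # only a single scalar (the reference) is passed
    return scalar keys %$href;
}

by_value(%servers);           # flattens 100 key/value pairs into the argument list
by_reference(\%servers);      # passes just one reference; the hash itself is not copied

Whether that copy matters at all depends on how big the hash is and how often the call happens.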

Rafe Kettler
0

Well, as Rafe already mentioned, a hash with 100 elements is not really big. One could argue that using a hash reference doesn't give you much advantage over using a normal hash; however, it also doesn't give you any particular disadvantage (at least I never ran into one). So it's not as bad a premature optimization as one might think.

If your script runs too slowly, you might want to use a profiler to find out where you are losing the time.
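
For example (the answer doesn't name a tool, but Devel::NYTProf is a commonly used choice; myscript.pl below stands in for your script):

>perl -d:NYTProf myscript.pl
>nytprofhtml

The first command writes profiling data to nytprof.out; the second turns that file into an HTML report showing where the time went.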

ChrisWue
0

Sorry, but that article was wrong if that's what it said. There's no way that dereferencing a reference then accessing a hash element can take less time than just accessing a hash element.

>perl -MO=Concise,-exec -e"$x = $h{x}"
...
3  <#> gv[*h] s
4  <1> rv2hv sKR/1
5  <$> const[PV "x"] s/BARE
6  <2> helem sK/2
...

>perl -MO=Concise,-exec -e"$x = $h->{x}"
...
3  <#> gv[*h] s
4  <1> rv2sv sKM/DREFHV,1    <---
5  <1> rv2hv[t3] sKR/1
6  <$> const[PV "x"] s/BARE
7  <2> helem sK/2
...

That said, the amount of extra time the deref takes should be so minute as to not matter.
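
If you want to measure it yourself, here is a quick sketch using the core Benchmark module (the numbers will vary by machine, but the difference should be tiny):

use Benchmark qw(cmpthese);

my %h    = ( x => 1 );
my $href = \%h;

cmpthese(-1, {
    direct => sub { my $v = $h{x} },
    deref  => sub { my $v = $href->{x} },
});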

ikegami