What is the easiest way to get a key with the highest value from a hash in Perl?
8 Answers
While the solution with sort:
(sort {$hash{$b} <=> $hash{$a}} keys %hash)[0]
found in some of the other answers is quite elegant, it doesn't perform as well as it looks. First, the sort transforms an O(n) search operation into an O(n log n) one. Second, the sort solution performs O(n log n) hash look-ups. Hash look-ups are very fast for certain operations, but when working with the entire hash, look-ups will be slower than using each, keys, or values to iterate through the data structure. This is because the iterators do not need to calculate the hashes of keys, nor do they need to repeatedly walk through bins to find the values. And the overhead is not constant, but grows as the hashes get larger.
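To make the contrast concrete, here is a small self-contained sketch (with an illustrative three-key hash, not taken from the answer) of the two access patterns described above; both compute the same sum, but the second never re-hashes a key:

```perl
use strict;
use warnings;

my %hash = (a => 1, b => 2, c => 3);

# Repeated look-ups: each $hash{$_} re-hashes the key and walks its bucket.
my $sum_lookup = 0;
$sum_lookup += $hash{$_} for keys %hash;

# Iterator: each() walks the internal structure directly, no re-hashing.
my $sum_each = 0;
keys %hash;    # reset the each iterator
while (my (undef, $val) = each %hash) {
    $sum_each += $val;
}

print "$sum_lookup $sum_each\n";    # prints '6 6'
```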
Here are a few faster solutions:
use strict;
use warnings;

my %hash = (
    small   => 1,
    medium  => 5,
    largest => 10,
    large   => 8,
    tiny    => 0.1,
);
Here is a solution using the each iterator (an O(1) operation done n times):
sub largest_value (\%) {
    my $hash = shift;
    keys %$hash;    # reset the each iterator

    my ($large_key, $large_val) = each %$hash;

    while (my ($key, $val) = each %$hash) {
        if ($val > $large_val) {
            $large_val = $val;
            $large_key = $key;
        }
    }
    $large_key
}
print largest_value %hash; # prints 'largest'
Or a faster version that trades memory for speed (it makes a copy of the hash):
sub largest_value_mem (\%) {
    my $hash = shift;
    my ($key, @keys) = keys %$hash;
    my ($big, @vals) = values %$hash;

    for (0 .. $#keys) {
        if ($vals[$_] > $big) {
            $big = $vals[$_];
            $key = $keys[$_];
        }
    }
    $key
}
print largest_value_mem %hash; # prints 'largest'
Here is the performance with various hash sizes:
10 keys:
                       Rate  largest_with_sort  largest_value  largest_value_mem
largest_with_sort  111565/s                 --            -8%               -13%
largest_value      121743/s                 9%             --                -5%
largest_value_mem  127783/s                15%             5%                 --

50 keys:
                       Rate  largest_with_sort  largest_value  largest_value_mem
largest_with_sort   24912/s                 --           -37%               -40%
largest_value       39361/s                58%             --                -6%
largest_value_mem   41810/s                68%             6%                 --

100 keys:
                       Rate  largest_with_sort  largest_value  largest_value_mem
largest_with_sort    9894/s                 --           -50%               -56%
largest_value       19680/s                99%             --               -12%
largest_value_mem   22371/s               126%            14%                 --

1,000 keys:
                       Rate  largest_with_sort  largest_value  largest_value_mem
largest_with_sort     668/s                 --           -69%               -71%
largest_value        2183/s               227%             --                -7%
largest_value_mem    2341/s               250%             7%                 --

10,000 keys:
                       Rate  largest_with_sort  largest_value  largest_value_mem
largest_with_sort    46.5/s                 --           -79%               -81%
largest_value         216/s               365%             --               -11%
largest_value_mem     242/s               421%            12%                 --
As you can see, if memory isn't much of an issue, the version with internal arrays is fastest, closely followed by the each iterator, with sort a distant third.
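Numbers like these can be produced with the Benchmark module's cmpthese. The harness below is a sketch only: the original post does not show its benchmark code, so the hash contents and iteration count here are assumptions.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed test data -- the original benchmark's hash contents are not shown.
my %hash = map { "key$_" => rand() } 1 .. 100;

sub largest_with_sort {
    return (sort { $hash{$b} <=> $hash{$a} } keys %hash)[0];
}

sub largest_value {
    keys %hash;    # reset the each iterator
    my ($large_key, $large_val) = each %hash;
    while (my ($key, $val) = each %hash) {
        ($large_key, $large_val) = ($key, $val) if $val > $large_val;
    }
    return $large_key;
}

# Run each sub 1000 times and print a comparison table like the one above.
cmpthese(1_000, {
    largest_with_sort => \&largest_with_sort,
    largest_value     => \&largest_value,
});
```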

- Thorough answer. One comment though: the amortized complexity of a hash lookup is O(1), not O(log n). – jkasnicki May 22 '10 at 15:10
- Comparing real-world speeds of hash look-up to array look-up still shows a nonlinear relationship: with 10 elements, an array is 50% faster than a hash; with 10,000 elements it is 100% faster; with 1,000,000 elements it is 210% faster. – Eric Strom May 22 '10 at 17:16
Not sure why everyone is doing this by hand...
use List::Util qw( reduce );
my $max_val_key = reduce { $hash{$a} > $hash{$b} ? $a : $b } keys %hash;
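This one-liner can be exercised with sample data along the lines of the top answer's hash (a minimal self-contained sketch):

```perl
use strict;
use warnings;
use List::Util qw( reduce );

my %hash = ( small => 1, medium => 5, largest => 10, large => 8 );

# reduce keeps whichever of each pair of keys maps to the larger value.
my $max_val_key = reduce { $hash{$a} > $hash{$b} ? $a : $b } keys %hash;

print "$max_val_key\n";    # prints 'largest'
```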

The following is more space-efficient and will run in O(n) instead of O(n log n) as compared to the other answers which sort the hash. It assumes values are integers greater than 0 and the hash is not empty, but should be easily extended for your case.
my $key_for_max_value;
my $max_value = -1;

while (my ($key, $value) = each %hash) {
    if ($value > $max_value) {
        $max_value         = $value;
        $key_for_max_value = $key;
    }
}
$key_for_max_value will now be the key corresponding to the highest value.

- There is an assumption in your code that the values of the hash aren't all negative numbers less than -1. You should just make $max_value the value of the first thing seen or something. – May 22 '10 at 04:18
- Nice to know _someone_ out there still appreciates efficiency over short-handedness. Good explanation, too. – amphetamachine May 22 '10 at 04:42
- @Kinopiko: And that can be done with something like `my $max_value = undef;` and later, change the `if` to `if (! defined $max_value || $value > $max_value)`. – Robert P May 22 '10 at 07:33
- @amphetamachine for reasonably sized data sets, this solution is very likely to be slower than the one using `sort`. – hobbs May 22 '10 at 08:07
- @Alnitak by having a smaller constant factor. Let f(n) = n * log(n) / log(10) and g(n) = n * 1000000. f(n) = O(n log n) and g(n) = O(n). Now let n = 10: f(10) is ten, and g(10) is ten million. Furthermore, f(n) will be less than g(n) as long as n is less than ten to the millionth power, despite the fact that f(n) dominates g(n). – hobbs May 22 '10 at 10:31
- (It should be noted that since log n is a fairly slow-growing function, O(n) and O(n log n) are "not very different", meaning it doesn't take a very large constant-factor advantage for an O(n) function to beat an O(n log n) one at small n.) – hobbs May 22 '10 at 10:44
- @hobbs I don't think this solution will ever be slower than one involving sorting. Your argument is valid in general (constant factors can make O(n log n) preferable for small n), but in this case the constant factor on the O(n) solution is small: we look at each element exactly once and do a very small amount of computation with it. Finally, the real win of this solution is the space savings: sorting takes O(n) space, while this solution takes O(1) space. See @Eric Strom's answer for another discussion and performance numbers. – jkasnicki May 22 '10 at 15:01
- @jkasnicki - well put. Sure, there _might_ be particular cases where `O(n log n)` is smaller than `O(n)` (for small values of `n`), but this isn't one of them! – Alnitak May 22 '10 at 17:04
- @jkasnicki: Put a short-circuit operator to define `$max_value` on the first pass: `$max_value ||= $value;`. That way you can get rid of the `-1` assumption. – Zaid May 23 '10 at 10:14
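Folding the comments' suggestions back into the answer, a version with no assumptions about the values (a self-contained sketch with illustrative all-negative data) looks like:

```perl
use strict;
use warnings;

# All-negative values, the case the comments above warn about.
my %hash = ( a => -5, b => -2, c => -10 );

my ($key_for_max_value, $max_value);

while (my ($key, $value) = each %hash) {
    # The defined-guard replaces the -1 sentinel, per Robert P's comment.
    if (!defined $max_value || $value > $max_value) {
        $max_value         = $value;
        $key_for_max_value = $key;
    }
}

print "$key_for_max_value\n";    # prints 'b'
```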
The keys sorted by value, from lowest to highest:
sort { $hash{$a} <=> $hash{$b} } keys %hash
The keys sorted by value, from highest to lowest:
reverse sort { $hash{$a} <=> $hash{$b} } keys %hash
And the first element:
(reverse sort { $hash{$a} <=> $hash{$b} } keys %hash)[0]
Replace the spaceship with cmp
to taste.
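For example, if the values were strings rather than numbers (an illustrative hash, not from the original), the same pattern with cmp would be:

```perl
use strict;
use warnings;

my %hash = ( a => 'apple', b => 'pear', c => 'banana' );

# Keys sorted by string value, last-in-dictionary-order first:
my @by_value = reverse sort { $hash{$a} cmp $hash{$b} } keys %hash;

print "$by_value[0]\n";    # prints 'b' (its value 'pear' sorts last)
```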

- Because he wants the key, not the value. The value is what to sort by; the key is what to return. Unless I am misreading the question. – jrockway May 22 '10 at 04:05
keys %hash;    # reset the each iterator
my ($max_key, $max_val) = each %hash or die "hash is empty";

while (my ($key, $val) = each %hash) {
    ($max_key, $max_val) = ($key, $val) if $val > $max_val;
}

my $highest_val = (sort { $hash{$b} <=> $hash{$a} } keys %hash)[0];

- That returns the key that's the highest value. I assume he wants the key that maps to the highest value. Otherwise, the question is too simple to be asking :) (And in that case, why not just "reverse sort keys %hash"?) – jrockway May 22 '10 at 03:58
- It depends what you mean by "value" here. Usually a hash is thought of as key/value pairs, so I'd assume the same thing as jrockway. But it could also mean what amphetamachine said. The questioner should clarify. – May 22 '10 at 04:01
- @jrockway - `And in that case, why not just "reverse sort keys %hash"?` - Because that's a lexical sort, and `sort {$b <=> $a}` hits two birds with one stone in that it's both a numeric sort AND it's reversed. – amphetamachine May 22 '10 at 04:28
my $highest_val = (sort { $hash{$b} <=> $hash{$a} } keys %hash)[0];
is likely to be what you want.
If you have a very large hash, you might want to use something like a Schwartzian transform:
my @array = map { [ $hash{$_}, $_ ] } keys %hash;
my $key_with_highest_value = (sort { $b->[0] <=> $a->[0] } @array)[0]->[1];
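As a quick check of the transform with sample data in the spirit of the question (a self-contained sketch; note the descending sort so index 0 holds the largest value):

```perl
use strict;
use warnings;

my %hash = ( small => 1, medium => 5, largest => 10, large => 8 );

# Build [value, key] pairs once, so the sort does no hash look-ups at all.
my @array = map { [ $hash{$_}, $_ ] } keys %hash;
my $key_with_highest_value = (sort { $b->[0] <=> $a->[0] } @array)[0]->[1];

print "$key_with_highest_value\n";    # prints 'largest'
```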

- This is more typing, but is O(n) instead of O(n log n), which is generally a good thing. If your list is big. – jrockway May 22 '10 at 04:06
- The Schwartzian transform here only serves to reduce the number of hash-table look-ups, and does **not** change the complexity of the search - it's still O(n log n). The iterative approach from @jkasnicki is superior. – Alnitak May 22 '10 at 09:03
If performance isn't an issue, I'd suggest a more literate programming solution.
use List::Util qw(max);

my $max_value = max values %hash;
my ($max_key) = grep { $hash{$_} == $max_value } keys %hash;
