63

I have a Perl script that is counting the number of occurrences of various strings in a text file. I want to be able to check if a certain string is not yet a key in the hash. Is there a better way of doing this altogether?

Here is what I am doing:

foreach $line (@lines){
    if(($line =~ m|my regex|) )
    {
        $string = $1;
        if ($string is not a key in %strings) # "strings" is an associative array
        {
            $strings{$string} = 1;
        }
        else
        {
            $n = ($strings{$string});
            $strings{$string} = $n +1;
        }
    }
}
Peter Mortensen
  • 5
    The question is, why are you even bothering with that? If it doesn't exist then $n will be undef. Undef's numeric value is 0, so $n+1=1. There's no need to check if it exists in the hash to begin with. – Nathan Fellman Jun 17 '09 at 08:00

5 Answers

127

To check whether a key exists in a hash, use exists:

if (exists $strings{$string}) {
    ...
} else {
    ...
}
cpjolicoeur
  • 26
    Be aware that Perl will autovivify any intermediate keys that do not exist in a multidimensional hash in order to "check" whether the key you're looking for in the last hash exists. It's not a problem with a simple hash like this example, but consider: my %test = (); print "bar" if exists $test{'foo'}{'bar'}; Perl has just autovivified the foo key in order to look for bar, so the following now prints: print "foo exists now and you might not have expected that!" if exists $test{'foo'}; – Drew Sep 15 '15 at 13:52
  • @Drew - **Thanks for the reminder!** I'd glossed over an earlier spot in my code where I'd done an "if (my $value = $test{$foo}{$bar})" and was completely stumped as to why a later "exists ($test{$foo})" returned true. – Randall Jan 20 '20 at 15:40
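
To make the autovivification point from the comments concrete, here is a minimal sketch. The hash names %test and %safe and the keys foo and bar are only illustrative. It compares a one-shot nested exists with a check that guards each level:

```perl
use strict;
use warnings;

my %test;

# Checking a nested key in one shot autovivifies the intermediate level.
my $found       = exists $test{foo}{bar} ? 1 : 0;
my $foo_created = exists $test{foo}      ? 1 : 0;
print "bar found: $found, foo exists afterwards: $foo_created\n";

# Guarding each level with && short-circuits before the nested lookup,
# so nothing is created in %safe.
my %safe;
my $safe_found   = (exists $safe{foo} && exists $safe{foo}{bar}) ? 1 : 0;
my $safe_created = exists $safe{foo} ? 1 : 0;
print "bar found: $safe_found, foo exists afterwards: $safe_created\n";
```

For the single-level hash in the question none of this applies, and a plain exists $strings{$string} is enough.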
10

I would counsel against using if ($hash{$key}) since it will not do what you expect if the key exists but its value is zero or empty.
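
As a rough illustration of the difference, the sketch below compares a plain truth test, defined, and exists on the same keys. The hash contents are made up for the example:

```perl
use strict;
use warnings;

# Illustrative data: three false values that are nonetheless present as keys.
my %hash = (zero => 0, empty => '', unset => undef, one => 1);

for my $key (qw(zero empty unset one missing)) {
    my $true    = $hash{$key}         ? 1 : 0;   # false for 0, '' and undef
    my $defined = defined $hash{$key} ? 1 : 0;   # false only for undef or a missing key
    my $exists  = exists $hash{$key}  ? 1 : 0;   # false only for a missing key
    print "$key: true=$true defined=$defined exists=$exists\n";
}
```

Only exists tells the keys zero, empty and unset apart from missing, which is why it is the right test for "is this a key yet?".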

RET
  • 1
    Those certain circumstances are only for nested keys. For this problem, exists is the answer. Don't use exists for nested keys in one shot. – brian d foy Jun 16 '09 at 22:20
  • 1
    Downvote is still a bit harsh though - the warning is not invalidated by the simplicity of the script in this question. The more important point is the issue of using if($hash{$key}) with neither defined nor exists: the "zero but true" problem. – RET Jun 16 '09 at 23:52
  • The "zero but true" thing deserves an upvote. But what you said about autovivification is simply wrong and deserves a downvote. – innaM Jun 17 '09 at 07:59
  • The warning here is true in a way - the autovivification might happen, though not with the given example - but the proposed answer with defined() has exactly the same problem, so this is no solution at all. – ijw Jun 17 '09 at 12:18
  • Indeed - fair comment. It was too early in the morning when I wrote that answer, so I've rewritten it now I'm sufficiently caffeinated. – RET Jun 18 '09 at 06:59
  • Upvoted now. It is a fair warning now the autovivification bit has been removed. – Leonardo Herrera Jun 22 '09 at 16:23
9

Well, your whole code can be reduced to:

foreach $line (@lines){
        $strings{$1}++ if $line =~ m|my regex|;
}

If the key is not there yet, the ++ operator treats the missing value as 0 and increments it to 1. If it is already there, it is simply incremented.
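
Here is a self-contained sketch of the same idiom. The input lines and the pattern are hypothetical, standing in for the question's "my regex" placeholder:

```perl
use strict;
use warnings;

# Hypothetical input and pattern, chosen only for illustration.
my @lines = ('error: disk', 'info: started', 'error: disk', 'error: net');
my %strings;

for my $line (@lines) {
    # $1 is only meaningful after a successful match, hence the postfix if.
    $strings{$1}++ if $line =~ m|^error: (\w+)|;
}

# Sort the keys so the report order is stable.
for my $string (sort keys %strings) {
    print "$string: $strings{$string}\n";
}
```

No warning is raised under use warnings, because ++ on a not-yet-existing hash element is treated as starting from 0.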

6

I guess that this code should answer your question:

use strict;
use warnings;

my @keys = qw/one two three two/;
my %hash;
for my $key (@keys)
{
    $hash{$key}++;
}

for my $key (keys %hash)
{
   print "$key: ", $hash{$key}, "\n";
}

Output (hash key order is not guaranteed, so the lines may appear in a different order):

three: 1
one: 1
two: 2

The iteration can be simplified to:

$hash{$_}++ for (@keys);

(See $_ in perlvar.) And you can even write something like this:

$hash{$_}++ or print "Found new value: $_.\n" for (@keys);

Which reports each key the first time it’s found.

zoul
  • Yeah, the thing is I won't know ahead of time what the keys will be. –  Jun 16 '09 at 20:12
  • 1
    Yes, you don't need to check for presence of the key for this purpose. You can simply say $strings{$1}++ . If the key is not there, it will be added with undef as value, which ++ will interpret as 0 for you. –  Jun 16 '09 at 20:29
  • Sure. The point is that you can replace the whole body of your cycle (under the if) with $strings{$1}++. – zoul Jun 16 '09 at 20:30
-1

You can just go with:

if(!$strings{$string}) ....
Peter Mortensen
AJ.
  • 8
    This only works if all of the keys have values that are not false. In general, that's a bad assumption. Use exists(), which is especially designed just for this. – brian d foy Jun 16 '09 at 22:21
  • 2
    @brian d foy - Ah ha. I knew I shouldn't have answered :-) – AJ. Jun 17 '09 at 01:28
  • Furthermore, your construct *creates* an entry in the hash. For the question at hand this is probably irrelevant, but for other cases it might be relevant. Using exists() also circumvents this problem and does not create an entry in the hash. – user55400 Jun 17 '09 at 09:54
  • 1
    @blixor: No, it doesn't. Try perl -le 'print "ok" if !$a{hello}; print keys %a' – Hynek -Pichi- Vychodil Jun 17 '09 at 22:25
  • 1
    Only in nested hashes do you have the problem that intermediate accesses create entries. So `$a{$x}{$y}` will create `$a{$x}`, regardless of whether you use `exists` or any other approach. – rustyx May 23 '16 at 14:57