
I am using the following Perl script to search multiple files, and print out the entire text line when a particular number in that line is matched:

#!/perl/bin/perl

use strict;
use warnings;

my @files = <c:/perl64/myfiles/*>;

foreach my $file (@files) {
  open my $file_h, '<', $file
   or die "Can't open $file: $!";

  while (<$file_h>) {
print "$file $_" if /\b1203\b/; 
print "$file $_" if /\b1204\b/;
print "$file $_" if /\b1207\b/;
  } }

The script works well: it prints every line on which one of the numbers appears, in whichever file it occurs. What I would also like is to identify when a number has no match at all in any of the files.

We are parsing multiple files with thousands of lines each, so finding the deltas (i.e. numbers with NO MATCH in any of the files) is very time-consuming.

To clarify, I still need to print every match in every file, not just the first one, and it is critical that the matching line itself is printed.

Ultimately, I just need the output to also show which numbers were not matched anywhere in any of the files.

Source edited for readability

#!/perl/bin/perl

use strict;
use warnings;

my @files = <c:/perl64/myfiles/*>;

foreach my $file ( @files ) {

    open my $file_h, '<', $file or die "Can't open $file: $!";

    while ( <$file_h> ) {
        
        print "$file $_" if /\b1203\b/;
        print "$file $_" if /\b1204\b/;
        print "$file $_" if /\b1207\b/;
    }
}
normbeef
    In your [previous question](https://stackoverflow.com/questions/47471848/check-whether-a-field-from-a-line-of-text-line-matches-a-value) **Dave Cross** commented *"I have edited your code to add indentation. You're welcome, but please do it yourself in the future"* Please take proper note of that advice. It is very shabby behaviour to present ugly and unreadable code when you are asking for free help to fix it. – Borodin Dec 04 '17 at 14:44

1 Answer


I would like to be able to identify when there is no match at all for that number in any of the files

Since you are going over several files, you need to remember that you saw a certain number once. A hash for counting is very useful here, and a common approach to this kind of problem.

It also makes sense to move the numbers (or patterns) into an array at the same time. That way you only need to list them once in your code, and the overall code becomes less cluttered.

my @numbers = (1203, 1204, 1207);
my %seen;
foreach my $file (@files) {
    # ...
    while (<$file_h>) {
        foreach my $number (@numbers) {
            if (/\b$number\b/) {
                print "$file $_";
                $seen{$number} = 1; # we don't care how many, just that we saw it
            }
        }
    }
}

# At this point, %seen contains a key for every number that was seen at least once.
# If a number was not seen, it will not have a key.

# output numbers that were not seen
foreach my $number (@numbers) {
    print "no match: $number\n" unless exists $seen{$number};
}
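
For completeness, here is a minimal sketch that puts the pieces above together into one runnable script. The helper name `find_unmatched` is made up for this example, and the directory is the one from the question. I have also added `\Q...\E` around the interpolated value, which is not needed for plain numbers but keeps the pattern literal if you later search for something containing regex metacharacters.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (name chosen for this example): prints every matching
# line prefixed with its file name, and returns the numbers that never
# matched in any of the files.
sub find_unmatched {
    my ($files, $numbers) = @_;
    my %seen;

    foreach my $file (@$files) {
        open my $file_h, '<', $file
            or die "Can't open $file: $!";

        while (my $line = <$file_h>) {
            foreach my $number (@$numbers) {
                if ($line =~ /\b\Q$number\E\b/) {
                    print "$file $line";
                    $seen{$number} = 1;   # remember that it matched somewhere
                }
            }
        }
        close $file_h;
    }

    # keep only the numbers that have no key in %seen
    return grep { !exists $seen{$_} } @$numbers;
}

my @files   = glob('c:/perl64/myfiles/*');   # directory from the question
my @numbers = (1203, 1204, 1207);

print "no match: $_\n" for find_unmatched(\@files, \@numbers);
```

The matching lines are printed as the files are read, and the "no match" lines come out at the very end, once all files have been seen.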
simbabque
    *"we don't care how many, just that we saw it"* But `++$seen{$number}` provides potentially useful extra information, for debugging perhaps. The code is (two characters!) shorter, and as long as the hash element is created the rest of your code will still work. – Borodin Dec 04 '17 at 16:41
    @Borodin indeed. But it's also insignificantly slower when we increment. ;) – simbabque Dec 04 '17 at 16:42
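
To illustrate the counting variant discussed in the comments, here is a rough sketch that increments a counter per number instead of setting a flag. The sub name `count_matches` and the report format are assumptions made for this example. Note that it counts matching lines, not individual occurrences within a line, and that the counters are pre-seeded with 0, so the final check tests for a zero count rather than using `exists`.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (name chosen for this example): prints every matching
# line and returns a hash reference mapping each number to the count of
# lines it matched on, across all files.
sub count_matches {
    my ($files, $numbers) = @_;
    my %count = map { $_ => 0 } @$numbers;   # every number starts at zero

    foreach my $file (@$files) {
        open my $file_h, '<', $file
            or die "Can't open $file: $!";

        while (my $line = <$file_h>) {
            foreach my $number (@$numbers) {
                next unless $line =~ /\b\Q$number\E\b/;
                print "$file $line";
                ++$count{$number};
            }
        }
        close $file_h;
    }

    return \%count;
}

my @files = glob('c:/perl64/myfiles/*');   # directory from the question
my $count = count_matches(\@files, [1203, 1204, 1207]);

# summary: one line per number, flagging those that never matched
foreach my $number (sort { $a <=> $b } keys %$count) {
    printf "%s: %d matching line(s)%s\n",
        $number, $count->{$number},
        $count->{$number} ? '' : ' (no match)';
}
```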