3

I'm comparing two text files line by line: ref.txt (the reference) and log.txt. Either file may contain an arbitrary number of blank lines, which I'd like to ignore. How can I accomplish this?

ref.txt

one

two


three



end

log.txt

one
two
three
end

For these two files the output should contain no incorrect log lines; in other words, log.txt matches ref.txt.

What I'd like to accomplish, in pseudocode:

while (traversing both files at same time) {
    if ($l is blank line || $r is blank line) {
        if ($l is blank line)
            skip to next non-blank line
        if ($r is blank line)
            skip to next non-blank line
    }
    #continue with line by line comparison...
}

My current code:

use strict;
use warnings;

my $logPath    = $ARGV[0];
my $refLogPath = $ARGV[1];
my $r;                                 #ref log line
my $l;                                 #log line
my $boolRef = 1;                       #true==1 until a mismatch is found

open INLOG, $logPath    or die $!;
open INREF, $refLogPath or die $!;

while (defined($l = <INLOG>) and defined($r = <INREF>)) {
    #code for skipping blank lines?
    if ($l ne $r) {
        print $l, "\n";                #Output incorrect line in log file
        $boolRef = 0;                  #false==0
    }
}
daxim
jerryh91

6 Answers

8

If you are on a Linux platform, use:

diff -B ref.txt log.txt

The -B option causes changes that just insert or delete blank lines to be ignored.

JRFerguson
2

You can skip blank lines by matching each line against this regular expression:

next if $line =~ /^\s*$/;

This matches any line that is empty or consists only of whitespace (spaces, tabs, or the trailing newline), which is what makes up a blank line.
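As a small illustration of that pattern, here is a minimal sketch that filters blank lines out of a list. The sample data in @lines is made up for the demonstration and does not come from the question.

```perl
use strict;
use warnings;

# Hypothetical sample input: real lines mixed with empty and
# whitespace-only lines.
my @lines = ("one\n", "\n", "  \t\n", "two\n", "", "three\n");

# Keep only the lines that do NOT match the blank-line pattern.
my @kept = grep { !/^\s*$/ } @lines;

print @kept;
```

Inside a read loop the same test is written as a `next if ...` statement, as shown in this answer.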

squiguy
  • It seems more understandable (to me, at least) to write that as `next unless $line =~ /\S/`. – Dave Cross Jul 20 '12 at 10:34
  • @DaveCross I suppose that your version ensures that there is something on the line read. There is always more than one way to do it in Perl! – squiguy Jul 20 '12 at 13:05
  • Yeah. I switched to my approach after dealing with one too many files where the "empty" lines actually contained spaces and/or tabs. – Dave Cross Jul 20 '12 at 13:11
2

This way seems the most "perl-like" to me. No fancy loops or anything, just slurp the files and grep out the blank lines.

use warnings;

$f1 = "path/file/1";
$f2 = "path/file/2";

open(IN1, "<$f1") or die "Cannot open file: $f1 ($!)\n";
open(IN2, "<$f2") or die "Cannot open file: $f2 ($!)\n";

chomp(@lines1 = <IN1>); # slurp the files
chomp(@lines2 = <IN2>);

@l1 = grep(!/^\s*$/,@lines1); # get the files without empty lines
@l2 = grep(!/^\s*$/,@lines2);

# something like this to print the non-matching lines
for $i (0 .. $#l1) {
   print "[$f1 $i]: $l1[$i]\n[$f2 $i]: $l2[$i]\n" if($l1[$i] ne $l2[$i]);
}
kevlar1818
  • Perhaps rewrite those greps as `@l1 = grep(/\S/, @lines1)` etc. – Dave Cross Jul 20 '12 at 10:35
  • How do I retrieve individual lines from the @l1 and @l2? – jerryh91 Jul 20 '12 at 15:47
  • This isn't perfect, as one mismatched line will make all the ones below it be mismatches too. I thought I'd share this as an exploration of perl's file slurping/grepping ability. Definitely just use `diff -B` if you can. – kevlar1818 Jul 20 '12 at 16:21
0

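You can loop to find the next non-blank line in each file on every iteration:

while(1) {
    # advance each handle to its next non-blank line
    while(defined($l = <INLOG>) and $l !~ /\S/) {}
    while(defined($r = <INREF>) and $r !~ /\S/) {}

    # stop as soon as either file is exhausted
    if(!defined($l) or !defined($r)) {
        last;
    }

    if($l ne $r) {
        print $l, "\n";
        $boolRef = 0;
    }
}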
Ry-
0
man diff

diff -B ref.txt log.txt
toolic
0
# line skipping code
while (defined($l=<INLOG>) && $l =~ /^$/ ) {}  # no-op loop exits with $l that has length

while (defined($r=<INREF>) && $r =~ /^$/ ) {}  # no-op loop exits with $r that has length
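Putting the skipping loops together with the comparison, a complete script could look roughly like the sketch below. The helper names next_nonblank and mismatched_lines are hypothetical, and the two files are simulated with in-memory filehandles holding the ref.txt and log.txt contents from the question, so the example runs without any files on disk. Like the question's loop, it stops as soon as either input is exhausted.

```perl
use strict;
use warnings;

# Hypothetical helper: return the next line that contains a
# non-whitespace character (chomped), or undef at end of input.
sub next_nonblank {
    my ($fh) = @_;
    while (defined(my $line = <$fh>)) {
        chomp $line;
        return $line if $line =~ /\S/;
    }
    return;
}

# Hypothetical helper: walk both handles in step, ignoring blank
# lines, and collect the log lines that differ from the reference.
sub mismatched_lines {
    my ($log_fh, $ref_fh) = @_;
    my @bad;
    while (1) {
        my $l = next_nonblank($log_fh);
        my $r = next_nonblank($ref_fh);
        last if !defined $l or !defined $r;   # either input exhausted
        push @bad, $l if $l ne $r;
    }
    return @bad;
}

# In-memory stand-ins for ref.txt and log.txt from the question.
my $ref_text = "one\n\ntwo\n\n\nthree\n\n\n\nend\n";
my $log_text = "one\ntwo\nthree\nend\n";

open my $ref_fh, '<', \$ref_text or die "Cannot open reference data: $!";
open my $log_fh, '<', \$log_text or die "Cannot open log data: $!";

my @bad = mismatched_lines($log_fh, $ref_fh);
print "incorrect log line: $_\n" for @bad;
print "log matches reference\n" unless @bad;
```

With real files, the two open calls would take the paths from @ARGV instead of references to strings. Note that this sketch shares the limitation mentioned in the comments above: once one line is missing from either file, every following pair is compared out of step.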
marklark