I have two files whose contents I would like to compare numerically.

Text1:

C_A C_A 0.0000 0.0000 0 0 50 47 100 390
C_A/I0/I0 INV 0.0200 0.2210 0 0 20 200 30 100
C_A/I0/I2 INV 1.0400 0.2210 0 0 530 200 250 261

Text2:

C_A C_A 0.0000 0 0 0 50 47 100 390
C_A/I0/I0 INV 0.0200 0.2213 0 0 20 200 30 100
C_A/I0/I2 INV 1.04 0.2210 0 0 530 200.00 250 261

Desired Output:

C_A/I0/I0 INV has mismatch property.

I have tried the code below, but I get "use of uninitialized value" errors. Please advise. Thanks in advance for your help.

Edited CODE:

use strict;
use warnings;
my %ref_data;

open my $fh, '<', 'Text1' or die $!;
while (<$fh>) {
    chomp;
    my ($occurname, $tempname, @data) = split;
    $ref_data{$occurname} = \@data;
    }

open $fh, '<', 'Text2' or die $!;
while (<$fh>) {
    chomp;
    my ($occurname, $tempname, @data1) = split;
    my $data = $ref_data{$occurname};
    print "$occurname $tempname has mismatch property\n" if 
        grep { $data1[$_] != $data->[$_] } 0 .. $#data1;
}
annel
  • `!=` is the numeric inequality operator. You want `ne` for strings. If you want numeric comparison, you need to use the numbers as numbers, not as a string. – TLP Dec 09 '13 at 01:44
  • hi, yes, I need to do a numeric comparison, can you give suggestion on how to use the numbers as numbers? I am not sure of how. – annel Dec 09 '13 at 01:50
  • Don't store the data as a string `"@data"`, store it as an array `\@data`. Then compare the arrays in a loop, e.g. `if (grep { $array[$_] == $array1[$_] } 0 .. $#array)` – TLP Dec 09 '13 at 01:56
  • I have corrected the codes according to your suggestion. But I still have errors as updated above. – annel Dec 09 '13 at 02:36
  • You might want to do `chomp` on both inputs so they are constructed the same. – woolstar Dec 09 '13 at 02:38
  • @woolstar I think `split` is doing that already. – codnodder Dec 09 '13 at 02:45
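
To illustrate TLP's point about `!=` versus `ne`, here is a minimal sketch using a few field pairs taken from the sample files. The helper name `compare_both` is made up for this illustration and is not part of the question's code.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (numeric_equal, string_equal) as 1/0 flags.
sub compare_both {
    my ( $x, $y ) = @_;
    return ( ( $x == $y ? 1 : 0 ), ( $x eq $y ? 1 : 0 ) );
}

# Fields read with split are strings; the operator decides how they are compared.
for my $pair ( [ '0.0000', '0' ], [ '200', '200.00' ], [ '0.2210', '0.2213' ] ) {
    my ( $num, $str ) = compare_both(@$pair);
    print "$pair->[0] vs $pair->[1]: numeric ",
        ( $num ? 'equal' : 'different' ), ", string ",
        ( $str ? 'equal' : 'different' ), "\n";
}
```

With `==`/`!=` Perl converts each field to a number first, so `0.0000` and `0` count as equal, while `eq`/`ne` would report them as different.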

4 Answers

2

Perhaps the following will be helpful:

use strict;
use warnings;

my $file2 = pop;
my %ref_data;

while (<>) {
    my ( $occurname, $tempname, @data1 ) = split;
    $ref_data{$occurname} = \@data1;
}

push @ARGV, $file2;

while (<>) {
    my ( $occurname, $tempname, @data2 ) = split;
    my $data1 = $ref_data{$occurname};

    for ( 0 .. $#data2 ) {
        if ( $data1->[$_] != $data2[$_] ) {
            print "$occurname $tempname has mismatch property\n";
            last;
        }
    }
}

Usage: >perl script.pl Text1 Text2 [>outFile]

The last, optional parameter directs output to a file.

Output on your data sets:

C_A/I0/I0 INV has mismatch property

This lets Perl handle the file i/o. Also, a for loop is used to compare array contents--instead of grep--since it can be quickly terminated if a mismatch is found.

Kenosis
  • Thank you! I appreciate your detailed reply. Just nice. Plus, I definitely will implement the usage in your method. – annel Dec 09 '13 at 04:07
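
The early-exit idea from this answer can also be wrapped in a small function. The sketch below is only an illustration: `first_mismatch` is a hypothetical name, and the extra check for a missing or differently sized row is an assumption added here to avoid "uninitialized value" warnings.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the index of the first differing field,
# -1 if the reference row is missing or the rows differ in length,
# and undef if every field matches numerically.
sub first_mismatch {
    my ( $ref, $cur ) = @_;
    return -1 if !defined $ref || @$ref != @$cur;
    for my $i ( 0 .. $#$cur ) {
        return $i if $ref->[$i] != $cur->[$i];    # stop at the first difference
    }
    return;
}

my $idx = first_mismatch( [qw(0.0200 0.2210 0 0)], [qw(0.0200 0.2213 0 0)] );
print "first mismatch at field $idx\n" if defined $idx;
```

Returning the index rather than a plain true/false value makes it easy to report which property differs, if that is ever needed.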
1

You could pack them in an integer mode and then compare the packed values.

  unpack('s', $val1) != unpack('s', $val2);

Note from perldoc: But don't expect miracles: if the packed value exceeds the allotted byte capacity, high order bits are silently discarded, and unpack certainly won't be able to pull them back out of some magic hat. And, when you pack using a signed template code such as s, an excess value may result in the sign bit getting set, and unpacking this will smartly return a negative value.

Zach Leighton
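
As a rough illustration of the perldoc caveat quoted in this answer, the sketch below round-trips a few integers through the signed 16-bit `s` template. The helper `roundtrip_s` is a made-up name. The last lines show an additional observation, not taken from the answer: applied directly to text fields, `unpack 's'` only reads the first two bytes of the string, so `0.2210` and `0.2213` look identical.

```perl
use strict;
use warnings;

# Hypothetical helper: pack a number as a signed 16-bit value and unpack it again.
sub roundtrip_s {
    my ($n) = @_;
    my ($back) = unpack( 's', pack( 's', $n ) );
    return $back;
}

# Only values in -32768 .. 32767 survive the round trip unchanged.
for my $n ( 200, 40000, 70000 ) {
    print "$n -> ", roundtrip_s($n), "\n";
}

# unpack 's' on a string reads its first two bytes, here "0." in both cases.
my $same = unpack( 's', '0.2210' ) == unpack( 's', '0.2213' );
print "text fields look ", ( $same ? 'identical' : 'different' ), " to unpack 's'\n";
```

For the sample data this suggests that the packed comparison would miss the `0.2210` versus `0.2213` difference, so a plain numeric `!=` on the fields is the safer choice.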
1

How about the smartmatch operator?

while (<$fh>) {
    my ($occurname, $tempname, @data1) = split;
    my $data = $ref_data{$occurname};
    print "$occurname $tempname has mismatch property\n" unless @$data ~~ @data1;
}

If your Perl is not new enough (< 5.10.1), just use TLP's idea.

EDIT: Added check for matching array lengths to stifle uninitialized value warnings when arrays are not the same size.

if (@data1 != @$data || grep { $data1[$_] != $data->[$_] } 0 .. $#data1) {
    print "$occurname $tempname has mismatch property\n";
}

See grep

Also section on arrays here for $#array

codnodder
  • Thanks for your response. My perl is not new enough so I used the second option, error of `use of uninitialized value in numeric ne (!=)` occur. – annel Dec 09 '13 at 03:05
  • Thanks so much!! It works. May I know how this `grep { $data1[$_] != $data->[$_] } 0 .. $#data1;` works? – annel Dec 09 '13 at 03:22
  • It numerically compares each member of the array. grep returns the number of true matches in a scalar context. I will add a couple links to the posting. – codnodder Dec 09 '13 at 03:24
  • I see. Thank you so much for your tips and suggestion. – annel Dec 09 '13 at 03:28
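
To make codnodder's explanation of `grep` concrete, here is a small sketch using one row from each sample file. The variable names `@old` and `@new` are chosen for this illustration only.

```perl
use strict;
use warnings;

my @old = qw(0.0200 0.2210 0 0 20 200 30 100);
my @new = qw(0.0200 0.2213 0 0 20 200.00 30 100);

# In list context, grep returns the indices at which the fields differ.
my @diff_idx = grep { $old[$_] != $new[$_] } 0 .. $#new;

# In scalar context, the same expression returns how many indices matched.
my $diff_count = grep { $old[$_] != $new[$_] } 0 .. $#new;

print "differing indices: @diff_idx\n";
print "number of differences: $diff_count\n";
```

If `@old` were shorter than `@new`, `$old[$_]` would be undefined for the trailing indices, which is where the "uninitialized value" warning comes from and why the length check `@data1 != @$data` is placed first.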
1

Depending on how precise you need to be, I would just subtract the two and test for it being very close to zero:

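if ( grep { my $delt = $data->[$_] - $data1[$_]; $delt < -1e-16 || $delt > 1e-16 } 1 .. $#data1 ) {
    print "$occurname $tempname has mismatch property\n";
}

Notice that I changed the range to start at 1 instead of 0. You don't need to compare the first field if it's text. That only applies when the name fields are still in the array; in the question's code, `split` already assigns them to `$occurname` and `$tempname`, so there index 0 holds the first number and the range should start at 0.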

woolstar
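
The tolerance test can also be written with `abs`. The following is a minimal sketch, not woolstar's code: `nearly_equal` is a hypothetical helper, and the default tolerance of `1e-9` is an assumption that should be tuned to the precision of the data.

```perl
use strict;
use warnings;

# Hypothetical helper: true if two numbers differ by no more than $tol.
sub nearly_equal {
    my ( $x, $y, $tol ) = @_;
    $tol = 1e-9 unless defined $tol;    # assumed default, adjust for your data
    return abs( $x - $y ) <= $tol;
}

for my $tol ( 1e-9, 1e-3 ) {
    my $verdict = nearly_equal( '0.2210', '0.2213', $tol ) ? 'same' : 'different';
    print "tolerance $tol: $verdict\n";
}
```

A tolerance as small as `1e-16` behaves almost like exact comparison for values of this size, so a looser threshold is only worthwhile if small rounding differences between the two files should be ignored.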