6

I am very new to Perl and need your help.

I have a CSV file xyz.csv with contents:

Here the level1 and er values are string names, not numbers.

level1,er
level2,er2
level3,er3
level4,er4

I parse this CSV file using the script below and pass the fields to an array in the first run

open(my $d, '<', $file) or die "Could not open '$file' $!\n";
while (my $line = <$d>) {
  chomp $line; 
  my @data = split "," , $line; 
  @XYX = ( [ "$data[0]", "$data[1]" ], );
}

For the second run I take an input from the command prompt and store it in the variable $val. My program should parse the CSV file starting from the line whose first field matches that value and continue until it reaches the end of the file

For example

I input level2 so I need a script to parse from the second line to the end of the CSV file, ignoring the values before level2 in the file, and pass these values (level2 to level4) to the @XYX = (["$data[1]","$data[1]"],);}

level2,er2
level3,er3
level4,er4

I input level3 so I need a script to parse from the third line to the end of the CSV file, ignoring the values before level3 in the file, and pass these values (level3 and level4) to the @XYX = (["$data[0]","$data[1]"],);}

level3,er3
level4,er4

How do I achieve that? Please do give your valuable suggestions. I appreciate your help

hi123
  • 1
    Duplicate (http://stackoverflow.com/questions/12212325/perl-script-to-read-a-csv-file-from-a-particular-row-to-end-of-file-in-perl/12213744#comment16361679_12213744) – Chris Charley Aug 31 '12 at 13:27

4 Answers

4

As long as you are certain that there are never any commas in the data you should be OK using split. But even so it would be wise to limit the split to two fields, so that you get everything up to the first comma and everything after it
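To illustrate the point about limiting the split, here is a minimal sketch. The sample record is hypothetical: it assumes an error name that itself contains a comma, which the original data does not.

```perl
use strict;
use warnings;

# Hypothetical record whose second field itself contains a comma.
my $line = 'level2,disk full, retrying';

# Without a limit the trailing text is broken into a third field.
my @unlimited = split /,/, $line;

# With a limit of 2, everything after the first comma stays together.
my ($level, $error) = split /,/, $line, 2;

print scalar(@unlimited), " fields without a limit\n";
print "level=[$level] error=[$error]\n";
```

The limit only protects the last field; a comma inside the first field would still need a real CSV parser.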

There are a few issues with your code. First of all I hope you are putting use strict and use warnings at the top of all your Perl programs. That simple measure will catch many trivial problems that you could otherwise overlook, and so it is especially important before you ask for help with your code

It isn't commonly known, but putting a newline "\n" at the end of your die string prevents Perl from appending the file name and line number where the error occurred. While this may be what you want, it is usually more helpful to be given the extra information
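The difference is easy to see by trapping both forms with eval. This is only a small sketch; the message text is borrowed from the question and no file is actually opened.

```perl
use strict;
use warnings;

# Without a trailing newline, Perl appends " at FILE line N." to the message.
eval { die "Could not open 'xyz.csv'" };
my $with_location = $@;

# With a trailing newline, the message is reported exactly as written.
eval { die "Could not open 'xyz.csv'\n" };
my $without_location = $@;

print "no newline:   $with_location";
print "with newline: $without_location";
```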

Your variable names are very unhelpful, and by convention Perl variables consist of lower-case alphanumerics and underscores. Names like @XYX and $W don't help me understand your code at all!

Rather than splitting to an array, it looks like you would be better off putting the two fields into two scalar variables to avoid all that indexing. And I am not sure what you intend by @XYX = (["$data[1]","$data[1]"],). First of all, do you really mean to use $data[1] twice? Secondly, you should never put scalar variables inside double quotes, as it forces the value to be stringified, and unless you know you need that you should avoid it. Finally, did you mean to push an anonymous array onto @XYX each time around the loop? Otherwise the contents of the array will be overwritten each time a line is read from the file, and the earlier data will be lost
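Both points can be checked with a short sketch. The variable names here (@assigned, @pushed, $pair) are made up for the illustration, and the rows are the sample data from the question held in an array rather than read from a file.

```perl
use strict;
use warnings;

# Quoting a scalar forces it to a string; for a reference that destroys it.
my $pair   = [ 'level1', 'er' ];
my $quoted = "$pair";    # now a plain string such as "ARRAY(0x...)"
print "quoted copy is ", ( ref $quoted ? "still a reference" : "just a string" ), "\n";

# Sample rows standing in for lines read from the file.
my @lines = ( 'level1,er', 'level2,er2', 'level3,er3' );

my ( @assigned, @pushed );
for my $line (@lines) {
    my ( $level, $error ) = split /,/, $line, 2;
    @assigned = ( [ $level, $error ] );    # replaces the whole array each time
    push @pushed, [ $level, $error ];      # appends one row per line
}

print scalar(@assigned), " row kept by assignment, ",
      scalar(@pushed), " rows kept by push\n";
```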

This program uses a regular expression to extract $level_num from the first field. All it does is find the first sequence of digits in the string, which can then be compared to the minimum required level $min_level to decide whether a line from the log is relevant

use strict;
use warnings;

my $file = 'xyz.csv';
my $min_level = 3;
my @list;

open my $fh, '<', $file or die "Could not open '$file' $!";

while (my $line = <$fh>) {
  chomp $line; 
  my ($level, $error) = split ',', $line, 2;
  my ($level_num) = $level =~ /(\d+)/;
  next unless $level_num >= $min_level;
  push @list, [ $level, $error ];
}
Borodin
  • What will be the value of `$level_num` if `$level` does not contain a digit? – Oktalist Sep 01 '12 at 14:13
  • It would be undefined. My solution wouldn't work under those circumstances. How do you wish to compare such strings? – Borodin Sep 01 '12 at 16:45
  • That's not my question to answer, but not issuing a warning about an undefined value used in numeric comparison would be a good place to start. – Oktalist Sep 01 '12 at 20:30
  • 1
    I disagree. Silencing an error caused by malformed data is a bad idea. – Borodin Sep 01 '12 at 22:23
  • I do not suggest silencing a warning. I suggest writing the code so it does not cause a warning in the first place. `next unless defined($level_num) && $level_num >= $min_level;` – Oktalist Sep 01 '12 at 23:12
  • Oh, sorry, I see what you mean now. If you want malformed data to cause a warning, better to make it less cryptic than "undefined value used in numeric comparison". `unless (defined $level_num) { warn "Text in first column lacks numeric value\n"; next }` – Oktalist Sep 01 '12 at 23:41
  • Hi Borodin, thanks for your answer. The level and er values are strings: they are the names of the levels and errors, not numbers. How can we achieve the result if they are strings? – hi123 Sep 02 '12 at 06:55
  • @hi123 Borodin's answer works with strings that contain numbers, like "level1", "level2", "level3". Or do you mean that the level names do not contain numbers at all, so like "levelfoo", "levelbar", "levelbaz"? Perhaps we've misunderstood your question. – Oktalist Sep 02 '12 at 11:44
  • @oktalist The level names and er names do not contain numbers at all. They are just random names like levelfoo, levelbar... – hi123 Sep 02 '12 at 12:25
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/16154/discussion-between-hi123-and-oktalist) – hi123 Sep 02 '12 at 12:55
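Following up on the comment thread above, this is one possible way to combine the numeric comparison with the explicit warning that was suggested for rows lacking a digit. It is a sketch only: the 'levelfoo' row and the @skipped array are invented for the illustration, and the data is held in an array instead of being read from xyz.csv.

```perl
use strict;
use warnings;

my $min_level = 3;

# Sample rows; 'levelfoo' is a made-up entry with no digits in its name.
my @lines = ( 'level1,er', 'levelfoo,erfoo', 'level3,er3', 'level4,er4' );

my ( @list, @skipped );
for my $line (@lines) {
    my ( $level, $error ) = split /,/, $line, 2;
    my ($level_num) = $level =~ /(\d+)/;

    # Report rows without a number clearly instead of comparing undef.
    unless ( defined $level_num ) {
        warn "Text in first column lacks numeric value: '$level'\n";
        push @skipped, $level;
        next;
    }

    next unless $level_num >= $min_level;
    push @list, [ $level, $error ];
}

print "$_->[0],$_->[1]\n" for @list;
```

This still cannot order names that contain no numbers at all; for that case the selection has to depend on position in the file rather than on a numeric comparison.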
1

For deciding which records to process you can use the "flip-flop" operator (..) along these lines.

#!/usr/bin/perl

use strict;
use warnings;
use 5.010;

my $level = shift || 'level1';

while (<DATA>) {
  if (/^\Q$level,/ .. 0) {
    print;
  }
}

__DATA__
level1,er
level2,er2
level3,er3
level4,er4

The flip-flop operator returns false until its first operand is true. From that point it returns true until its second operand is true, after which it returns false again.
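The behaviour described above can be watched line by line with a small sketch. It assumes the sample data from the question, read through an in-memory filehandle so that the line counter $. behaves as it would for a real file; the bounded range ending at level3 is added only for comparison.

```perl
use strict;
use warnings;

# Sample data from the question, read through an in-memory filehandle.
my $csv = "level1,er\nlevel2,er2\nlevel3,er3\nlevel4,er4\n";
open my $fh, '<', \$csv or die "Could not open in-memory file: $!";

my ( @open_ended, @bounded );
while (<$fh>) {
    chomp;

    # Turns on at level2 and never turns off again.
    my $to_eof = ( /^level2,/ .. 0 ) ? 1 : 0;

    # Turns on at level2 and off again after level3.
    my $to_level3 = ( /^level2,/ .. /^level3,/ ) ? 1 : 0;

    push @open_ended, $to_eof;
    push @bounded,    $to_level3;
    print "$to_eof $to_level3  $_\n";
}
close $fh;
```

Each flip-flop keeps its own state, which is why the two ranges can disagree on the last line.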

I'm assuming that your file is ordered so that once you start to process it, you never want to stop. That means that the first operand to the flip-flop can be /^\Q$level,/ (match the string $level at the start of the line) and the second operand can just be zero. A constant operand is compared with the current input line number $., which is never zero, so the range never closes and processing never stops.

I'd also strongly recommend not parsing CSV records using split /,/. That may work on your current data but, in general, the fields in a CSV file are allowed to contain embedded commas which will break this approach. Instead, have a look at Text::CSV or Text::ParseWords (which is included with the standard Perl distribution).
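As a rough illustration of why split /,/ is fragile, the following sketch parses a quoted field with Text::ParseWords. The sample record is hypothetical; nothing in the original data is quoted.

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);

# Hypothetical record with a quoted field that contains a comma.
my $line = 'level2,"disk full, retrying"';

# A plain split cuts the quoted field in two and keeps the quote marks.
my @naive = split /,/, $line;

# parse_line honours the quotes; the second argument 0 strips them.
my @fields = parse_line( ',', 0, $line );

print scalar(@naive),  " fields from split\n";
print scalar(@fields), " fields from parse_line: [$fields[0]] [$fields[1]]\n";
```

Text::CSV handles more of the format (escaped quotes, embedded newlines), so it is the safer choice when the input is not under your control.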

Update: I seem to have got a couple of downvotes on this. It would be great if people would take the time to explain why.

Dave Cross
  • +1 for recommending CPAN modules for often-reimplemented solution – Oktalist Sep 01 '12 at 14:15
  • Thanks for your answer. The level and er values are strings: they are the names of the levels and errors, not numbers. How can we achieve the result if they are strings? Does this code work for strings? If I give the input level1, it should save the values after level1 until EOF and pass them to @XYZ. Can you please help? – hi123 Sep 02 '12 at 06:57
  • My code assumes that the level identifier is a string. So it should just work. My code does nothing with the data other than identifying which records to process. You'll need to insert your processing where my example currently only contains `print;`. – Dave Cross Sep 02 '12 at 10:27
  • It does assume that the first level string is 'level1', though. I'll edit it. – Oktalist Sep 02 '12 at 11:57
  • `use strict; use warnings; use 5.010; my $level = shift || 'level2'; my $file="C:\\Users\\hi.csv"; open my $DATA, '<', $file or die "Could not open '$file' $!\n"; while (my $line = <$DATA>) { if (/^\Q$level,/ .. 0) { print $line; chomp $line; my @data1 = split "," , $line; print $data1[0]; print $data1[1]; }` I tried saving the contents of the CSV line by line to data. It's giving an error. Can you please advise? – hi123 Sep 02 '12 at 12:53
  • Can you please advise how to traverse the CSV line by line, from the level name given as input to EOF, saving the values in an array while traversing? – hi123 Sep 02 '12 at 13:10
  • @Chris Charley Please check the above question in detail. Please advise how to traverse the CSV line by line, from the level name given as input to EOF, saving the values in an array while traversing. – hi123 Sep 02 '12 at 13:12
  • Never say "it's giving error". Instead, tell us what the error says. Be patient, and don't flood the comments with "pls advise". If you have to post a long chunk of code, consider editing your original question. Anyhow, I've started a new answer which includes all the best bits of previous answers so you shouldn't have any more problems. – Oktalist Sep 02 '12 at 13:41
  • 1
    My code `/^\Q$level/` expects the data to be in `$_`. You're putting it in `$line`. You'll need to fix that. You'll find a lot of Perl idioms work better if you get used to using `$_`. – Dave Cross Sep 02 '12 at 16:24
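For readers who prefer to keep a named loop variable, the point about `$_` can be handled by binding the match to `$line` explicitly. This is a sketch under assumptions: the data comes from an in-memory filehandle instead of a real file, and @rows is a made-up name for the collected result.

```perl
use strict;
use warnings;

# Sample data from the question, read through an in-memory filehandle.
my $csv = "level1,er\nlevel2,er2\nlevel3,er3\nlevel4,er4\n";
open my $fh, '<', \$csv or die "Could not open in-memory file: $!";

my $level = 'level2';    # would normally come from the command line
my @rows;

while ( my $line = <$fh> ) {
    chomp $line;

    # Bind the match to $line explicitly, because $_ is not set by this loop.
    if ( $line =~ /^\Q$level\E,/ .. 0 ) {
        my @data = split /,/, $line, 2;
        push @rows, \@data;    # one array reference per selected line
    }
}
close $fh;

print "$_->[0] $_->[1]\n" for @rows;
```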
1
#!/usr/bin/perl

use strict;
use warnings;
use Text::CSV;

my @XYZ;
my $file = 'xyz.csv';
open my $fh, '<', $file or die "$file: $!\n";

my $level = shift; # get level from commandline
my $getall = not defined $level; # true if level not given on commandline

my $parser = Text::CSV->new({ binary => 1 }); # object for parsing lines of CSV

while (my $row = $parser->getline($fh)) # $row is an array reference containing cells from a line of CSV
{
  if ($getall # if level was not given on commandline, then put all rows into @XYZ
      or      # if level *was* given on commandline, then...
      $row->[0] eq $level .. 0 # ...wait until the first cell in a row equals $level, then put that row and all subsequent rows into @XYZ
     )
  {
    push @XYZ, $row;
  }
}

close $fh;
Oktalist
  • `my $getall = not defined $level; my $parser = Text::CSV->new({ binary => 1 }); while (my $row = $parser->getline($fh)) { if ($getall or $row->[0] eq $level .. 0) {` Please explain these lines. Thanks. – hi123 Sep 02 '12 at 18:24
  • 1
    @hi123: I added some comments to the code which I hope will explain things for you. – Oktalist Sep 02 '12 at 18:58
  • @oktalist How do I print $xyz[0], $xyz[1]? `print $xyz[0]` gives the output ARRAY(0x19466d4)ARRAY(0x19466d4). Can you please help? – hi123 Sep 03 '12 at 06:15
  • 1
    @hi123 `@XYZ` is an array of arrays, so you must `print $XYZ[0][0], $XYZ[0][1], $XYZ[1][0], $XYZ[1][1], $XYZ[2][0], $XYZ[2][1]` and so on, or in a loop like `for my $inner (@XYZ) { print $inner->[0], $inner->[1] }` – Oktalist Sep 03 '12 at 11:56
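The loop form from the comment above can be tried on its own with a short sketch. The contents of @XYZ are filled in by hand here, assuming level2 was the requested starting level; @printed is a made-up helper used only to collect the output.

```perl
use strict;
use warnings;

# Rows as they might look after parsing from level2 onwards.
my @XYZ = ( [ 'level2', 'er2' ], [ 'level3', 'er3' ], [ 'level4', 'er4' ] );

# Each element of @XYZ is an array reference, so dereference it to get the cells.
my @printed;
for my $inner (@XYZ) {
    my $text = join ',', @$inner;
    push @printed, $text;
    print "$text\n";
}

# A single cell is reached with two subscripts: row first, then column.
print "first cell of second row: $XYZ[1][0]\n";
```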
1
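#!/usr/bin/perl
use strict;
use warnings;

my $file  = 'xyz.csv';            # CSV file from the question
my $level = shift || "level1";    # starting level from the command line

open(my $data, '<', $file) or die "Could not open '$file' $!\n";
while (my $line = <$data>) {
  chomp $line;
  my @fields = split ",", $line;
  # The flip-flop becomes true at the first matching row and stays true to end of file
  if ($fields[0] eq $level .. 0) {
    print "\n$fields[0]\n";
    print "$fields[1]\n";
  }
}
close $data;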

This worked. Thanks, all, for your help.

hi123