I have a file that looks like this:

NAME|JOHN|TOKYO|JPN
AGE|32|M
INFO|SINGLE|PROFESSIONAL|IT
NAME|MARK|MANILA|PH
AGE|37|M
INFO|MARRIED|PROFESSIONAL|BPO
NAME|SAMANTHA|SYDNEY|AUS
AGE|37|F
INFO|MARRIED|PROFESSIONAL|OFFSHORE
NAME|LUKE|TOKYO|JPN
AGE|27|M
INFO|SINGLE|PROFESSIONAL|IT

I want to separate the records by country. I split each line into an array, @fields:

my @fields = split(/\|/, $_ );

using $fields[3] of each NAME line as the basis for separating the records. I want to split them into two output text files:

OUTPUT TEXT FILE 1:

NAME|JOHN|TOKYO|JPN
AGE|32|M
INFO|SINGLE|PROFESSIONAL|IT
NAME|LUKE|TOKYO|JPN
AGE|27|M
INFO|SINGLE|PROFESSIONAL|IT

OUTPUT TEXT FILE 2:

NAME|MARK|MANILA|PH
AGE|37|M
INFO|MARRIED|PROFESSIONAL|BPO
NAME|SAMANTHA|SYDNEY|AUS
AGE|37|F
INFO|MARRIED|PROFESSIONAL|OFFSHORE

That is, every record from JPN goes to output text file 1, and every record from a non-JPN country goes to output text file 2.

Here's the code I'm trying to get working:

use strict;
use warnings;
use Data::Dumper;
use Carp qw(croak);

my @fields;
my $tmp_var;
my $count;
;
my ($line, $i);

my $filename = 'data.txt';
open(my $input_fh, '<', $filename ) or croak "Can't open $filename: $!";


open(OUTPUTA, ">", 'JPN.txt') or die "wsl_reformat.pl: could not open $ARGV[0]";
open(OUTPUTB, ">", 'Non-JPN.txt') or die "wsl_reformat.pl: could not open $ARGV[0]";

my $fh;
while (<$input_fh>) {

    chomp;
   my @fields = split /\|/;


   if ($fields[0] eq 'NAME') {
    for ($i=1; $i < @fields; $i++) {
        if ($fields[3] eq 'JPN') {
           $fh = $_;
            print OUTPUTA $fh;
        }
        else {
           $fh = $_;
            print OUTPUTB $fh;
        }
    }

}   
}

close(OUTPUTA);
close(OUTPUTB)

Still no luck with it :(

Sinan Ünür
Soncire

4 Answers

1

You didn't say what you needed help with, so I'm assuming it's coming up with an algorithm. Here's a good one:

  1. Open the file to read.
  2. Open the file for the JPN entries.
  3. Open the file for the non-JPN entries.
  4. While not eof,
    1. Read a line.
    2. Parse the line.
    3. If it's the first line of a record,
      1. If the person's country is JPN,
        1. Set current file handle to the file handle for JPN entries.
      2. Else,
        1. Set current file handle to the file handle for non-JPN entries.
    4. Print the line to the current file handle.

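use strict;
use warnings;
use feature 'say';   # needed for say() below

my $jpn_qfn   = '...';
my $other_qfn = '...';

open(my $jpn_fh,   '>', $jpn_qfn)
   or die("Can't create $jpn_qfn: $!\n");
open(my $other_fh, '>', $other_qfn)
   or die("Can't create $other_qfn: $!\n");

my $fh;   # handle for the record currently being written
while (<>) {
   chomp;
   my @fields = split /\|/;

   # A NAME line starts a new record, so pick the output file here.
   if ($fields[0] eq 'NAME') {
      $fh = $fields[3] eq 'JPN' ? $jpn_fh : $other_fh;
   }

   # AGE and INFO lines follow their NAME line to the same file.
   say $fh $_;
}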
ikegami
  • Since I'm new to Perl, can you show me how to extract each group of 3 lines? – Soncire Mar 20 '13 at 01:51
  • You don't have to; changing which file you are writing to on the first line of a record (steps 4.3.1.1 and 4.3.2.1) automatically makes the next two lines go to the right place – ysth Mar 20 '13 at 02:16
  • @Soncire, Where do you see "extract 3 lines" anywhere in what I posted? – ikegami Mar 20 '13 at 03:55
  • Since `<$fh>` reads one line, `<$fh>` three times would read three lines. – ikegami Mar 20 '13 at 03:56
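
The last comment can be illustrated with a minimal sketch that calls `<$fh>` three times to pull in a whole record at once. The helper name `read_record` and the in-memory sample are made up for illustration; the sketch assumes every record is exactly three lines (NAME, AGE, INFO) and collects the lines in arrays rather than writing files.

```perl
use strict;
use warnings;

# Hypothetical helper: read one three-line record (NAME, AGE, INFO) from $in.
# Returns the lines as a list, or an empty list at end of input.
sub read_record {
    my ($in) = @_;
    my @lines;
    for (1 .. 3) {
        my $line = <$in>;          # each <$in> reads exactly one line
        last unless defined $line;
        push @lines, $line;
    }
    return @lines;
}

# Sample lines copied from the question, read via an in-memory filehandle.
my $sample = <<'END';
NAME|JOHN|TOKYO|JPN
AGE|32|M
INFO|SINGLE|PROFESSIONAL|IT
NAME|MARK|MANILA|PH
AGE|37|M
INFO|MARRIED|PROFESSIONAL|BPO
END

open my $in, '<', \$sample or die "Can't open in-memory file: $!";

my (@jpn, @other);
while (my @record = read_record($in)) {
    # The country is the fourth field of the record's NAME line.
    chomp(my $name_line = $record[0]);
    my @fields = split /\|/, $name_line;
    if ($fields[3] eq 'JPN') { push @jpn,   @record }
    else                     { push @other, @record }
}
close $in;

print "JPN:\n", @jpn, "Non-JPN:\n", @other;
```

Switching the current file handle on each NAME line, as in the answer above, avoids the fixed three-line assumption and is probably the more robust choice if records can vary in length.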
1

Here is what I think ikegami was describing. I've never tried this approach before, but it gave the correct results.

#!/usr/bin/perl
use strict;
use warnings;

open my $jpn_fh, ">", 'o33.txt' or die $!;
open my $other_fh, ">", 'o44.txt' or die $!;

my $fh;
while (<DATA>) {
    if (/^NAME/) {
        if (/JPN$/) {
            $fh = $jpn_fh;  
        }
        else {
            $fh = $other_fh;
        }
    }
    print $fh $_;
}   

close $jpn_fh or die $!;
close $other_fh or die $!;

__DATA__
NAME|JOHN|TOKYO|JPN
AGE|32|M
INFO|SINGLE|PROFESSIONAL|IT
NAME|MARK|MANILA|PH
AGE|37|M
INFO|MARRIED|PROFESSIONAL|BPO
NAME|SAMANTHA|SYDNEY|AUS
AGE|37|F
INFO|MARRIED|PROFESSIONAL|OFFSHORE
NAME|LUKE|TOKYO|JPN
AGE|27|M
INFO|SINGLE|PROFESSIONAL|IT
ikegami
Chris Charley
  • Yep, that solved my problem, Chris. Can you please write a comment on each line so I can understand your code? Thank you very much – Soncire Mar 20 '13 at 02:31
  • If the line you've read begins with NAME (`/^NAME/`), then if the same line ends in JPN (`/JPN$/`), set the filehandle to `$jpn_fh`, otherwise set it to `$other_fh`. Then the print below will direct it to the correct file. – Chris Charley Mar 20 '13 at 02:41
  • Thanks, Chris. I have a subroutine that removes spaces and other stuff: `sub _trim { my $word = shift; if ( $word ) { $word =~ s/\A\s+|\s+\z//g; $word =~ s/\s+/ /g; $word =~ s/\|\s*/\|/g; $word =~ s/\s*\|/\|/g; $word =~ s/\$\s+/\$/g; $word =~ s/^\s+//; $word =~ s/"//g; } return $word; }` How would I embed it in your code? – Soncire Mar 20 '13 at 02:57
  • One more: what if the line doesn't end in JPN? What if the line looks like this: `NAME|JOHN|JPN|TOKYO` – Soncire Mar 20 '13 at 03:00
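
One possible way to handle both follow-up questions is sketched below: clean each line first, then test the NAME line's fields for `JPN` wherever it appears instead of anchoring a regex to the end of the line. `clean_line` is a simplified, hypothetical stand-in for the `_trim` routine quoted above, and the sample data (including the stray spaces on the AGE line) is made up for illustration. The sketch also assumes that no other field of a NAME line can legitimately be the string `JPN`.

```perl
use strict;
use warnings;

# Hypothetical stand-in for _trim: strips surrounding whitespace (including
# the newline), whitespace around pipes, and double quotes.
sub clean_line {
    my ($line) = @_;
    $line =~ s/\A\s+|\s+\z//g;
    $line =~ s/\s*\|\s*/|/g;
    $line =~ s/"//g;
    return $line;
}

# Made-up sample: the country is no longer the last field of the NAME line.
my $sample = <<'END';
NAME|JOHN|JPN|TOKYO
AGE | 32 | M
INFO|SINGLE|PROFESSIONAL|IT
NAME|MARK|MANILA|PH
AGE|37|M
INFO|MARRIED|PROFESSIONAL|BPO
END

open my $in, '<', \$sample or die "Can't open in-memory file: $!";

my (@jpn, @other);   # stand-ins for the two output files
my $bucket;          # where the current record's lines go
while (my $line = <$in>) {
    $line = clean_line($line);
    my @fields = split /\|/, $line;

    # On a NAME line, look for JPN in any field after the record type.
    if (@fields && $fields[0] eq 'NAME') {
        my $is_jpn = grep { $_ eq 'JPN' } @fields[1 .. $#fields];
        $bucket = $is_jpn ? \@jpn : \@other;
    }

    next unless $bucket;          # skip lines before the first NAME line
    push @$bucket, "$line\n";     # clean_line removed the newline; restore it
}
close $in;

print "JPN:\n", @jpn, "Non-JPN:\n", @other;
```

If the country always sits in a known column, comparing that one field (for example `$fields[2] eq 'JPN'`) would be stricter than searching every field.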
0
#!/usr/bin/env perl

use 5.012;
use autodie;
use strict;
use warnings;

# store per country output filehandles
my %output;

# since this is just an example, read from __DATA__ section

while (my $line = <DATA>) {
    # split the fields
    my @cells = split /[|]/, $line;

    # if first field is NAME, this is a new record
    if ($cells[0] eq 'NAME') {
        # get the country code, strip trailing whitespace
        (my $country = $cells[3]) =~ s/\s+\z//;

        # if we haven't created an output file for this
        # country yet, do so
        unless (defined $output{$country}) {
            open my $fh, '>', "$country.out";
            $output{$country} = $fh;
        }
        my $out = $output{$country};

        # output this and the next two lines to
        # country specific output file
        print $out $line, scalar <DATA>, scalar <DATA>;
    }
}

close $_ for values %output;

__DATA__
NAME|JOHN|TOKYO|JPN
AGE|32|M
INFO|SINGLE|PROFESSIONAL|IT
NAME|MARK|MANILA|PH
AGE|37|M
INFO|MARRIED|PROFESSIONAL|BPO
NAME|SAMANTHA|SYDNEY|AUS
AGE|37|F
INFO|MARRIED|PROFESSIONAL|OFFSHORE
NAME|LUKE|TOKYO|JPN
AGE|27|M
INFO|SINGLE|PROFESSIONAL|IT
Sinan Ünür
0

Thanks heaps for your help. I was able to solve this problem in Perl:

#!/usr/local/bin/perl

use strict;
use warnings;
use Data::Dumper;
use Carp qw(croak);

my @fields;
my $tmp_var;
my ($rec_type, $country);

my $filename = 'data.txt';


open (my $input_fh, '<', $filename ) or croak "Can't open $filename: $!";


open  my $OUTPUTA, ">", 'o33.txt' or die $!;
open  my $OUTPUTB, ">", 'o44.txt' or die $!;

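my $out_fh;    # handle for the record currently being written
while (<$input_fh>) {

    $_ = _trim($_);
    @fields = split (/\|/, $_);
    $rec_type = $fields[0];
    $country = $fields[3];

    # A NAME line starts a new record, so choose the output file here;
    # the AGE and INFO lines that follow go to the same file.
    if ($rec_type eq 'NAME') {
        $out_fh = $country eq 'JPN' ? $OUTPUTA : $OUTPUTB;
    }
    print {$out_fh} $_;
}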

close $OUTPUTA or die $!;
close $OUTPUTB or die $!;

sub _trim {
    my $word = shift;
    if ( $word ) {      
        $word =~ s/\s*\|/\|/g;      # remove whitespace before each pipe
        $word =~ s/"//g;            # remove double quotes
    }
    return $word;
}
Soncire