
I have a text file that contains a lot of information on each line. I want to read it and store only the 'status' part.

Example:

id........username...... status......language .......image  

11111 abcdefg Man Utd won for the second time ENG img1244

11112 abcdaaa Man Utd won for the third  time ENG img1245 

11113 abcdbbb Man Utd won for the fourth time ENG img1246

11114 abcdccc Man Utd won for the fifth  time ENG img1247 

11115 abcdddd Man Utd won for the sixth  time ENG img1248 

What I should obtain is the following:

Man Utd won for the second time 

Man Utd won for the third  time 

Man Utd won for the fourth time

Man Utd won for the fifth  time

Man Utd won for the sixth  time

What I want to do is store the text between the username and the 'ENG' string.

Thanks for your help.

Joe

1 Answer


You can do this with a simple Perl script. On Windows, Perl can be downloaded from ActiveState; Linux usually has Perl installed already.

To use:

  1. Install Perl (if it is not already installed).
  2. Copy the script below into a text file.
  3. Save the file under a simple name of your choice with a .pl extension (e.g. parser.pl).
  4. Save the file to be parsed in the same directory and name it 'input.txt'.
  5. From a cmd window, run: perl parser.pl
  6. The results are written to a file called 'output.txt' in the same directory; if that file already exists, it is overwritten.

The script assumes that:

  1. the text you're looking for starts with Man or Woman
  2. the ENG text does not appear inside the text you are looking for, only after it
  3. the language text is always ENG. If not, replace ENG with (?:ENG|OTHER1|OTHER2|ETC) in the regular expression on line 18 of the script
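
As a rough sketch of the substitution described in item 3, the pattern with several language codes might look like the following. The codes FRA and GER, the helper name status_before_lang, and the second sample line are made up for illustration; only ENG appears in the question.

```perl
use strict;
use warnings;

# Hypothetical language codes; replace them with the ones in the real file.
my $lang = qr/(?:ENG|FRA|GER)/;

# Hypothetical helper: returns the text from "Man"/"Woman" up to the
# language code, or undef when the line does not match.
sub status_before_lang {
    my ($line) = @_;
    return $line =~ /((?:Wom|M)an.*) $lang/ ? $1 : undef;
}

# First line is taken from the question; the second is an invented variant.
print status_before_lang("11111 abcdefg Man Utd won for the second time ENG img1244"), "\n";
print status_before_lang("11112 abcdaaa Man Utd won for the third  time FRA img1245"), "\n";
```

Compiling the alternation once with qr// keeps the list of codes in one place, so adding a code does not mean editing the matching line itself.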

The script:

#!/usr/local/bin/perl

use strict;
use warnings;
unless(open(INFILE, "<", "input.txt")){
  print STDERR "Unable to open input file input.txt for reading, possible reason: $!\n";
  exit 1;
};

unless(open(OUTFILE, ">", "output.txt")){
  print STDERR "Unable to open output file output.txt for writing, possible reason: $!\n";
  exit 1;
};

my $x = 1;
while (my $line = <INFILE>){
   print $line;
   if($line =~ /((?:Wom|M)an.*) ENG/){
      print OUTFILE $1."\n";
   }else{
      print "No match found on line $x\n";
   }
   $x++;
}

close(INFILE);
close(OUTFILE);
exit;
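
If the status does not begin with a known word, one possible alternative is to anchor on the surrounding columns instead: skip the numeric id and the username, then capture everything up to the language code. The sketch below is not the script above but a variation on it. It assumes that the id is all digits, that the username contains no whitespace, that the language code is ENG, and that exactly one token (the image name) follows it. The helper name extract_status is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the status column of one line, or undef.
# Assumed layout: <digits> <username> <status ...> ENG <image>
sub extract_status {
    my ($line) = @_;
    return $line =~ /^\s*\d+\s+\S+\s+(.*?)\s+ENG\s+\S+\s*$/ ? $1 : undef;
}

# Both sample lines are taken from this page.
for my $line ("11111 abcdefg Man Utd won for the second time ENG img1244",
              "11111 abcdef9 Man Utd won for the second time ENG img1244") {
    my $status = extract_status($line);
    print defined $status ? "$status\n" : "No match\n";
}
```

Because the pattern requires ENG to be followed by a single final token, an "ENG" that happens to occur earlier in the status is less likely to cut the capture short. To use it in the script, the condition of the if statement could be replaced by this pattern.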
Drew
  • Thank you for your answer and help, but what if the status doesn't start with a fixed word such as "Man" or "Woman"? It is different on every line, and all I know is that usernames end with a digit (0-9). For example: 11111 abcdef9 Man Utd won for the second time ENG img1244 – Ender Tinkir Jun 13 '13 at 11:59
  • The script I came up with doesn't care about anything before the Man or Woman portion and will work for all of the text syntax examples you have given. Run it against your actual file and let me know how it works. – Drew Jun 24 '13 at 11:23