I have a large EBCDIC file, between 100 MB and 900 MB in size. Each line has a fixed length of 499 characters and ends with a single byte, hex 0A (RPT in EBCDIC), which acts as the line feed. The first two rows differ from the 499-character fixed length.

What is the most performant way to iterate over all lines and output each line that is not exactly 499 characters long (in any language, bash preferred)?

Thanks very much!

David Ruhmann
vo1d

1 Answer

How about a short Perl script:

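#!/usr/bin/env perl
use strict;
use warnings;

binmode STDIN;    # read raw bytes; the input is EBCDIC, not locale text
while (my $line = <STDIN>) {
    # length() also counts the trailing 0x0A byte; chomp($line) first
    # if the 499 characters do not include the terminator.
    print $line if length($line) != 499;
}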
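
Since the question mentions a preference for bash, the same filter can be sketched as a small shell function around awk. This is only a sketch under stated assumptions, not part of the original answer: the function name `print_odd_lines` and its optional width argument are hypothetical, `LC_ALL=C` is assumed so that awk counts bytes rather than multibyte characters, and it has not been tried on real EBCDIC data (some awk implementations may mishandle NUL bytes). Note that awk's `length($0)` excludes the trailing 0x0A, whereas the line read by the Perl loop still carries its terminator.

```shell
# print_odd_lines FILE [WIDTH]  (hypothetical helper, for illustration only)
# Prints every line of FILE whose length, not counting the trailing 0x0A,
# differs from WIDTH. WIDTH defaults to 499.
print_odd_lines() {
    # LC_ALL=C makes awk treat each byte as one character, which matters
    # for data that is not valid text in the local encoding.
    LC_ALL=C awk -v width="${2:-499}" 'length($0) != width' "$1"
}
```

Called as `print_odd_lines myfile`, it prints the lines that deviate from 499 bytes. If the 499 is meant to include the terminator, `print_odd_lines myfile 498` would be the equivalent check.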
wwn
  • "which is not exact 499" this sounds to me like printing the ones that are of different length than 499 – wwn Oct 16 '13 at 08:12
  • Thank you. How do I run this with my file? If I run ./foo myfile, it doesn't work. Is there a smart way to pipe my file to stdin of this script? – vo1d Oct 16 '13 at 12:00
  • cat yourfile | ./file_with_script.pl – wwn Oct 16 '13 at 21:44