I have multiple text files with different data but the same header and footer text. I need to remove the header and footer from each file and merge the results into one output file. Any reasonably fast one-liner would be good. All of the file names start with ABC, and the files are in the same directory.

Example File1:

This is a sample header
This is Not required
I have to remove this data

....... DATA of file 1 .........

This is sample tail 
It needs to be removed

Example File2:

This is a sample header
This is Not required
I have to remove this data

....... DATA of file 2 .........

This is sample tail 
It needs to be removed

I am using

head -n -12 ABC.txt | tail -n +20 > output.txt 

but it processes only 1 file. (12 lines to be removed from bottom, 20 to be removed from top)

Rubén
Muz
  • Are the blank lines you show included or excluded? Does the data contain blank lines? Is the pattern that starts the tail section of a file readily identifiable? Could it ever occur in the data section of a file? – Jonathan Leffler Jan 14 '14 at 22:16
  • No, they are not included. You can write a sample command line; I can adjust how many lines to omit as needed. – Muz Jan 14 '14 at 22:23
  • Sorry, but you're expected to show that you've attempted to solve your problem on your own. Consider editing your question to include your best attempt at solving your problem. Good luck. – shellter Jan 14 '14 at 23:03
  • I am using " head -n -12 ABC.txt | tail -n +20 > output.txt " .. but it processes only 1 file. (12 lines to be removed from bottom, 20 to be removed from top) – Muz Jan 15 '14 at 00:47

2 Answers

Assuming every file has a 20-line header and a 12-line footer, you can use sed to extract line 21 through the 13th-from-last line:

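for file in ABC*; do
    # Read from stdin so wc prints only the count, not the file name
    numlines=$(wc -l < "$file")
    lastline=$(( numlines - 12 ))
    # Print only lines 21..lastline; skip files that have no data lines
    (( 21 <= lastline )) && sed -n "21,${lastline}p" "$file" >> combined.txt
done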

Files that only have the header and footer, but no additional lines, produce no output. If you prefer to use your head and tail commands instead of sed:

for file in ABC*; do
    numlines=$(wc -l < "$file")
    # tail -n +21 starts at line 21, dropping all 20 header lines (+20 would keep line 20)
    (( 32 < numlines )) && head -n -12 "$file" | tail -n +21 >> combined.txt
done
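
As a possible alternative to counting lines first, the same trimming can be sketched in a single awk process that reads every file once. This is only a sketch, not part of the original answer: the function name `strip_merge` and its argument order are hypothetical. It assumes the header and footer lengths are the same for every file, and it holds back each line until `t` further lines have been read, so the last `t` lines of a file are never printed.

```shell
# Hypothetical helper (an illustration, not from the answer):
#   strip_merge HEADER_LINES FOOTER_LINES FILE...
# Prints each file without its first HEADER_LINES and last FOOTER_LINES lines.
strip_merge() {
    skip_head=$1
    skip_tail=$2
    shift 2
    # FNR restarts at 1 for every input file, so the counts apply per file.
    # A line is printed only once t more lines have followed it; leftover
    # footer lines in buf are overwritten before they could be read again.
    awk -v h="$skip_head" -v t="$skip_tail" '
        FNR > h     { buf[FNR] = $0 }
        FNR > h + t { print buf[FNR - t]; delete buf[FNR - t] }
    ' "$@"
}
```

With the numbers from the question, it would be called as `strip_merge 20 12 ABC* > combined.txt`. Files that contain only the header and footer contribute nothing to the output.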
Markku K.

Use ABC* instead of ABC.txt. If you use ABC.txt, it will process only that file. If you use ABC* it will process all the files starting with ABC.

SARATH
  • I already did; it only removed the header of the first file and the tail of the last one. All the other headers and tails that I don't want are still in the combined file. – Muz Jan 15 '14 at 17:38