Delete duplicate headers in awk

Question

I used cat to combine several files, and they all have the same header. Is there any way I can keep the first occurrence of the header and delete the later ones inside the concatenated file?

Thanks!

Example:

FirstName, LastName, Phone, Zip
(data)
(data)
(data)
FirstName, LastName, Phone, Zip
(data)
(data)
(data)

score 0 · Answer 1 · answered Nov 10 '14 at 05:23

You can do this:

cp file1 result
tail -q -n +2 file2 file3 file4 >> result

That is, start with the entire contents of file1, then append from the other files starting with line 2 of each. This way you avoid the need to try to find the extra headers and delete them later.

If you prefer, here's another formulation of the same:

head -1 file1 > result
tail -q -n +2 file1 file2 file3 file4 >> result

John Zwinck

This worked well, though I forgot to mention that I have 67 files that need to be concatenated, so listing them all this way would be tedious. Thank you BTW. – Johann Nov 10 '14 at 08:20
If you found the answer helpful and interesting you could upvote it. :) – John Zwinck Nov 10 '14 at 08:21
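
With many input files, such as the 67 mentioned in the comment above, a glob avoids typing every name. A minimal sketch using awk rather than tail, assuming the files share a common name pattern; the part*.csv names, the temporary directory and the sample rows are made up for illustration:

```shell
# Hypothetical demo files; names and rows are placeholders, not from the question.
dir=$(mktemp -d)
printf 'FirstName, LastName, Phone, Zip\nrow-a1\nrow-a2\n' > "$dir/part1.csv"
printf 'FirstName, LastName, Phone, Zip\nrow-b1\n' > "$dir/part2.csv"
printf 'FirstName, LastName, Phone, Zip\nrow-c1\n' > "$dir/part3.csv"

# FNR restarts at 1 for each input file, while NR counts lines across all files.
# So "FNR == 1 && NR != 1" is the first line of every file except the first one.
awk 'FNR == 1 && NR != 1 { next } { print }' "$dir"/part*.csv > "$dir/result.csv"

cat "$dir/result.csv"
```

The output file is deliberately named so that it does not match the input glob. The same glob should also work with the tail -q -n +2 approach from the answer, since tail accepts any number of file arguments.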

score 0 · Answer 2 · answered Nov 10 '14 at 05:23

Try this:

sed -e '2,$s/FirstName, LastName, Phone, Zip//g' -e '/^$/d' Yourfile.txt

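You can replace "FirstName, LastName, Phone, Zip" with whatever header you have. From the 2nd line to the end of the file, the first expression removes the header pattern with the s command, leaving an empty line; the second expression then deletes blank lines with /^$/d. Note that /^$/d also deletes any lines that were already empty.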

Arjun Mathew Dan

Removing the headers worked, although it didn't remove the blank lines. Thank you BTW. – Johann Nov 10 '14 at 08:18
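
One possible reason for leftover empty-looking lines (an assumption, since the actual files are not shown) is that some header copies carry trailing whitespace or a carriage return, so they are not truly empty after the substitution. A sketch that deletes the repeated header lines directly and tolerates trailing whitespace; the sample rows and temporary file are made up for illustration:

```shell
header='FirstName, LastName, Phone, Zip'
tmp=$(mktemp)

# Sample input: the 2nd header copy has a trailing space, the 3rd a trailing CR.
printf '%s\nrow1\n%s \nrow2\n%s\r\nrow3\n' "$header" "$header" "$header" > "$tmp"

# From line 2 onward, delete any line that is the header followed only by
# optional whitespace. Line 1 is outside the address range and is kept.
cleaned=$(sed "2,\$ { /^${header}[[:space:]]*\$/d; }" "$tmp")

printf '%s\n' "$cleaned"
```

Because whole lines are deleted rather than emptied, no second pass for blank lines is needed, and blank lines that were already in the data are left alone. A header containing characters that are special in regular expressions would need escaping first.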

score 0 · Accepted Answer · answered Nov 10 '14 at 06:17

I'd do it this way:

sed '1h;2,$G;s/^\(.*\)\n\1$//;/./P;d' filename
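
For readers less familiar with sed's hold space, here is the same command spread over several lines with comments, run on made-up sample input. The comments are one reading of the one-liner, not the answerer's own explanation:

```shell
tmp=$(mktemp)
printf 'FirstName, LastName, Phone, Zip\nrow1\nFirstName, LastName, Phone, Zip\nrow2\n' > "$tmp"

deduped=$(sed '
  # Line 1: copy the header into the hold space.
  1h
  # Lines 2 to end: append a newline plus the held header to the current line.
  2,$G
  # If the pattern space is now "X\nX" (the line equals the header), empty it.
  s/^\(.*\)\n\1$//
  # If anything is left, print up to the first newline, i.e. the original line.
  /./P
  # Discard the pattern space so nothing is auto-printed.
  d
' "$tmp")

printf '%s\n' "$deduped"
```

Since the header is captured from line 1 rather than typed into the command, this works unchanged for any header text.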

Beta

@potong: I tried that approach before I posted and couldn't get it to work-- then I saw your comment, tried it again, avoided a couple of pitfalls I'd noticed *after* deciding against that approach, and now it works! Simpler, and one character shorter, good catch. – Beta Nov 10 '14 at 13:55

score 0 · Answer 4 · answered Nov 10 '14 at 08:43

Here is an awk version. It skips every line that starts with FirstName, except line 1:

awk 'NR>1 && /^FirstName/ {next}1' file
FirstName, LastName, Phone, Zip
(data)
(data)
(data)
(data)
(data)
(data)

If the header line varies, we need a pattern that matches it.
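
When the header text is not known in advance, one option is to remember whatever line 1 contains and compare later lines against it, instead of hard-coding a pattern. A minimal sketch; the variable name header and the sample input are made up for illustration:

```shell
tmp=$(mktemp)
printf 'Id;Name\nrow1\nId;Name\nrow2\n' > "$tmp"

# Line 1: store it as the header, print it, and move on.
# Every later line: print it only if it differs from the stored header.
out=$(awk 'NR == 1 { header = $0; print; next } $0 != header' "$tmp")

printf '%s\n' "$out"
```

This is an exact string comparison, so header copies that differ in whitespace or line endings would not be removed.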

score 0 · Answer 5 · answered Nov 10 '14 at 09:07

Another awk way:

awk '!a[$0];NR==1{a[$0]++}' file
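
The one-liner prints a line only while its array entry is empty, and fills in an entry for line 1 alone, so only repeats of the header are suppressed. A commented sketch of the same idea on made-up input; it uses an "in" test, a small variation, so that looking up a data line does not create an array entry for it:

```shell
tmp=$(mktemp)
# Sample input includes a repeated data row to show it is NOT removed.
printf 'FirstName, LastName, Phone, Zip\nrow1\nrow1\nFirstName, LastName, Phone, Zip\nrow2\n' > "$tmp"

result=$(awk '
  # Print the line unless it has been recorded as the header.
  !($0 in seen)
  # Record only line 1, so duplicate data rows still pass through.
  NR == 1 { seen[$0] = 1 }
' "$tmp")

printf '%s\n' "$result"
```

The order of the two rules matters: the print test runs before line 1 is recorded, so the first header is still printed.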
