8

Given the following table

 123456.451 entered-auto_attendant
 123456.451 duration:76 real:76
 139651.526 entered-auto_attendant
 139651.526 duration:62 real:62
 139382.537 entered-auto_attendant 

Using a bash shell script on Linux, I'd like to delete the duplicate rows based on the value of column 1 (the one with the long number), taking into account that this number varies.

I've tried with

awk '{a[$3]++}!(a[$3]-1)' file

sort -u | uniq

But I am not getting the result I want, which would be something like this: compare all the values of the first column, delete the duplicates, and show the remaining rows:

 123456.451 entered-auto_attendant
 139651.526 entered-auto_attendant
 139382.537 entered-auto_attendant 
mklement0
user3494949

4 Answers

8

You didn't give an expected output; does this work for you?

 awk '!a[$1]++' file

with your data, the output is:

123456.451 entered-auto_attendant
139651.526 entered-auto_attendant
139382.537 entered-auto_attendant
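
In case the one-liner looks cryptic: !a[$1]++ is true only the first time a given column-1 value is seen, so only the first line per key gets printed. The same logic spelled out (just a sketch):

 awk '{
     if (!seen[$1]) {    # first time this column-1 value appears
         seen[$1] = 1
         print           # keep only the first line for each key
     }
 }' file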

And this one prints only the lines whose column-1 value is unique:

 awk '{a[$1]++;b[$1]=$0}END{for(x in a)if(a[x]==1)print b[x]}' file

output:

139382.537 entered-auto_attendant
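
The same logic written out, in case it helps (a sketch; note that for-in order in awk is unspecified, so the output order may vary when more than one key qualifies):

 awk '{ count[$1]++; line[$1] = $0 }   # tally each column-1 value and remember its line
      END {
          for (k in count)             # go over every key seen
              if (count[k] == 1)       # keep only values that occurred exactly once
                  print line[k]
      }' file
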
Kent
6

uniq, by default, compares the entire line. Since your lines are not identical, they are not removed.

You can use sort to conveniently sort by the first field and also delete duplicates of it:

sort -t ' ' -k 1,1 -u file
  • -t ' ': fields are separated by spaces
  • -k 1,1: only look at the first field
  • -u: delete duplicates

Additionally, you might have seen the awk '!a[$0]++' trick for deduplicating lines. You can make this dedupe on the first column only using awk '!a[$1]++'.
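
If you ever need to dedupe on more than one column, the sort approach extends naturally; for example (a sketch, assuming space-separated fields), to dedupe on the first two columns together:

 sort -t ' ' -k 1,2 -u file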

that other guy
  • Upvoting this answer as I think it's a bit more flexible. You could dedupe across multiple fields, for example. That's harder to do with awk. – catch22 Jul 11 '23 at 02:58
1

Using awk:

awk '!($1 in a){a[$1]++; next} $1 in a' file
123456.451 duration:76 real:76
139651.526 duration:62 real:62
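
Note that this prints only the second and later occurrences of each column-1 value, not one line per value. The same logic spelled out (a sketch):

 awk '!($1 in a) { a[$1]++; next }   # first occurrence of a key: remember it and skip
      { print }                      # any later occurrence is a duplicate, so print it
 ' file
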
jaypal singh
anubhava
  • Good, but I'd like to have all the records that start with the same first column, like in the description; in that case there are 2 records with the same first column, but sometimes there may be three or more – user3494949 Apr 03 '14 at 22:33
  • Isn't that what this answer is already doing? It is printing all the duplicate lines. What is your expected output? – anubhava Apr 03 '14 at 22:38
1

Try this command:

awk '!x[$1]++ { print $1, $2 }' file
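
With the sample data this keeps the first line for each column-1 value and prints only its first two fields:

123456.451 entered-auto_attendant
139651.526 entered-auto_attendant
139382.537 entered-auto_attendant
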
J. Chomel