linux + delete duplicate IP's from file

Question

What is the best way to remove duplicate IPs from a file?

I use the command:

        sort file | uniq

but I am not sure whether this is the best way. Maybe I missed something?

Remark: my file contains two fields.

Example of the file:

 172.17.200.1 3.3.3.3
 172.17.200.1 3.3.3.3
 255.255.255.0 255.255.255.111
 255.255.255.0 255.255.255.111
 172.17.200.2 3.3.3.4
 255.255.255.0 255.255.255.111
 172.17.200.3 3.3.3.5
 255.255.255.0 255.255.255.111
 172.17.200.4 3.3.3.7
 255.255.255.0 255.255.255.111
 172.17.200.5 3.3.3.8
 255.255.255.0 255.255.255.111
 255.255.255.0 255.255.255.111
 172.17.200.1 3.3.3.3
 255.255.255.0 255.255.255.111
 172.17.200.2 3.3.3.4
 255.255.255.0 255.255.255.111
 172.17.200.3 3.3.3.5
 255.255.255.0 255.255.255.111
 172.17.200.4 3.3.3.7
 255.255.255.0 255.255.255.111
 172.17.200.5 3.3.3.8
 255.255.255.0 255.255.255.111
 255.255.255.0 255.255.255.111

score 5 · Accepted Answer · answered May 02 '13 at 13:08

I believe something as simple as 'sort -u' should work for you:

# sort -u /tmp/test

172.17.200.1 3.3.3.3
172.17.200.2 3.3.3.4
172.17.200.3 3.3.3.5
172.17.200.4 3.3.3.7
172.17.200.5 3.3.3.8
255.255.255.0 255.255.255.111

Check the 'sort' manpage for more info:

-u, --unique
with -c, check for strict ordering; without -c, output only the first of an equal run
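
As a quick sanity check, here is a minimal sketch that runs both approaches on a small sample and compares the results. The temporary file and its contents are made up for illustration and are not the asker's real data.

```shell
# Hypothetical sample file; its path and contents are assumptions for illustration.
sample=$(mktemp)
printf '%s\n' \
  '172.17.200.1 3.3.3.3' \
  '255.255.255.0 255.255.255.111' \
  '172.17.200.1 3.3.3.3' \
  '172.17.200.2 3.3.3.4' \
  '255.255.255.0 255.255.255.111' > "$sample"

# Both pipelines sort the lines and keep one copy of each distinct line.
with_u=$(sort -u "$sample")
with_uniq=$(sort "$sample" | uniq)

# Report whether the two results match, then show the deduplicated lines.
if [ "$with_u" = "$with_uniq" ]; then
  echo "identical output"
else
  echo "outputs differ"
fi
printf '%s\n' "$with_u"

rm -f "$sample"
```

Note that both variants compare whole lines, so two lines count as duplicates only if both fields match, and the output comes back sorted rather than in the original order.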

Chad

Note: This does almost the exact same thing as `sort | uniq` and should be just as fast unless you're running the commands many times. – Chris S May 02 '13 at 14:02

score 0 · Answer 2 · answered May 02 '13 at 13:06

Try:

:%s/^\(.*\)\n\1$/\1/

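This is a substitution you run inside vi: it matches a line that is immediately followed by an identical line and replaces the pair with a single copy. It only catches duplicates that are adjacent, so the file needs to be sorted first, and runs of more than two identical lines may need the command repeated.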

german_guy

How can we be sure that your code is better than `sort file | uniq`? – yael May 02 '13 at 13:11
You can't, but if you are already in an editor like vi you can kill two birds with one stone, since you do not have to return to the shell. This is the great part of *nixes: you can do the same task in at least two different ways :) – german_guy May 02 '13 at 14:18
@yael If you care that much you can benchmark it on your workload, but honestly you're investing more effort in this than it's worth. `sort | uniq` works, it works on every *nix variant I've ever encountered (including ones where `sort` doesn't understand `-u`), and it's generally "fast enough" on all but the most insanely huge data sets. Sometimes it's better to not worry about optimization :-) – voretaq7 May 02 '13 at 19:29
