
Suppose I have a file that contains a number of lines, some repeating:

line1
line1
line1
line2
line3
line3
line3

What Linux command(s) should I use to generate a list of the unique lines:

line1
line2
line3

Does this change if the file is unsorted, i.e. repeating lines may not be in blocks?

I Z

4 Answers


If you don't mind the output being sorted, use

sort -u

This sorts the file and removes duplicate lines in one step.
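For instance, on the sample file from the question (a sketch; the file name test1.txt is assumed):

```shell
# Build the sample file from the question (file name is assumed)
printf 'line1\nline1\nline1\nline2\nline3\nline3\nline3\n' > test1.txt

# Sort and drop duplicate lines in one step
sort -u test1.txt
# line1
# line2
# line3
```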

parkydr

cat to output the contents, piped to sort to sort them, piped to uniq to print out the unique values:

cat test1.txt | sort | uniq

You can skip the sort step if the file contents are already sorted.
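This works on an unsorted file too, since sort groups the repeats before uniq sees them (a sketch; the file name is assumed, and sort can also read the file directly, so the cat is optional):

```shell
# Unsorted sample: repeats are not in blocks
printf 'line3\nline1\nline2\nline1\nline3\n' > test1.txt

cat test1.txt | sort | uniq
# line1
# line2
# line3

sort test1.txt | uniq    # same result, without the extra cat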

go-oleg

Create a new, sorted file containing only unique lines:

sort -u file > unique_file

Create a new file with duplicates removed and the original order preserved (note that uniq only removes adjacent duplicates, so this gives a fully unique list only if the repeats occur in blocks):

uniq file > unique_file
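A quick sketch of both variants on a small, already-sorted sample (the file names are assumed):

```shell
# Already-sorted sample, repeats in blocks
printf 'line1\nline1\nline2\nline3\nline3\n' > file

sort -u file > unique_file     # sorted, duplicates removed
uniq file > unique_file2       # adjacent duplicates removed, order preserved

cat unique_file
# line1
# line2
# line3
```

On sorted input both files end up identical; they differ only when the input is unsorted.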
Kevin Sabbe

If we do not care about the order, then the best solution is actually:

sort -u file

If we also want to ignore case when comparing lines, we can use this (only one line from each case-insensitive group is kept; the letters themselves are not changed):

sort -fu file
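For example (sample data assumed; which case variant survives each group is left to the implementation, so the exact output is not shown):

```shell
printf 'Apple\napple\nBANANA\nbanana\n' > fruits.txt

# -f folds case when comparing, -u keeps one line per group,
# so only two lines remain (one per fruit)
sort -fu fruits.txt
```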

It might seem that an even simpler idea would be to use the command:

uniq file

and if we also want to ignore case (the first line of each run of duplicates is kept, with its case unchanged):

uniq -i file

However, this may return a completely different result from the sort-based commands, because uniq does not detect repeated lines unless they are adjacent.
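A small sketch of the difference (the file name is assumed):

```shell
# The duplicate "line1" entries are NOT adjacent
printf 'line1\nline2\nline1\n' > f.txt

uniq f.txt        # second "line1" survives: uniq only compares neighbours
# line1
# line2
# line1

sort -u f.txt     # all duplicates removed, at the cost of sorting
# line1
# line2
```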

simhumileco