Questions tagged [uniq]

uniq is a Unix/POSIX/Linux utility to remove or filter duplicate lines. It compares only adjacent lines, so it is typically applied to the output of sort.
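The reason uniq is usually paired with sort is that it only detects repeated lines when they are adjacent. The following is a minimal sketch of that behaviour, not an excerpt from any of the questions below; the `sample` helper and its four-line input are hypothetical, made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample input: four lines whose repeats are NOT adjacent.
sample() { printf 'b\na\nb\na\n'; }

# Without sorting, no two equal lines are neighbours, so uniq removes nothing.
unsorted_result=$(sample | uniq)

# After sorting, equal lines sit next to each other and are collapsed.
sorted_result=$(sample | sort | uniq)

# uniq -c prefixes each line with its count; awk strips the padding
# so the result is "count line" with a single space.
counted_result=$(sample | sort | uniq -c | awk '{print $1, $2}')

printf 'unsorted:\n%s\n' "$unsorted_result"
printf 'sorted:\n%s\n' "$sorted_result"
printf 'counted:\n%s\n' "$counted_result"
```

When the counts are not needed, `sort -u` gives the same lines as `sort | uniq` in one step.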

In Ruby, uniq is a method of the Array class to remove duplicates from an array. uniq creates a new array, whereas uniq! modifies the array in place.

For questions about unique identifiers, keys, names, etc., see unique or more specific tags such as unique-constraint, unique-index, unique-key, unique-ptr, etc.

Documentation

POSIX.1-2008
GNU coreutils (used on Linux, Cygwin)
FreeBSD
BusyBox

454 questions

3 answers

how to aggregate counts in a bash one-liner

I often use sort | uniq -c to make count statistics. Now, if I have two files with such count statistics, I would like to put them together and add the counts. (I know I could append the original files and count there, but let's assume only the count…

bash unix uniq

asked Mar 13 '14 at 15:52

benroth

1 answer

"Illegal Byte sequence" error while using shell commands in mac bash terminal

Getting "illegal byte sequence" error while trying to extract non English characters from a large file in MacOS bash shell. This is the script that I am trying to use: sed 's/[][a-z,0-9,A-Z,!@#\$%^&*(){}":/_-|. -][\;''=?]*//g' < $1…

bash shell unix sed uniq

asked Sep 23 '13 at 07:17

Abhineet Prasad

4 answers

Merge results from uniq -c

I have many files with results of command: uniq -c some_file > some_file.out For example: 1.out: 1 a 2 b 4 c 2.out 2 b 8 c I would like to merge these results, so I get: 1 a 4 b 12 c I thought that sort or uniq could handle it but I…

linux merge sorting uniq

asked Sep 25 '09 at 09:34

radarek

1 answer

Even after `sort`, `uniq` is still repeating some values

Reference file: http://snap.stanford.edu/data/wiki-Vote.txt.gz (It is a tape archive that contains a file called Wiki-Vote.txt) The first few lines in the file that contains the following, head -n 10 Wiki-Vote.txt # Directed graph (each unordered…

linux posix carriage-return uniq

asked Jan 13 '20 at 13:47

SigSegV

1 answer

Use case for uniq, groupby without sorting

While debugging a Python programme, I recently discovered that the Python itertools#groupby() function requires the input collection to be sorted, because it only groups identical elements that occur in a sequence: Generally, the iterable needs to…

python sorting grouping uniq

asked Jun 08 '19 at 14:02

Carsten

3 answers

How to find duplicate lines in a file?

I have an input file with the following data: line1 line2 line3 begin line5 line6 line7 end line9 line1 line3 I am trying to find all the duplicate lines. I tried sort filename | uniq -c but it does not seem to be working for me. It gives me: 1…

sorting uniq

asked Jan 09 '17 at 12:20

Vicky

3 answers

Finding a uniq -c substitute for big files

I have a large file (50 GB) and I would like to count the number of occurrences of different lines in it. Normally I'd use sort bigfile | uniq -c but the file is large enough that sorting takes a prohibitive amount of time and memory. I could…

bash shell uniq gnu-toolchain linux-toolchain

asked Sep 02 '15 at 22:22

Charles

3 answers

Sort and keep a unique duplicate which has the highest value

I have a file like the one shown below. For each combination of the first and second fields, I want to keep the line with the highest value in the third field (the ones with the arrows; the arrows are not included in the actual file). 1 1 10 1 1 12 …

sorting unix uniq

asked Apr 02 '14 at 20:35

Tamalero

5 answers

What is the difference between 'sort -u' and 'uniq'?

I need a script that sorts a text file and removes the duplicates. Most, if not all, of the examples out there use the sort file1 | uniq > file2 approach. In man sort, though, there is a -u option that does this at the time of sorting. Is there a…

bash sorting uniq

asked Mar 09 '14 at 20:40

Stoinov

2 answers

bash add up columns with same first column

I have a file that has a name in the first column and count in the second column. It is sorted by name. dan 3355 dan 667 dan 889 frank 8 frank 99 frank 90 ian 9 I would like to combine all the same names and output the…

bash unix uniq

asked Nov 30 '12 at 17:12

user1190650

3 answers

Bash output the line with highest value

My question is pretty much like this one, but with one difference: I want to output the line that has the highest score in the 3rd tab-separated column. My data is like: 1.gui Qxx 16 2.gui Qxy 23 3.guT QWS 11 and I want to get this: 1.gui Qxy 23 3.guT QWS …

linux bash sorting uniq

asked Nov 27 '12 at 09:36

teutara

1 answer

Calculate Word occurrences from file in bash

I'm sorry for the very noob question, but I'm kind of new to bash programming (started a few days ago). Basically what I want to do is keep one file with all the word occurrences of another file. I know I can do this: sort | uniq -c | sort the thing…

linux bash shell uniq

asked Aug 07 '12 at 17:08

Epi

5 answers

Removing lines containing a unique first field with awk?

Looking to print only lines that have a duplicate first field. e.g. from data that looks like this: 1 abcd 1 efgh 2 ijkl 3 mnop 4 qrst 4 uvwx Should print out: 1 abcd 1 efgh 4 qrst 4 uvwx (FYI - first field is not always 1 character long in my…

sorting sed awk grep uniq

asked Feb 25 '11 at 23:24

Kyle

1 answer

How to get unique lines from a very large file in Linux?

I have a very large data file (255G; 3,192,563,934 lines). Unfortunately I only have 204G of free space on the device (and no other devices I can use). I did a random sample and found that in a given, say, 100K lines, there are about 10K unique…

linux large-files uniq

asked Jul 27 '17 at 17:30

Sir Robert

5 answers

How to completely erase the duplicated lines with Linux tools?

This question is not the same as How to print only the unique lines in BASH? because that one suggests removing all copies of the duplicated lines, while this one is about eliminating their duplicates only, i.e., change 1, 2, 3, 3 into 1, 2, 3…

python awk sed grep uniq

asked Dec 01 '16 at 17:25

Evandro Coan
