How do you do custom formatting with the uniq -c option?

Question

From Wikipedia:

uniq
-c Generate an output report in default style except that each line is preceded by a count of the number of times it occurred. If this option is specified, the -u and -d options are ignored if either or both are also present.

On my machine, the count is placed at the start of each line. I want it placed at the end of the line instead, after a comma. How can this be done?

Example:

aa
aa
bb
cc
cc
dd

Should change to:

aa,2
bb,1
cc,2
dd,1

jaypal singh · Accepted Answer · 2012-01-21T20:30:10.687

You can try something like this:

awk '{a[$1]++}END{for (i in a) print i,a[i] | "sort"}' OFS="," filename

or

awk -v OFS="," '{print $2,$1}' <(uniq -c file)

or

uniq -c file | awk '{printf("%s,%s\n",$2,$1)}'

or

while read -r count text; do
    echo "$text,$count"
done < <(uniq -c file)

Test:

[jaypal:~/Temp] cat file
aa
aa
bb
cc
cc
dd

[jaypal:~/Temp] awk '{a[$1]++}END{for (i in a) print i,a[i] | "sort"}' OFS="," file
aa,2
bb,1
cc,2
dd,1

Test2:

[jaypal:~/Temp] awk -v OFS="," '{print $2,$1}' <(uniq -c file)
aa,2
bb,1
cc,2
dd,1

Test3:

[jaypal:~/Temp] while read -r count text; do
echo "$text,$count"
done < <(uniq -c file)
aa,2
bb,1
cc,2
dd,1
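
One caveat for the `uniq -c` variants above: `uniq` only collapses adjacent duplicate lines, so they rely on the input already being grouped, as it is in the question's example. A minimal sketch for unsorted input, assuming single-word lines like the sample data (the function name `count_sorted` is made up for illustration):

```shell
# count_sorted is a hypothetical helper name; it reads lines on stdin.
count_sorted() {
    # sort groups identical lines together so uniq -c counts each value once,
    # then awk swaps the two columns and joins them with a comma
    sort | uniq -c | awk -v OFS=',' '{ print $2, $1 }'
}

# Same values as the question's example, but shuffled
printf '%s\n' cc aa bb aa dd cc | count_sorted
```

The output comes back in sorted order, which happens to match the layout asked for in the question.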

score 3 · Answer 2 · answered Jan 20 '12 at 23:12

For simple things like this, sed is easier than awk:

uniq -c inputfile.txt | sed -e 's/^ *\([0-9]\+\) \(.\+\)/\2,\1/'

Stephen P

...although sometimes more cryptic. – Stephen P Jan 20 '12 at 23:13
Doesn't this make a second pass to get it done, though? Is there a faster method that does everything in one pass? Or are the passes done at the same time anyway? – Jonah Jan 20 '12 at 23:51
@Jonah - this and Jaypal's answer both use the Unix Way of Doing Things: constructing a pipeline of simple tools. You _are_ invoking two separate programs, but because the output of `uniq` is connected to the input of `sed`, they execute concurrently. sed doesn't have to wait for uniq to finish, and on a multi-core or multi-processor machine they actually execute in parallel. (I would have written his Test2 as `uniq -c file | awk -v blah-blah-blah`, as I think it's clearer that you're doing the uniq followed by the awk (or sed).) – Stephen P Jan 21 '12 at 00:05
@Jonah If you want everything done in one pass, my first solution does that. It does not use `uniq`. It keeps a count for each line in an array and uses a for loop to print the counts out. The output is piped through `sort` because that is how you wanted it to look: `awk` iterates over associative arrays in an unspecified order, so without the sort the lines would come out in arbitrary order. – jaypal singh Jan 21 '12 at 00:23
Oh right. Thank you for correcting me. I deleted my comment. – Socowi May 05 '21 at 19:22
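
To illustrate the one-pass idea from the comments without depending on `sort`, here is a rough sketch that prints each distinct line in the order it was first seen. It is not code from either answer; the function name `count_in_order` and the array names `order` and `count` are made up for illustration. Unlike `uniq -c`, it counts every occurrence of a line, not only adjacent ones.

```shell
# count_in_order is a hypothetical helper name; it reads lines on stdin.
count_in_order() {
    awk -v OFS=',' '
        # first time a line is seen: remember its position in arrival order
        !($0 in count) { order[++n] = $0 }
        # every time: bump the counter for that line
        { count[$0]++ }
        # at the end: walk the lines in first-seen order
        END { for (i = 1; i <= n; i++) print order[i], count[order[i]] }
    '
}

printf '%s\n' cc aa bb aa dd cc | count_in_order
```

The trade-off is memory: every distinct line is held in the arrays until the end of the input, whereas the `uniq -c` pipelines only need to remember the current line.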

score 2 · Answer 3 · answered Jan 21 '12 at 00:23

I'd use awk, as I find it most readable:

% uniq -c /path/to/input_file | awk -v OFS=',' '
{
    print $2, $1
}
'
aa,2
bb,1
cc,2
dd,1
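
Note that `print $2, $1` keeps only the first word of each line, which is fine for the sample data. If lines may contain spaces, a sketch along these lines keeps the whole line intact (the function name `count_suffix` is made up for illustration, and this is only one possible approach):

```shell
# count_suffix is a hypothetical helper name; it reads grouped lines on stdin.
count_suffix() {
    uniq -c | awk '{
        n = $1                          # the count that uniq -c put first
        sub(/^[ \t]*[0-9]+[ \t]/, "")   # strip the count and one separator
        print $0 "," n                  # rest of the line, comma, count
    }'
}

printf '%s\n' 'aa bb' 'aa bb' 'cc' | count_suffix
```

Saving `$1` before calling `sub` matters here, because modifying `$0` makes awk re-split the fields.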

johnsyweb