3

From Wikipedia:

uniq
-c Generate an output report in default style except that each line is preceded by a count of the number of times it occurred. If this option is specified, the -u and -d options are ignored if either or both are also present.

On my machine it is taking the count number and putting it on the start of each line. What I want is for it to be placed at the end of the line, after a comma. How can this be done?

Example:

aa
aa
bb
cc
cc
dd

Should change to:

aa,2
bb,1
cc,2
dd,1
Stephen P
Jonah

3 Answers

9

You can try something like this:

awk '{a[$1]++}END{for (i in a) print i,a[i] | "sort"}' OFS="," filename

or

awk -v OFS="," '{print $2,$1}' <(uniq -c file)

or

uniq -c file | awk '{printf("%s,%s\n",$2,$1)}'

or

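# IFS is a set of delimiter characters, not a regex; the default IFS
# already strips the leading blanks that uniq -c emits.
while read -r count text; do
    echo "$text,$count"
done < <(uniq -c file)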

Test:

[jaypal:~/Temp] cat file
aa
aa
bb
cc
cc
dd

[jaypal:~/Temp] awk '{a[$1]++}END{for (i in a) print i,a[i] | "sort"}' OFS="," file
aa,2
bb,1
cc,2
dd,1

Test2:

[jaypal:~/Temp] awk -v OFS="," '{print $2,$1}' <(uniq -c file)
aa,2
bb,1
cc,2
dd,1

Test3:

[jaypal:~/Temp] while read -r count text; do 
echo "$text,$count"; 
done < <(uniq -c file)
aa,2
bb,1
cc,2
dd,1
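
The `$1`/`$2` variants above assume each line is a single word. If lines may contain spaces, a minimal sketch is to key the counts on the whole line (`$0`) instead. The sample data and the `input` and `result` variable names are illustrative assumptions, not part of the original answer. Note that this counts every occurrence in the file, whereas `uniq -c` counts only adjacent repeats; on sorted input the two agree.

```shell
# Hypothetical sample input: one line contains a space.
input=$(mktemp)
printf '%s\n' 'aa x' 'aa x' 'bb' 'cc' 'cc' 'dd' > "$input"

# Count by whole line ($0) rather than by first field ($1), then
# print "line,count". awk's "for (line in count)" order is
# unspecified, so the output is sorted afterwards.
result=$(awk '{count[$0]++} END {for (line in count) print line "," count[line]}' "$input" | sort)

printf '%s\n' "$result"
rm -f "$input"
```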
jaypal singh
3

For simple things like this, sed is easier than awk:

uniq -c inputfile.txt | sed -e 's/^ *\([0-9]\+\) \(.\+\)/\2,\1/'
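
The `\+` in the expression above is a GNU extension to basic regular expressions. A sketch of the same substitution using extended regular expressions follows, assuming your sed accepts `-E` (GNU and BSD sed both do). The sample data and the `input` and `result` variable names are illustrative.

```shell
# Hypothetical sample input matching the question's example.
input=$(mktemp)
printf '%s\n' aa aa bb cc cc dd > "$input"

# With -E, + and the grouping parentheses need no backslashes.
# Group 1 is the count, group 2 is the rest of the line (spaces included).
result=$(uniq -c "$input" | sed -E 's/^ *([0-9]+) (.*)$/\2,\1/')

printf '%s\n' "$result"
rm -f "$input"
```

Because group 2 captures the remainder of the line, this form also keeps lines that contain spaces intact.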

Stephen P
  • 2
    ...although sometimes more cryptic. – Stephen P Jan 20 '12 at 23:13
  • Doesn't this make a second pass to get it done, though? Is there a faster method that does everything in one pass? Or are the passes done at the same time anyway? – Jonah Jan 20 '12 at 23:51
  • @Jonah - this and Jaypal's answer both use the Unix Way of Doing Things: constructing a pipeline of simple tools. You _are_ invoking two separate programs, but because the output of `uniq` is connected to the input of `sed`, they execute concurrently. sed doesn't have to wait for uniq to finish, and on a multi-core or multi-processor machine they do actually execute in parallel. (I would have written his Test2 as `uniq -c file | awk -v blah-blah-blah`, as I think that makes it clearer that the uniq runs first, followed by the awk (or sed).) – Stephen P Jan 21 '12 at 00:05
  • @Jonah If you want everything done in one pass, my first solution does that. It does not use `uniq`: it keeps a count for each distinct line in an associative array and prints the entries with a for loop in the `END` block. `awk` iterates over associative arrays in no particular order, so the output is piped through `sort` to match the order you wanted. – jaypal singh Jan 21 '12 at 00:23
  • Oh right. Thank you for correcting me. I deleted my comment. – Socowi May 05 '21 at 19:22
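
The one-pass idea discussed in the comments above can also be sketched without `sort`, by recording the order in which each distinct line first appears. This is an illustrative variant, not code from either answer; the `order`, `count`, `input` and `result` names are assumptions.

```shell
# Hypothetical sample input matching the question's example.
input=$(mktemp)
printf '%s\n' aa aa bb cc cc dd > "$input"

result=$(awk '
    !($0 in count) { order[++n] = $0 }   # remember each new line in arrival order
    { count[$0]++ }                      # tally every occurrence
    END {
        # Walk the numeric index, so output follows first appearance
        for (i = 1; i <= n; i++) print order[i] "," count[order[i]]
    }
' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Unlike `uniq -c`, this holds every distinct line in memory and merges non-adjacent repeats, so it suits inputs whose set of distinct lines is reasonably small.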
2

I'd use awk, as I find it most readable:

% uniq -c /path/to/input_file | awk -v OFS=',' '
{
    print $2, $1
}
'
aa,2
bb,1
cc,2
dd,1
johnsyweb