Number of non repeating lines - unique count

Question

Here is my problem: any number of lines of text is given on standard input. Output: the number of non-repeating lines.

INPUT:

She is wearing black shoes.
My name is Johny.
I hate mondays.
My name is Johny.
I don't understand you.
She is wearing black shoes.

OUTPUT:

Ding · Answer 1 · 2013-05-01T22:46:31.280

score 114

You could try using `uniq` (see `man uniq`) and do the following:

sort file | uniq -u | wc -l
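Since the question reads from standard input rather than from a file, the same pipeline can be used without a file name. Here is a minimal sketch on the sample input from the question; the function name `count_nonrepeating` is only an illustrative assumption, not something from the answer:

```shell
# count_nonrepeating is a hypothetical helper name used for this sketch.
# It reads lines from standard input and prints how many occur exactly once.
count_nonrepeating() {
    # sort puts identical lines next to each other, uniq -u keeps only the
    # lines that are not repeated, and wc -l counts what is left
    sort | uniq -u | wc -l
}

# the sample input from the question
printf '%s\n' \
    'She is wearing black shoes.' \
    'My name is Johny.' \
    'I hate mondays.' \
    'My name is Johny.' \
    "I don't understand you." \
    'She is wearing black shoes.' |
    count_nonrepeating    # prints 2
```

Some `wc` implementations pad the number with leading spaces, so it is safer to compare the result numerically than as a string.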

edited May 01 '13 at 22:46

answered May 01 '13 at 22:38


I added the `sort` command in the mix. Nice catch...I had it out of order – Ding May 01 '13 at 22:46

In the man pages it states: "Note: 'uniq' does not detect repeated lines unless they are adjacent. You may want to sort the input first, or use 'sort -u' without 'uniq'. Also, comparisons honor the rules specified by 'LC_COLLATE'." It worked also.... – nils petersohn Feb 12 '14 at 12:55

In my case, doing `sort file | uniq -u` gives different output than `sort -u file` for the same file. `sort -u file` gave the correct output. – Zimano Oct 24 '19 at 16:06
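The different results reported in the last comment are expected, because the two pipelines answer different questions: `sort -u` prints one copy of every distinct line, while `sort | uniq -u` prints only the lines that occur exactly once. Which one is "correct" depends on which count is wanted. A small sketch on the sample input from the question (the variable names are assumptions for illustration):

```shell
# sample input from the question, kept in a variable for illustration
input="She is wearing black shoes.
My name is Johny.
I hate mondays.
My name is Johny.
I don't understand you.
She is wearing black shoes."

# lines that occur exactly once
once=$(printf '%s\n' "$input" | sort | uniq -u | wc -l)

# distinct lines, with repeated lines counted a single time
distinct=$(printf '%s\n' "$input" | sort -u | wc -l)

# $(( )) strips any padding that wc may add around the number
echo "occurring once: $((once)), distinct: $((distinct))"
# prints: occurring once: 2, distinct: 4
```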

score 8 · Answer 2 · answered May 01 '13 at 23:13

Here's how I'd solve the problem:

... | awk '{n[$0]++} END {for (line in n) if (n[line]==1) num++; print num+0}'

But that's pretty opaque. Here's a (slightly) more legible way to look at it (requires bash version 4):

... | {
    declare -A count    # count is an associative array

    # iterate over each line of the input
    # accumulate the number of times we've seen this line
    #
    # the construct "IFS= read -r line" ensures we capture the line exactly

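    while IFS= read -r line; do
        # prefix the key: bash rejects an empty string as an
        # associative-array subscript, so a blank line would fail otherwise
        key="x$line"
        # plain assignment keeps the line's text out of arithmetic evaluation
        count[$key]=$(( ${count[$key]:-0} + 1 ))
    done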

    # now add up the number of lines whose count is only 1
    num=0
    for c in "${count[@]}"; do
        if (( $c == 1 )); then
            (( num++ ))
        fi
    done

    echo "$num"
}
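For comparison, the awk one-liner above can also be spelled out over several lines with comments, which may be easier to follow than the compact form. This is only a sketch: the wrapper name `count_once_awk` and the array name `seen` are assumptions chosen for illustration.

```shell
# count_once_awk is a hypothetical wrapper name used only for this sketch
count_once_awk() {
    awk '
        # for every input line, bump the counter stored under the line text
        { seen[$0]++ }

        # after the last line, count the keys that were seen exactly once
        END {
            for (line in seen)
                if (seen[line] == 1)
                    num++
            print num + 0    # "+ 0" prints 0 rather than an empty line
        }
    '
}

printf 'a\nb\na\n' | count_once_awk    # prints 1
```

Unlike the `sort | uniq -u` pipeline, this approach does not need the input to be sorted, at the cost of holding every distinct line in memory.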

On my '99 machine the awk solution worked seamlessly – fiorentinoing Aug 14 '16 at 21:20
@sfiore, what's a "'99 machine"? – glenn jackman Aug 21 '16 at 01:23
