Merge results from uniq -c

Question

I have many files containing the output of the command: uniq -c some_file > some_file.out

For example, 1.out:

 1 a
 2 b
 4 c

2.out:

 2 b
 8 c

I would like to merge these results, so I get:

 1 a
 4 b
 12 c

I thought that sort or uniq could handle it, but I don't see any option related to it. Writing a Ruby/Perl script is one way to go, but I'd like to do it easily with core *nix commands (like the mentioned sort and uniq).

Edit: To be clear, I don't have the original files, so I have to merge the *.out files.

Thanks for help!

I guess there should be a solution involving only join, awk and expr. — Pascal Cuoq, Sep 25 '09 at 09:39

score 5 · Accepted Answer · answered Sep 25 '09 at 09:56

Try it with awk:

awk '{ count[$2] += $1 } END { for(elem in count) print count[elem], elem }' 1.out 2.out
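As a quick check, here is a minimal sketch that recreates the two sample files from the question and runs the one-liner on them. The scratch directory and the variable names (dir, merged) are only there for illustration. Since awk does not guarantee any particular order for `for (elem in count)`, the sketch pipes the result through sort to get a stable listing.

```shell
# Recreate the sample files from the question in a scratch directory.
dir=$(mktemp -d)
printf ' 1 a\n 2 b\n 4 c\n' > "$dir/1.out"
printf ' 2 b\n 8 c\n' > "$dir/2.out"

# Sum the counts per value; sort by value because awk's for-in order is unspecified.
merged=$(awk '{ count[$2] += $1 } END { for (elem in count) print count[elem], elem }' \
    "$dir/1.out" "$dir/2.out" | LC_ALL=C sort -b -k2)

printf '%s\n' "$merged"
# 1 a
# 4 b
# 12 c

rm -r "$dir"
```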

Philipp

OK, this should work for me. It's not ideal, because I'd expect it to be doable with O(N) memory usage, where N is the number of files, but it will work for now (unless the results get big). Thanks! – radarek Sep 25 '09 at 10:12
I don't think it's linear in the number of files because `awk` reads all files in sequence, one line at a time, and it only needs to keep the `count` array (hash table?) in memory. – Philipp Sep 25 '09 at 11:02
I didn't say that the solution given by Philipp is linear. I said that such a solution can be written. – radarek Sep 25 '09 at 11:05
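One way to get the kind of streaming merge discussed in these comments is sketched below. It rests on an assumption that is not stated in the question: every *.out file is already sorted by value in the C locale (as it would be if it came from something like LC_ALL=C sort | uniq -c). Under that assumption, sort -m can merge the files without re-sorting them, and awk only has to remember the current value and its running sum instead of a table of all values. The variable names are illustrative.

```shell
# Sample files from the question; each is assumed to be sorted by value already.
dir=$(mktemp -d)
printf ' 1 a\n 2 b\n 4 c\n' > "$dir/1.out"
printf ' 2 b\n 8 c\n' > "$dir/2.out"

# sort -m merges pre-sorted inputs; -b -k2 keys on everything after the count.
merged=$(LC_ALL=C sort -m -b -k2 "$dir/1.out" "$dir/2.out" |
awk '{
    cnt = $1
    sub(/^[ \t]*[0-9]+ /, "")          # drop the count, keep the value as-is
    if (NR > 1 && $0 != prev) {        # value changed: flush the finished run
        print sum, prev
        sum = 0
    }
    prev = $0
    sum += cnt
} END { if (NR > 0) print sum, prev }')

printf '%s\n' "$merged"
# 1 a
# 4 b
# 12 c

rm -r "$dir"
```

If the files are not sorted by value, dropping -m (so that sort does a full sort) should give the same result, at the cost of sort's own temporary storage.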

score 0 · Answer 2 · answered Sep 25 '09 at 12:24

It's quite a specific problem, so it's unlikely any tool will do this by default. You can script it in a small enough loop (no need for awk nastiness), implemented in any scripting language (even sh). I don't think there's another way.
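For illustration, a loop of that kind might look like the following in plain sh. This is only a sketch of one possible reading of the suggestion, not the answerer's own code: it sorts the combined lines by value so that equal values are adjacent, then sums each run in a while-read loop. The file contents are the samples from the question, and all variable names are made up for the example.

```shell
dir=$(mktemp -d)
printf ' 1 a\n 2 b\n 4 c\n' > "$dir/1.out"
printf ' 2 b\n 8 c\n' > "$dir/2.out"

merged=$(
    # Sort by value so equal values become adjacent, then sum each run.
    cat "$dir/1.out" "$dir/2.out" | LC_ALL=C sort -b -k2 | {
        prev='' sum=0 seen=0
        while read -r count value; do
            if [ "$seen" -eq 1 ] && [ "$value" != "$prev" ]; then
                printf '%s %s\n' "$sum" "$prev"   # previous run is complete
                sum=0
            fi
            prev=$value
            sum=$((sum + count))
            seen=1
        done
        # Flush the last run, if there was any input at all.
        if [ "$seen" -eq 1 ]; then printf '%s %s\n' "$sum" "$prev"; fi
    }
)

printf '%s\n' "$merged"
# 1 a
# 4 b
# 12 c

rm -r "$dir"
```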

wds

score 0 · Answer 3 · answered Sep 26 '09 at 10:55

This is not quite serious (but it works). I like Philipp's solution.

cat 1.out 2.out |
while read -r count value; do
    # Emit each value "count" times to rebuild the original input lines
    for i in $(seq "$count"); do
        printf '%s\n' "$value"
    done
done | sort | uniq -c

andre-r

score 0 · Answer 4 · answered Jun 20 '19 at 09:20

The accepted answer works for the specific values provided in the question. However, if the values in the uniq -c output themselves contain spaces, it keys on the second field only and truncates the rest of the line. The following awk script uses everything after the count as the key instead:

awk '{ cnt=$1; $1=""; count[substr($0, 2)] += cnt } END { for(elem in count) print count[elem], elem }' 1.out 2.out
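To make the difference concrete, here is a small sketch with made-up sample data (the values "foo bar" and "foo baz" are not from the question). It runs both the accepted one-liner and the script above on the same files. One caveat worth knowing: assigning to $1 makes awk rebuild the line using single spaces between fields, so runs of spaces or tabs inside a value are collapsed to one space.

```shell
dir=$(mktemp -d)
printf ' 1 foo bar\n 3 foo baz\n' > "$dir/1.out"
printf ' 2 foo bar\n' > "$dir/2.out"

# Accepted answer: keys on $2 only, so both values collapse into "foo".
by_field=$(awk '{ count[$2] += $1 } END { for (elem in count) print count[elem], elem }' \
    "$dir/1.out" "$dir/2.out")

# Script above: keys on everything after the count (sorted for a stable listing).
by_rest=$(awk '{ cnt=$1; $1=""; count[substr($0, 2)] += cnt } END { for (elem in count) print count[elem], elem }' \
    "$dir/1.out" "$dir/2.out" | LC_ALL=C sort -b -k2)

printf '%s\n' "$by_field"
# 6 foo
printf '%s\n' "$by_rest"
# 3 foo bar
# 3 foo baz

rm -r "$dir"
```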
