How to calculate the number of occurrences of each element in the 100000 vectors using Matlab?

Question

I have 100000 vectors, each containing 40 different numbers between 1 and 100. How can I calculate the number of occurrences of each element across all 100000 vectors?

example:

A = [2 5 6 9]; B = [3 6 9 1]

If the numbers are between 1 and 10, the result should be: [1 2 3 4 5 6 7 8 9 10, 1 1 1 0 1 2 0 0 2 0]

If you have 100000 individually-named vectors, you have a problem. You should maintain them all in a single 2D array. — Oliver Charlesworth, Jul 06 '14 at 19:03
Please tell me that these vectors are all contained in the same variable? — rayryeng, Jul 06 '14 at 19:06

score 2 · Answer 1 · answered Jul 06 '14 at 19:39

It seems like you want to compute the histogram of all values.
Use the hist command for that:

n = hist( A(:), 1:100 );
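
As a quick check against the small example in the question, here is a minimal sketch. It assumes the vectors are stacked as rows of a single matrix (the name M is hypothetical, not from the question) and uses bin centers 1:10 rather than 1:100 to match that example. The histcounts line is an alternative for newer MATLAB releases, where hist is no longer recommended; the half-integer edges are my own choice so that each integer falls in its own bin.

```matlab
% Stack the two example vectors as rows of one matrix (M is a hypothetical name)
M = [2 5 6 9; 3 6 9 1];

% Count occurrences of each value 1..10, using the integers as bin centers
n = hist(M(:), 1:10);

% Alternative for newer releases: explicit bin edges at half-integers
n2 = histcounts(M(:), 0.5:1:10.5);

% Show the values in the first row and their counts in the second
disp([1:10; n(:).']);
```

Both n and n2 should hold the counts from the question's expected result.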

answered Jul 06 '14 at 19:39 by Shai

score 1 · Answer 2 · answered Jul 06 '14 at 21:42

Assuming that you have a variable A that stores all of these vectors (as in Shai's answer), an alternative to hist is accumarray. It determines the number of bins automatically, so you don't have to specify them as you do with hist; the output length equals the largest value present in A. Try:

n = accumarray(A(:), 1);
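
To illustrate on the question's small example, here is a minimal sketch. The matrix name M and the variable names n_auto and n_fixed are hypothetical; the third argument to accumarray fixes the output size so that values that never occur at the top of the range still get a zero count.

```matlab
% Example vectors from the question, stacked as rows (M is a hypothetical name)
M = [2 5 6 9; 3 6 9 1];

% Output length is max(M(:)) = 9, so there is no entry for the value 10
n_auto = accumarray(M(:), 1);

% Passing the size explicitly yields one count per value 1..10
n_fixed = accumarray(M(:), 1, [10 1]);

disp(numel(n_auto));   % number of bins inferred from the data
disp(numel(n_fixed));  % number of bins requested explicitly
```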

score 1 · Answer 3 · edited May 23 '17 at 12:29

You can also use the sparse function to do the counting:

% 100000x1 vector of integers in the range [1,100]
A = randi([1 100], [100000 1]);

% 100x1 array of counts
n = full(sparse(A, 1, 1, 100, 1));

As others have shown, this should give the same result as:

n = histc(A, 1:100);

or:

n = accumarray(A, 1, [100 1]);

(Note that I explicitly specify the size in the sparse and accumarray calls. Without it, if the values in a particular vector A did not go all the way up to 100, the counts array n would be shorter than 100 elements.)
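
A minimal sketch to check that the three approaches agree, and to show the size issue when no size is given. The variable names n1, n2, n3, and B are hypothetical and chosen only for this illustration.

```matlab
% 100000x1 vector of integers in the range [1,100]
A = randi([1 100], [100000 1]);

% Three ways of counting, each producing a 100x1 vector
n1 = full(sparse(A, 1, 1, 100, 1));
n2 = histc(A, 1:100);
n3 = accumarray(A, 1, [100 1]);

% All three should be identical, and the counts should sum to numel(A)
same = isequal(n1, n2, n3);
total = sum(n1);

% Without an explicit size, the output only extends to the largest value present
B = [1; 2; 2; 5];
len_auto  = numel(accumarray(B, 1));           % 5
len_fixed = numel(accumarray(B, 1, [10 1]));   % 10
```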

All three methods are in fact mentioned in the Tips section of the accumarray doc page; accumarray is the most flexible of the three. Quoting the documentation:

The behavior of accumarray is similar to that of the histc function. Both functions group data into bins.

histc groups continuous values into a 1-D range using bin edges.

accumarray groups data using n-dimensional subscripts.

histc returns the bin counts using @sum.

accumarray can apply any function to the bins.

You can mimic the behavior of histc using accumarray with val = 1.

The sparse function also has accumulation behavior similar to that of accumarray.

sparse groups data into bins using 2-D subscripts, whereas accumarray groups data into bins using n-dimensional subscripts.

sparse adds elements that have identical subscripts into the output. accumarray adds elements that have identical subscripts into the output by default, but can optionally apply any function to the bins.
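
To make the "any function" point concrete, here is a minimal sketch with made-up data. The names subs, vals, counts, sums, and maxes are hypothetical; the empty third argument lets accumarray infer the output size.

```matlab
% Hypothetical bin subscripts and the values that fall into each bin
subs = [1; 2; 1; 3; 2];
vals = [10; 20; 30; 40; 50];

% val = 1 mimics histc: count how many entries land in each bin
counts = accumarray(subs, 1);

% Default behavior: sum the values in each bin
sums = accumarray(subs, vals);

% Apply a different function to each bin, here the maximum
maxes = accumarray(subs, vals, [], @max);
```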
