MATLAB: counting occurrences of rows of A in rows of B

Question

Consider two matrices A and B, where A is a 21x2 matrix and B is a 35x3 matrix:

 A = [ 1 2 ;                  B = [ 1 2 3 ;
       1 3 ;                        1 2 4 ; 
       1 4 ;                        .
       .                            .
       .                            .
       .                            5 6 7]
       6 7 ]

And I have vectors count_A and count_B:

count_A is a 21 x 1 vector which holds one scalar for each row of A.
Similarly, count_B is a 35 x 1 vector with one scalar for each row of B.

I need to scan through B and compute the ratio count_B/count_A for every row of B: read the first 2 elements of the row, look up the count in count_A that corresponds to that pair, and divide that row's count_B value by it.

Example:

 count_A = [ 2 ;                count_B = [ 1 ;
             3 ;                            2 ;
             .                              .
             .                              . 
             .                              .
             2 ]                            3 ]

Outputs should be as follows:

first row: count([1,2,3]) / count([1,2]) which would be 1/2
second row: count([1,2,4]) / count([1,2]) which would be 2/2 = 1.

...
35th row: count([5,6,7]) / count([5,6])

What I mean by count([1,2,3]) is the value of count_B corresponding to [1,2,3], and count([1,2]) is the value of count_A corresponding to [1,2].

Any ideas?

Not clear. Assume some values for `count_A` and `count_B` and tell us what's the output you expect? — Divakar, Mar 31 '14 at 08:45
If the answers to this question helped you solve your problem, please consider upvoting all answers that were helpful, and mark the best one as accepted (by ticking the little checkmark under the vote count). — Tomas Aschan, Apr 02 '14 at 13:10

Luis Mendo · Answer 1 · 2014-03-31T10:55:17.707

Your question is not clear. The following does what its title asks.

countA = sum(squeeze(all(bsxfun(@eq, B(:,1:2).', permute(A, [2 3 1])))));
countB = sum(squeeze(all(bsxfun(@eq, A.', permute(B(:,1:2), [2 3 1])))));

Example:

A = [ 1 2                  
      1 3                 
      1 4 
      6 7 ];

B = [ 1 2 3
      1 2 4
      1 3 5
      5 6 7 ];

countA =
     2     1     0     0

countB =
     1     1     1     0
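
For larger inputs, the same two counts can probably be obtained without building the three-dimensional comparison array. The following is only a sketch, assuming the rows of A are unique; it swaps in ismember with the 'rows' option and accumarray, neither of which is used in the answer above, and the names tf and idx are just for illustration.

```matlab
A = [1 2; 1 3; 1 4; 6 7];
B = [1 2 3; 1 2 4; 1 3 5; 5 6 7];

% idx(k) is the row of A equal to B(k,1:2), or 0 if there is none
[tf, idx] = ismember(B(:,1:2), A, 'rows');

% countB(k): 1 if the first two entries of B(k,:) form a row of A, else 0
% (only equals the bsxfun result when the rows of A are unique)
countB = double(tf).';

% countA(i): number of rows of B that start with A(i,:)
countA = accumarray(idx(tf), 1, [size(A,1) 1]).';
```

For the example matrices this reproduces the countA and countB values shown above.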

Tomas Aschan · Answer 2 · 2014-03-31T09:24:48.910

I'm not entirely sure what you're asking, so let me start with a few assumptions:

You have in your matrices A and B a number of rows with 2 and 3 elements respectively. Each row is unique, so for example [2, 3] will appear at most once in A.
When you say count([1, 2]), what you really mean is the value in count_A at the row where [1, 2] appears in A.

You can easily accomplish this with some smart search tricks:

A = [1 2; 1 3; 1 4; 6 7];
B = [1 2 3; 1 2 4; 1 3 5; 6 7 8];

count_A = [2 3 2 5]';
count_B = [1 2 3 4]';

ratio = zeros(size(count_B));

for Bidx = 1:size(B,1)
    % See http://stackoverflow.com/questions/6209904/find-given-row-in-a-matrix
    [~, Aidx] = ismember(B(Bidx, 1:2), A, 'rows');

    if Aidx > 0 % ismember returns 0 if not found
        ratio(Bidx) = count_B(Bidx) / count_A(Aidx); % count_B / count_A, as the question asks
    end
end

ratio

With the sample data above, this gives

ratio =

    0.5000
    1.0000
    1.0000
    0.8000

Is this what you were asking for?

Note: I can't vouch for the performance of ismember, so if you have a very large number of rows, this might be real slow. Test for correctness on a small sample if you need to.
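
If the loop turns out to be too slow, ismember with 'rows' also accepts all of B(:,1:2) at once, so the lookup can be done in a single call. A minimal sketch, assuming the rows of A are unique and using the count_B/count_A ratio from the question (the variable name found is only for illustration):

```matlab
A = [1 2; 1 3; 1 4; 6 7];
B = [1 2 3; 1 2 4; 1 3 5; 6 7 8];
count_A = [2 3 2 5]';
count_B = [1 2 3 4]';

% One call looks up every B(k,1:2) in A; Aidx is 0 where nothing matches
[found, Aidx] = ismember(B(:,1:2), A, 'rows');

ratio = zeros(size(count_B));   % unmatched rows stay 0, as in the loop
ratio(found) = count_B(found) ./ count_A(Aidx(found));
```

Whether this is actually faster for a given data set is something to measure rather than assume.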

edited Mar 31 '14 at 09:24

answered Mar 31 '14 at 09:16

Tomas Aschan

Hey Tomas. Your assumptions were right. I'm sorry if I was unclear. This works except for the fact that the no. of rows of A and B might be different and consequently the no. of rows of count_A and count_B will differ. I guess I can just pad A with some zeros. – enigmae Mar 31 '14 at 09:32
Also, I'm dealing with a large data set. Any substitute for ismember here? – enigmae Mar 31 '14 at 09:32
@Nishanth: Actually, there is no requirement on `count_A` and `count_B` to have the same number of rows here, only that `count_A` has *at least* as many rows as `A`, and the same for `count_B` and `B`. However, for each row in `B` that doesn't start with something matching a row in `A` you'll get a `0` in `ratios`. – Tomas Aschan Mar 31 '14 at 09:35
Regarding performance, I'm not saying `ismember` is *slow* - I'm just saying that *I don't know* if it's fast enough for your purposes. It's probably one of the fastest ways to search for a row in a matrix, so if it's not fast enough you might have to re-think the algorithm, and perhaps see if you can restructure your data to a representation in which it's easier to formulate a performant algorithm for your task. – Tomas Aschan Mar 31 '14 at 09:37
But before you worry about performance: try it out! Run it for a smaller data set and time it with `tic/toc` to see if it's reasonable, and/or run it for your entire data set once, and hit Ctrl+C in the console if you grow tired of waiting. Don't throw `ismember` out the window just because some dude on stack overflow can't guarantee that it's fast enough ;) – Tomas Aschan Mar 31 '14 at 09:38
@Nishanth Yes, please make sure the results are as expected! :) First priority has to be that. – Divakar Mar 31 '14 at 09:40
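
The timing suggestion in the comments can be tried along these lines. This is a sketch only: the sizes and the random test data are made up for illustration, and the measured time will depend on the machine and MATLAB version.

```matlab
% Hypothetical test data: unique two-column rows in A, rows of B built from them
nA = 2000; nB = 5000;
A = unique(randi(100, nA, 2), 'rows');
B = [A(randi(size(A,1), nB, 1), :), randi(100, nB, 1)];

tic
for Bidx = 1:size(B,1)
    % Same per-row lookup as in the loop-based answer
    [~, Aidx] = ismember(B(Bidx,1:2), A, 'rows');
end
elapsed = toc;
fprintf('%d row lookups took %.3f s\n', size(B,1), elapsed);
```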

Divakar · Answer 3 · 2014-03-31T09:41:36.523

Check this bsxfun approach -

A = [1 2; 1 3; 1 4; 6 7;5 6]
B = [1 2 3; 1 2 4; 1 3 5; 6 7 8;6 7 4; 6 7 7;5 6 7]
count_A = [2 3 2 5 2]';
count_B = [1 2 3 4 3 5 7]';

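% matches(i,j,k) is true where A(i,j) equals B(k,j), for the first two columns of B
matches = bsxfun(@eq,A,permute(B(:,1:2),[3 2 1]));
% x1(n) is the row of A and y1(n) the row of B of the n-th match
[x1,y1] = find(reshape(all(matches,2),size(A,1),size(B,1)));
% indexing count_B with y1 keeps the sizes consistent if some row of B has no match
out = count_B(y1)./count_A(x1)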

Output -

out =

    0.5000
    1.0000
    1.0000
    0.8000
    0.6000
    1.0000
    3.5000

Hope this is what you were after.
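
One thing to watch with this approach: a row of B that has no matching row in A gets no entry in x1 at all. A possible variation, sketched here under the assumption that the rows of A are unique, keeps one output per row of B and marks unmatched rows with NaN. The names hit and found, and the extra row [9 9 9], are only for illustration.

```matlab
A = [1 2; 1 3; 1 4; 6 7; 5 6];
B = [1 2 3; 1 2 4; 9 9 9; 5 6 7];   % third row has no match in A
count_A = [2 3 2 5 2]';
count_B = [1 2 3 7]';

% hit(i,k) is true when A(i,:) equals B(k,1:2)
matches = bsxfun(@eq, A, permute(B(:,1:2), [3 2 1]));
hit = reshape(all(matches, 2), size(A,1), size(B,1));

found = any(hit, 1).';          % which rows of B have a match in A
[x1, ~] = find(hit);            % matching row of A for each such row of B

out = nan(size(B,1), 1);        % NaN marks rows of B without a match
out(found) = count_B(found) ./ count_A(x1);
```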
