I'm trying to improve the performance of my code. The code basically calculates the value of L_total (1x2640) by fetching data from another variable called L_CN (1320x6). I also have a colindexes matrix (2640x3) which stores the rows to look at in L_CN.
So, how this goes is: the code looks at colindexes to get the row data. Say colindexes is of the following form:

55 65 75
68 75 85
...

The program will calculate L_total(1) as L_CN(55,1) + L_CN(65,1) + L_CN(75,1). The first index is the row number taken from the colindexes matrix; the second index is the number of times that row number has occurred so far. Therefore, when we calculate L_total(2), it will be L_CN(68,1) + L_CN(75,2) + L_CN(85,1). The second index of L_CN(75,2) is 2 because L_CN(75,1) was already used before.
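To make the indexing rule concrete, here is a small sketch of the first two steps (the values placed in L_CN are made up purely for illustration, and the counter name counts is mine):

```matlab
% toy setup: only rows 55, 65, 68, 75, 85 of L_CN matter in this example
L_CN = zeros(1320, 6);
L_CN(55,1) = 1; L_CN(65,1) = 2; L_CN(75,1:2) = [3 30];
L_CN(68,1) = 4; L_CN(85,1) = 5;

colindexes = [55 65 75; 68 75 85];
counts = zeros(1320, 1);     % how many times each row of L_CN has been used

% step 1: rows 55, 65, 75 are all seen for the first time
counts(colindexes(1,:)) = counts(colindexes(1,:)) + 1;
L_total1 = L_CN(55,1) + L_CN(65,1) + L_CN(75,1);   % columns are all 1

% step 2: row 75 is seen for the second time, so its column index becomes 2
counts(colindexes(2,:)) = counts(colindexes(2,:)) + 1;
L_total2 = L_CN(68,1) + L_CN(75,2) + L_CN(85,1);   % note the (75,2)
```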
To calculate the entire L_total matrix, the following code works well. It tracks the number of occurrences of each index by incrementing the corresponding entry of a variable called list (2640x1) and uses that to calculate L_total. It does so in 0.023715 seconds. (Note that n is 2640 below.)
for i = 1:n
    % bump the occurrence count for the three rows used in this step
    list(colindexes(i,:)) = list(colindexes(i,:)) + 1;
    % pick L_CN(row, occurrence) for each of the three rows and sum them
    L_total(i) = sum(diag(L_CN(colindexes(i,:), list(colindexes(i,:)))));
end
The problem is that I will be running this portion of the code over and over, maybe a million times, as part of a big simulation. Hence, even the smallest increase in performance is what I am after. First, I thought getting rid of the for loop would serve this purpose and switched the code to the following, with a little help from this topic: Vector of the occurence number:
list_col = reshape(colindexes', 1, []);    % rows laid out in usage order (1x7920)
occurrence = sum(triu(bsxfun(@eq, list_col, list_col.')));  % running count of each value
list = reshape(occurrence, 3, [])';        % back to 2640x3, aligned with colindexes
straight_index = colindexes + (list - 1)*k;  % k = size(L_CN,1) = 1320; linear indices
L_total = sum(L_CN(straight_index), 2)';
This code also does the job, with list_col (1x7920), occurrence (1x7920), list (2640x3) and straight_index (2640x3). However, contrary to my expectations, it takes 0.062168 seconds, about three times worse than the for-loop implementation. 0.05217 seconds of this is due to the second line, where the occurrence matrix is formed. With array sizes like mine, it is truly inefficient to find the occurrences this way.
The question is, with or without the for loop, how can I increase the performance of this code? The vectorization method seems nice, if only I can figure out a way to calculate the occurrence matrix faster. As I said, this portion of code will be run a lot of times, so any percentage increase in performance is welcome.
Thank you!
Further info:
colindexes represents a big matrix of size 1320x2640. Instead of storing this entire matrix, I only store the row locations of the '1's in colindexes; the rest is zero. So the colindexes I specified in the question means there is a '1' in the 1st column, 55th row, in the 2nd column, 85th row, and so on. Hence the min/max range is 1 to 1320. There are exactly 3 '1's in each column, which is why its size is 2640x3. This is, of course, background information on how it is formed. If it helps, the number of occurrences of each value in colindexes is also the same, namely 6.
So, for a matrix A = [1 0 0 1; 0 1 1 0], the colindexes is [1; 2; 2; 1].
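For what it's worth, one way this mapping could be reproduced in MATLAB (a sketch, not necessarily how it is built in my simulation) is with find, which returns the row positions of nonzeros scanned column by column:

```matlab
A = [1 0 0 1; 0 1 1 0];
[r, ~] = find(A);        % rows of the '1's, in column-major order
colindexes = r;          % gives [1; 2; 2; 1]
% with exactly 3 ones per column (as in the real problem), the 7920
% row indices could be reshaped to the 2640x3 layout:
% colindexes = reshape(r, 3, []).';
```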