
I'm trying to assign roughly 1 million values to a 100x100 logical matrix like this:

CC(Labels,LabelsXplusOne) = true;

where CC is 100x100 logical and Labels, LabelsXplusOne are 1024x768 int32.

The problem is that the above statement takes about 5 minutes to complete on a modern CPU. Obviously it is badly implemented in MATLAB, so how can we make it run faster without resorting to loops?

In case you are wondering, I need this statement to compute blobs in an integer (not binary) image.

And also:

max(max(Labels)) = 100
max(max(LabelsXplusOne)) = 100

EDIT: OK, I got it. Maybe this will help others in the future:

tic; CC(sub2ind(size(CC),Labels,LabelsXplusOne)) = true; toc;
Elapsed time is 0.026414 seconds.

Much better now.

Hugo Maxwell
  • From what I've learned during my time with MATLAB, trying to beat its implementation is a losing fight. What you *can* do is rethink the problem in such a way that the statement is not needed or reduced in scope. As such, I understand you've given a general problem statement, but a little more detail regarding the problem at hand/surrounding code would probably help others help you – im so confused Oct 24 '12 at 15:32
  • Also, there are little to no examples that come to mind of core MATLAB functions that are *badly* implemented - in fact they are highly optimized for exactly the type of work you're doing here – im so confused Oct 24 '12 at 15:34
  • Basically I need a function like 'bwlabel' which works with integer images, not binary. – Hugo Maxwell Oct 24 '12 at 15:39
  • regarding your 2nd comment: doing the above in C/C++ will probably run in a few milliseconds. It's just a few million integer operations. – Hugo Maxwell Oct 24 '12 at 15:42
  • good call on vectorizing your matrix, that's a great trick that is underused (esp by me!) – im so confused Oct 24 '12 at 15:58
  • I didn't see that you already found your mistake before I posted my answer. However, I gave a little more explanation and an additional tweak that may speed things up further for you. – gnovice Oct 24 '12 at 16:07
  • @AK4749 Of course I don't know what exactly you mean by 'core matlab functions', but in my experience it is very often beneficial performance-wise to rethink what MATLAB does. Not to talk too much about it, [things like the Lightspeed toolbox](http://research.microsoft.com/en-us/um/people/minka/software/lightspeed/) exist for a reason. – angainor Oct 24 '12 at 16:09
  • @angainor After looking at your link, I think it doesn't deter me from saying the core functions (ie the base operators such as indexing, max function, etc that @user was attempting) are very tight. However, I was very surprised to find `repmat` in that toolbox you provided. Extremely interesting, and you can be sure I'll look a little harder when I next try to optimize matlab, thank you! – im so confused Oct 24 '12 at 16:14
  • @AK4749 That's why I asked about *core functionality*. As a further example, if you deal with sparse matrices, you may want to look at SuiteSparse by Tim Davis. The `sparse2` function, just to name one function in the package, is much faster than MATLAB's. – angainor Oct 24 '12 at 16:19
  • @AK4749 Have a look at [this SO question](http://stackoverflow.com/questions/13382155/is-indexing-vectors-in-matlab-inefficient) - it deals with the inefficiency of vector (and likely matrix) indexing in MATLAB. I think there are a lot of things you can do better (performance-wise) than MATLAB. – angainor Nov 15 '12 at 09:05

1 Answer

There are a couple of issues that jump out at me...

  1. I have the feeling you are doing the matrix indexing wrong. As it stands now, what will happen is every value in Labels will be paired with every value in LabelsXplusOne, producing (1024*768)^2 total index pairs for your rows and columns of CC. That's likely what's taking so long.

    What you probably want is to only use each pair of values as indices, like Labels(1,1),LabelsXplusOne(1,1), Labels(1,2),LabelsXplusOne(1,2), etc. To do this, you should convert your indices into linear indices using the function SUB2IND.

  2. Additionally, your matrix CC only contains 10,000 entries, yet your index matrices each contain 786,432 integer values. This means you will end up assigning the value true to the same entry in CC many times over. You should first remove redundant sets of indices using the function UNIQUE, then use them to assign values to CC.
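
The difference between the two indexing forms in point 1 can be seen on a toy case. This is only a sketch: the 3x3 target and the small index matrices `rows` and `cols` are made-up values, not the data from the question, and they are plain doubles rather than int32.

```matlab
% Hypothetical toy data: two 2x2 index matrices into a 3x3 logical matrix.
rows = [1 2; 2 1];
cols = [1 2; 2 1];

% Subscript indexing: every value in rows is combined with every value
% in cols, so numel(rows)*numel(cols) = 16 (row, column) pairs are assigned.
A = false(3);
A(rows, cols) = true;

% Linear indexing: only the element-wise pairs (rows(k), cols(k)) are
% assigned, i.e. numel(rows) = 4 pairs.
B = false(3);
B(sub2ind(size(B), rows, cols)) = true;

disp(nnz(A));   % 4: the full block A(1:2, 1:2) is set
disp(nnz(B));   % 2: only (1,1) and (2,2) are set
```

With the sizes in the question, the first form generates 786,432^2 (roughly 6.2e11) pairs, while the second generates 786,432.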

This is what I think you want:

CC(unique(sub2ind(size(CC), Labels, LabelsXplusOne))) = true;
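
Whether the UNIQUE call actually pays off likely depends on the data, so it may be worth timing both variants. The sketch below uses random stand-in data (doubles from `randi`, not the real int32 label images); the variable names `idx`, `CC1`, `CC2`, `tDirect`, and `tUnique` are only for illustration.

```matlab
% Hypothetical stand-in data with the sizes and value range from the question.
Labels         = randi(100, 1024, 768);
LabelsXplusOne = randi(100, 1024, 768);

% Convert the element-wise (row, column) pairs to linear indices once.
idx = sub2ind([100 100], Labels, LabelsXplusOne);

% Variant 1: assign directly, repeated indices included.
CC1 = false(100);
tic; CC1(idx) = true; tDirect = toc;

% Variant 2: remove repeated indices first, then assign.
CC2 = false(100);
tic; CC2(unique(idx)) = true; tUnique = toc;

fprintf('direct: %.4f s, with unique: %.4f s\n', tDirect, tUnique);

% Both variants must produce the same matrix.
assert(isequal(CC1, CC2));
```

A single `tic`/`toc` run is noisy, so repeating the measurement a few times gives a more reliable comparison.
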
gnovice
  • Oh yeah, I forgot. You're right, it was a (1024*768)^2 assignment; that explains a lot. Using sub2ind did the trick. Although I think using unique will be slower than doing multiple assignments, since it has to sort the values. – Hugo Maxwell Oct 24 '12 at 16:07