
For the past few days I have been working on C-MEX code to speed up a DBSCAN implementation in MATLAB. I have now finished a C-MEX version of DBSCAN, but it is actually slower (14.64 seconds in MATLAB versus 53.39 seconds in C-MEX) on my test data, a 3 x 14414 matrix. I think this is due to the use of the mxRealloc function in several parts of my code. It would be great if someone could suggest how to get better results.

Here is the code DBSCAN1.c:

https://www.dropbox.com/sh/mxn757a2qmniy06/PmromUQCbO

Shai
mrDataos
  • Try: http://codereview.stackexchange.com/ – slayton Jan 29 '13 at 20:49
  • From a quick look, you do a lot of unnecessary allocations. Also consider supporting arbitrary distance measures, not just Euclidean distance on double vectors. DBSCAN can be made very flexible, which will make it much more useful. Consider supporting indexes, too! – Has QUIT--Anony-Mousse Jan 31 '13 at 07:45

1 Answer


Calling mxRealloc in every iteration of a loop is indeed a performance killer. You can use a std::vector (if you compile the MEX file as C++) or a similar container that grows its capacity in large steps instead. Dynamic allocation is not needed at all in your distance function.
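To illustrate both points in plain C, here is a minimal sketch. It is not taken from DBSCAN1.c: the names `IntList`, `intlist_push`, `intlist_free` and `sq_dist` are hypothetical, and it uses the standard `realloc`, which I assume could be swapped for `mxRealloc` inside a MEX file. The list doubles its capacity when full, so n appends cause only about log2(n) reallocations, and the distance function reads straight from the column-major input without allocating anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable list of point indices (not from DBSCAN1.c). */
typedef struct {
    int    *data;
    size_t  size;      /* elements in use    */
    size_t  capacity;  /* elements allocated */
} IntList;

/* Append one value. Capacity doubles when the list is full, so the
   reallocation cost is amortized over many appends.
   Returns 0 on success, -1 if the allocation fails. */
static int intlist_push(IntList *list, int value)
{
    if (list->size == list->capacity) {
        size_t new_cap = list->capacity ? list->capacity * 2 : 16;
        int *p = realloc(list->data, new_cap * sizeof *p);
        if (p == NULL)
            return -1;
        list->data = p;
        list->capacity = new_cap;
    }
    list->data[list->size++] = value;
    return 0;
}

/* Release the storage and reset the list to the empty state. */
static void intlist_free(IntList *list)
{
    free(list->data);
    list->data = NULL;
    list->size = 0;
    list->capacity = 0;
}

/* Squared Euclidean distance between columns i and j of a dim x n
   matrix stored column-major (the MATLAB layout). No temporary
   buffers are needed; compare the result against eps * eps. */
static double sq_dist(const double *x, size_t dim, size_t i, size_t j)
{
    const double *a = x + i * dim;
    const double *b = x + j * dim;
    double sum = 0.0;
    assert(dim > 0);
    for (size_t k = 0; k < dim; ++k) {
        double d = a[k] - b[k];
        sum += d * d;
    }
    return sum;
}
```

Start from `IntList list = {NULL, 0, 0};`, call `intlist_push` inside the neighbor loop, and call `intlist_free` once at the end. Comparing squared distances against the squared radius also avoids a `sqrt` per pair.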

If your goal is not to implement DBSCAN as a MEX file but simply to speed it up, I can offer a different solution. I don't know which MATLAB implementation you are using, but you won't make a trivial O(n^2) implementation much faster just by rewriting it in C in the same way. Most of the time is spent calculating the nearest neighbors, which won't be faster in C than it is in MATLAB. DBSCAN can run in O(n log n) time by using an index structure to find the nearest neighbors.

For my application, I am using this implementation of dbscan, but I have changed the calculation of nearest neighbors to use a KD-tree (available here). The speedup was sufficient for my application and no reimplementation was required. I think this will be faster than any O(n^2) C implementation, no matter how well you write it.
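For readers who want to see what such an index looks like, here is a minimal KD-tree sketch in C. It is not the KD-tree package mentioned above, and the names `kd_create`, `kd_build`, `kd_range` and `cmp_axis` are hypothetical. It assumes the points are stored column-major as a dim x n matrix (the MATLAB layout) and returns every point within radius `eps` of a query, which is the neighborhood query DBSCAN needs. The build step re-sorts at every level, which is simple rather than optimal, and the file-level variables make it unsuitable for multithreaded use.

```c
#include <stddef.h>
#include <stdlib.h>

/* Context for the comparator: standard qsort has no user-data
   argument, so this sketch uses file-level state (not thread-safe). */
static const double *g_pts;
static size_t g_dim, g_axis;

static int cmp_axis(const void *a, const void *b)
{
    double va = g_pts[*(const size_t *)a * g_dim + g_axis];
    double vb = g_pts[*(const size_t *)b * g_dim + g_axis];
    return (va > vb) - (va < vb);
}

/* Arrange idx[lo..hi) as an implicit tree: the median along the
   current axis sits in the middle, smaller values to its left. */
static void kd_build(size_t *idx, size_t lo, size_t hi, size_t depth)
{
    if (hi - lo < 2)
        return;
    g_axis = depth % g_dim;
    qsort(idx + lo, hi - lo, sizeof *idx, cmp_axis);
    size_t mid = lo + (hi - lo) / 2;
    kd_build(idx, lo, mid, depth + 1);
    kd_build(idx, mid + 1, hi, depth + 1);
}

/* Build the index over n points; returns NULL if allocation fails.
   The caller releases the result with free(). */
static size_t *kd_create(const double *pts, size_t dim, size_t n)
{
    size_t *idx = malloc((n ? n : 1) * sizeof *idx);
    if (idx == NULL)
        return NULL;
    for (size_t i = 0; i < n; ++i)
        idx[i] = i;
    g_pts = pts;
    g_dim = dim;
    kd_build(idx, 0, n, 0);
    return idx;
}

/* Append to out[] the indices of all points within eps of q, searching
   idx[lo..hi). Returns the updated count; out must have room for n. */
static size_t kd_range(const double *pts, size_t dim, const size_t *idx,
                       size_t lo, size_t hi, size_t depth,
                       const double *q, double eps,
                       size_t *out, size_t count)
{
    if (lo >= hi)
        return count;
    size_t mid = lo + (hi - lo) / 2;
    size_t axis = depth % dim;
    const double *p = pts + idx[mid] * dim;
    double d2 = 0.0;
    for (size_t k = 0; k < dim; ++k) {
        double d = q[k] - p[k];
        d2 += d * d;
    }
    if (d2 <= eps * eps)
        out[count++] = idx[mid];
    /* Descend only into the sides that the eps-ball can reach. */
    double diff = q[axis] - p[axis];
    if (diff <= eps)
        count = kd_range(pts, dim, idx, lo, mid, depth + 1, q, eps, out, count);
    if (diff >= -eps)
        count = kd_range(pts, dim, idx, mid + 1, hi, depth + 1, q, eps, out, count);
    return count;
}
```

Build the index once with `kd_create(pts, dim, n)`, then call `kd_range(pts, dim, idx, 0, n, 0, q, eps, out, 0)` for each point instead of scanning all n points. The pruning test on `diff` is what lets a query skip most of the data when `eps` is small relative to the spread of the points.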

Shai
Josef Borkovec