
I've implemented software that searches for a pattern inside an image. With cvMatchTemplate the execution time is around 10 ms, because I'm matching a 40x40 pattern inside a 120x160-pixel search window. (The image is 640x480, so I'm not searching the whole image.)
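
To make the setup concrete, here is a minimal CPU sketch of matching a pattern inside a search window. It is plain C++ with no OpenCV, and it uses a brute-force sum of squared differences, which is an assumption chosen for illustration and not necessarily the method cvMatchTemplate was called with. `Image`, `Match` and `matchInWindow` are hypothetical names.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical helper types for this sketch (not OpenCV types).
struct Image {
    int width, height;
    std::vector<float> data;  // row-major, width * height values
    float at(int x, int y) const { return data[y * width + x]; }
};

// Top-left corner of the best position and its score (lower is better).
struct Match { int x, y; double score; };

// Slide `pattern` over the window [wx, wx+ww) x [wy, wy+wh) of `image`
// and return the position with the smallest sum of squared differences.
Match matchInWindow(const Image& image, const Image& pattern,
                    int wx, int wy, int ww, int wh) {
    assert(wx >= 0 && wy >= 0);
    assert(wx + ww <= image.width && wy + wh <= image.height);
    assert(pattern.width <= ww && pattern.height <= wh);

    Match best{wx, wy, std::numeric_limits<double>::max()};
    for (int y = wy; y + pattern.height <= wy + wh; ++y) {
        for (int x = wx; x + pattern.width <= wx + ww; ++x) {
            double ssd = 0.0;
            for (int py = 0; py < pattern.height; ++py) {
                for (int px = 0; px < pattern.width; ++px) {
                    double d = image.at(x + px, y + py) - pattern.at(px, py);
                    ssd += d * d;
                }
            }
            if (ssd < best.score) best = {x, y, ssd};
        }
    }
    return best;
}
```

With a 40x40 pattern and a 120x160 window this evaluates 81 x 121 candidate positions, each costing 1600 multiply-adds, so the work depends on the window size and not on the 640x480 image.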

I implemented the same algorithm with gpu::matchTemplate, expecting the execution time to improve. Instead, it takes 220 ms to compute the score.

What is happening?

Thanks.

EDIT: I measured the image upload time: the ".upload" call takes 1 ms, because the images are already uncompressed.

Isn't it the same algorithm?

EDIT: I wrote the code using CUDA and my own kernel: it runs the FFT on the images with the CUDA functions, and the whole algorithm executes in less than 2 ms with 1024x1024 images and a 200x200 pattern. I used thread_sync to measure the execution time.

  • Even with FFT, less than 2 ms is a surprising result for a 1024*1024 image and a 200*200 template. Did you use a simple FFT (see the `convolveDft` method at http://docs.opencv.org/modules/core/doc/operations_on_arrays.html#dft)? I assume you had to compute the FFT of the 200*200 template at the same size as the image (1024*1024). Or did you go for tile-based correlation to take advantage of the smaller template? – kiranpradeep Apr 16 '15 at 17:21
  • I used cufftExecR2C to execute the FFT on the entire image, so your assumption was right. The FFT execution time is very small, less than 1 ms (it depends on the graphics card, of course). Anyway, I wrote my own kernel to execute the correlation (FFT multiplication, correlation coefficient computation, search for the pattern). I used OpenCV just to get the images. – Cesare Mercurio Apr 17 '15 at 17:27
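
The FFT approach discussed in these comments (pad the template to the image size, transform both, multiply, transform back, search for the peak) can be sketched on the CPU as follows. This is not the poster's CUDA kernel: it is plain C++ with a hand-written radix-2 FFT instead of cuFFT, it assumes a square image whose side is a power of two, and it computes a plain circular cross-correlation without the correlation-coefficient normalization. `fft1d`, `fft2d` and `correlatePeak` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using cd = std::complex<double>;

// In-place iterative radix-2 FFT; the length must be a power of two.
void fft1d(std::vector<cd>& a, bool inverse) {
    const std::size_t n = a.size();
    assert(n > 0 && (n & (n - 1)) == 0);
    // Reorder the input by bit-reversed index.
    for (std::size_t i = 1, j = 0; i < n; ++i) {
        std::size_t bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) std::swap(a[i], a[j]);
    }
    const double pi = std::acos(-1.0);
    for (std::size_t len = 2; len <= n; len <<= 1) {
        const double ang = 2.0 * pi / static_cast<double>(len) * (inverse ? 1.0 : -1.0);
        const cd wlen(std::cos(ang), std::sin(ang));
        for (std::size_t i = 0; i < n; i += len) {
            cd w(1.0, 0.0);
            for (std::size_t k = 0; k < len / 2; ++k) {
                const cd u = a[i + k];
                const cd v = a[i + k + len / 2] * w;
                a[i + k] = u + v;
                a[i + k + len / 2] = u - v;
                w *= wlen;
            }
        }
    }
    if (inverse) {
        for (cd& x : a) x /= static_cast<double>(n);
    }
}

// 2D FFT of an n x n row-major grid: transform every row, then every column.
void fft2d(std::vector<cd>& g, std::size_t n, bool inverse) {
    assert(g.size() == n * n);
    std::vector<cd> line(n);
    for (std::size_t r = 0; r < n; ++r) {
        for (std::size_t c = 0; c < n; ++c) line[c] = g[r * n + c];
        fft1d(line, inverse);
        for (std::size_t c = 0; c < n; ++c) g[r * n + c] = line[c];
    }
    for (std::size_t c = 0; c < n; ++c) {
        for (std::size_t r = 0; r < n; ++r) line[r] = g[r * n + c];
        fft1d(line, inverse);
        for (std::size_t r = 0; r < n; ++r) g[r * n + c] = line[r];
    }
}

// Circular cross-correlation of an n x n image with a t x t template (t <= n).
// Returns (x, y) of the correlation peak, i.e. the estimated top-left corner.
std::pair<std::size_t, std::size_t> correlatePeak(
        const std::vector<double>& image, std::size_t n,
        const std::vector<double>& tmpl, std::size_t t) {
    assert(image.size() == n * n && tmpl.size() == t * t && t <= n);
    std::vector<cd> fi(image.begin(), image.end());
    std::vector<cd> ft(n * n, cd(0.0, 0.0));  // template zero-padded to image size
    for (std::size_t y = 0; y < t; ++y)
        for (std::size_t x = 0; x < t; ++x) ft[y * n + x] = tmpl[y * t + x];

    fft2d(fi, n, false);
    fft2d(ft, n, false);
    // Correlation theorem: multiply by the conjugate of the template spectrum.
    for (std::size_t i = 0; i < fi.size(); ++i) fi[i] *= std::conj(ft[i]);
    fft2d(fi, n, true);

    std::size_t best = 0;
    for (std::size_t i = 1; i < fi.size(); ++i)
        if (fi[i].real() > fi[best].real()) best = i;
    return {best % n, best / n};
}
```

Because the score is unnormalized, bright image regions can outscore the true match on real images; the correlation coefficient step mentioned in the comment above is what compensates for that.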

1 Answer


I think it depends very much on your GPU's processing power; some GPUs cannot perform better than CPUs. See this question: gpuvscpu

Samer
  • Actually, I'm testing the same code on two different GPUs, a GTX 660M and a GT 640, and I get the same execution time (more or less). – Cesare Mercurio Oct 20 '14 at 16:17