I am using the conv2 function in Armadillo with an image size of 224x224 and a mask size of 10x10. For a 3-channel image, I am doing something like:

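// image_channel and channel_mask stand for channel i of the image and of the mask.
arma::mat temp(215, 215, arma::fill::zeros);
for (int i = 0; i < 3; i++)
   // Rows/columns 9..223 of the 233x233 full result form the 215x215 valid region.
   temp += arma::mat(arma::conv2(image_channel, channel_mask)).submat(9, 9, 223, 223);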

I want only the valid part of the convolution, hence the submat. This code is executed in a loop 32 times with different masks. The 32 iterations take 2.37 seconds, which is far slower than Octave: Octave executes the same code in 0.25 seconds.

Both Octave and Armadillo are set up to use OpenBLAS, and I have defined the appropriate flags in the C++ file (e.g. ARMA_USE_BLAS). Can anybody tell me what the problem is here?
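
For reference, here is a minimal sketch of what "valid" convolution means in this setup, written in plain C++ without Armadillo. The `Matrix` struct and the `add_valid_conv2` function are hypothetical names made up for illustration; they are not part of Armadillo or Octave. The sketch computes only the (224 - 10 + 1) x (224 - 10 + 1) = 215x215 valid outputs and accumulates them directly, so no 233x233 full result is built and then cropped. Whether this is actually faster than conv2 plus submat would have to be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type (not an Armadillo class): dense row-major matrix.
struct Matrix {
    std::size_t rows, cols;
    std::vector<double> data;
    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
    double& at(std::size_t r, std::size_t c) { return data[r * cols + c]; }
    double at(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// Adds the "valid" 2-D convolution of image with mask into out.
// out must have size (image.rows - mask.rows + 1) x (image.cols - mask.cols + 1).
void add_valid_conv2(const Matrix& image, const Matrix& mask, Matrix& out) {
    assert(mask.rows <= image.rows && mask.cols <= image.cols);
    assert(out.rows == image.rows - mask.rows + 1);
    assert(out.cols == image.cols - mask.cols + 1);
    for (std::size_t r = 0; r < out.rows; ++r) {
        for (std::size_t c = 0; c < out.cols; ++c) {
            double sum = 0.0;
            for (std::size_t i = 0; i < mask.rows; ++i) {
                for (std::size_t j = 0; j < mask.cols; ++j) {
                    // Convolution (unlike correlation) flips the mask in both directions.
                    sum += image.at(r + i, c + j) *
                           mask.at(mask.rows - 1 - i, mask.cols - 1 - j);
                }
            }
            out.at(r, c) += sum;  // accumulate, so channels can be summed in place
        }
    }
}
```

With a 224x224 channel, a 10x10 mask and a 215x215 `out`, calling `add_valid_conv2` once per channel would play the role of the `temp += ...` line above.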

Shubham Gupta
  • Why are you tagging Octave at all in this question? Also, Octave does not use any BLAS to perform convolution. If you do change to use Octave and have performance issues, consider installing the image package and use `fftconv2`. – carandraug Apr 01 '16 at 14:57
  • I bet it would be faster if you did the `.submat` outside the loop. – Svaberg Aug 08 '16 at 14:04

0 Answers