1

As you know, the function Ptr<Filter> cv::cuda::createMedianFilter(int srcType, int windowSize, int partition=128) was added in OpenCV 3.1.0.

I'm trying to apply a median filter to large 8-bit images (6000*6000) with a custom window size (up to 21). I compared cv::medianBlur and cv::cuda::createMedianFilter, and the results were:

windowSize    cv::medianBlur    cv::cuda::createMedianFilter
    3             0.071 sec         3.637 sec
    5             0.285 sec         3.679 sec
    11            2.641 sec         3.652 sec
    19            2.566 sec         3.719 sec

1) Why is cuda::createMedianFilter slower than cv::medianBlur?

2) How can I write kernel code that implements a median filter on an OpenCV Mat with a custom kernel size?

talonmies
AmiR Hossein

2 Answers

1

The speed of the convolution operation depends mainly on the size of the filter kernel when the image size is constant. Because sorting is more expensive than summation, a median filter takes longer than a summation-based filter of the same size.

To go down to a low level and implement your own CUDA convolution function with a customized filter kernel, you need to get the raw pointer to your image data and pass it to a function such as

void MyConv(char* image, int width, int height, int stride)

and then write the CUDA code.

Here's a tutorial on CUDA convolution.

http://igm.univ-mlv.fr/~biri/Enseignement/MII2/Donnees/convolutionSeparable.pdf

This question also gives an example.

cuda convolution mapping
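
Before writing the CUDA version, it can help to have a plain CPU reference for what the kernel has to compute with a custom-shaped window. The following is only a sketch under stated assumptions: maskedMedian and diskMask are hypothetical helpers (they are not OpenCV functions), the image is 8-bit single-channel, and borders are handled by clamping coordinates. Each output pixel is computed independently of the others, which is what makes a one-thread-per-pixel CUDA mapping possible.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: build a ksize x ksize binary mask that is 1 inside a
// disk of radius ksize/2 (the inclusion rule dx*dx + dy*dy <= r*r is an assumption).
std::vector<std::uint8_t> diskMask(int ksize)
{
    assert(ksize > 0 && ksize % 2 == 1);
    const int r = ksize / 2;
    std::vector<std::uint8_t> mask(static_cast<std::size_t>(ksize) * ksize, 0);
    for (int dy = -r; dy <= r; ++dy)
        for (int dx = -r; dx <= r; ++dx)
            if (dx * dx + dy * dy <= r * r)
                mask[(dy + r) * ksize + (dx + r)] = 1;
    return mask;
}

// Hypothetical helper: median filter restricted to the pixels where mask != 0.
// `stride` is the number of bytes per image row; the output is tightly packed.
std::vector<std::uint8_t> maskedMedian(const std::uint8_t* image, int width, int height,
                                       int stride, const std::vector<std::uint8_t>& mask,
                                       int ksize)
{
    assert(ksize > 0 && ksize % 2 == 1);
    assert(static_cast<int>(mask.size()) == ksize * ksize);
    const int r = ksize / 2;
    std::vector<std::uint8_t> out(static_cast<std::size_t>(width) * height);
    std::vector<std::uint8_t> window;
    window.reserve(mask.size());

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            window.clear();
            // Gather only the neighbours selected by the mask.
            for (int dy = -r; dy <= r; ++dy) {
                for (int dx = -r; dx <= r; ++dx) {
                    if (!mask[(dy + r) * ksize + (dx + r)]) continue;
                    const int yy = std::min(std::max(y + dy, 0), height - 1);
                    const int xx = std::min(std::max(x + dx, 0), width - 1);
                    window.push_back(image[static_cast<std::size_t>(yy) * stride + xx]);
                }
            }
            assert(!window.empty());  // the mask must select at least one pixel
            // Partial selection is enough to find the median; no full sort is needed.
            auto mid = window.begin() + window.size() / 2;
            std::nth_element(window.begin(), mid, window.end());
            out[static_cast<std::size_t>(y) * width + x] = *mid;
        }
    }
    return out;
}
```

For a continuous 8-bit cv::Mat named img, a call would look like maskedMedian(img.data, img.cols, img.rows, static_cast<int>(img.step), diskMask(15), 15). A CUDA kernel could follow the same structure with the two outer loops replaced by the thread index, although a per-thread selection over a large window is unlikely to be the fastest approach.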

Community
kangshiyin
  • Thanks kangshiyin. In both OpenCV functions (medianBlur and createMedianFilter) we can set the kernel size. Can you answer the second question more clearly? – AmiR Hossein Jul 04 '16 at 02:25
  • @amirhossein do you mean "custom kernel" rather than "custom kernel size"? – kangshiyin Jul 04 '16 at 03:45
  • @kangshiyins "Custom kernel size" means the kernel radius, like 3, 5, 7, 9, 11, 13, 15. "Custom kernel" means, for example, a 5*5 kernel filled with '1' in a shape such as a top triangle, a bottom triangle, or even only the third row. For example, in a 5*5 kernel with only the third row filled with '1', only 5 of the 25 kernel elements are '1'. – AmiR Hossein Jul 04 '16 at 04:22
  • @amirhossein I know but which one do you want to know in your q2? You wrote "custom kernel size" but I thought it would be "custom kernel" – kangshiyin Jul 04 '16 at 04:25
  • @kangshiyins For a first version, the kernel size would be 15*15 with a circular kernel shape. In this case, 157 pixels are filled with '1'. – AmiR Hossein Jul 04 '16 at 04:38
  • @amirhossein I see. You may want to amend your question, it looks confusing. – kangshiyin Jul 04 '16 at 04:44
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/116333/discussion-between-amir-hossein-and-kangshiyin). – AmiR Hossein Jul 04 '16 at 04:54
  • The median filter **does not** perform a convolution. It is a non-linear filter where the median of an array is computed. So instead of convolving, the complexity relies on sorting the data. – Michael Gruner Mar 24 '21 at 23:11
1

I also used cuda::createMedianFilter() and found that two GpuMat objects are newly allocated in MedianFilter::apply() every time filter->apply() is called. GPU memory allocation is very time-consuming, so I moved the two Mats into the MedianFilter class as member variables (they are not allocated again unless the image size changes).

This gave a 4x speedup when tested with 1000 images (400 * 300). Also, it seems that the partition parameter can be set to src.rows / 2, which is faster than the default value of 128.

The two Mats in the source code are GpuMat devHist and GpuMat devCoarseHist.
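
The idea of keeping the scratch buffers as members can be sketched as follows. This is not the OpenCV source: ScratchBuffer is a hypothetical stand-in for cv::cuda::GpuMat (backed by std::vector so the sketch compiles without CUDA), CachedMedianFilter is a hypothetical class, and the buffer sizes are placeholders rather than the sizes OpenCV actually uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a device buffer such as cv::cuda::GpuMat.
struct ScratchBuffer {
    std::vector<std::int32_t> data;
    int allocations = 0;  // counts how often the (expensive) allocation happened

    void allocate(std::size_t n) {
        data.assign(n, 0);
        ++allocations;
    }
    // Resetting the contents is cheap compared with allocating.
    void clear() { std::fill(data.begin(), data.end(), 0); }
};

// Hypothetical filter class that owns its scratch buffers as members.
class CachedMedianFilter {
public:
    void apply(int rows, int cols) {
        assert(rows > 0 && cols > 0);
        // Reallocate only when the image size differs from the previous call.
        if (rows != rows_ || cols != cols_) {
            hist_.allocate(static_cast<std::size_t>(cols) * 256);      // placeholder size
            coarseHist_.allocate(static_cast<std::size_t>(cols) * 8);  // placeholder size
            rows_ = rows;
            cols_ = cols;
        }
        hist_.clear();
        coarseHist_.clear();
        // ... the median filter kernel would be launched here ...
    }

    int allocationCount() const { return hist_.allocations + coarseHist_.allocations; }

private:
    ScratchBuffer hist_;        // plays the role of devHist
    ScratchBuffer coarseHist_;  // plays the role of devCoarseHist
    int rows_ = 0;
    int cols_ = 0;
};
```

Calling apply() repeatedly with images of the same size allocates the two buffers once; only a change in image size triggers a new allocation.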

wykvictor