I have written an image processing application using Visual C++ forms and OpenCV on a Windows machine. Everything seems to work OK, but displaying the images is very slow: only a few fps. I would like to get to 30 or so. I am currently using the standard imshow(...) followed by waitKey(1).

My question is: is there a better (i.e. faster) way to get an image from memory to the monitor? The Mat structure used by OpenCV is essentially a fancy header pointing to a contiguous block of unsigned char values.

Edit: I tested my code with the VS2013 profiler and it claims that I am spending 50% of the execution time in imshow/waitKey.

I've seen several discussions on this in the OpenCV Q/A forum and they always end with "you shouldn't be using imshow except for debugging" but nobody is suggesting anything else to use, so I thought I'd try here.

guy

Guy Garty
  • How did you reach the conclusion that `imshow` and `waitKey` are the bottleneck? Did you profile your code? Or at least time it? Are you testing it in release build? – Dan Mašek Apr 05 '16 at 22:45
  • Running [this little test](http://pastebin.com/PFPgzQn0) gives me about 55 FPS with a 1600x1200 image on an i7-4930K with a decent nVidia card. If you want to draw the image yourself, the fastest way is BitBlt in the OnPaint handler. I don't use Windows Forms, but I've [got a snippet](http://pastebin.com/GkDQD3AV) I use with WTL, which is quite close to the Windows API. Shouldn't be too hard to modify it to suit your purpose. – Dan Mašek Apr 05 '16 at 23:20
  • Release mode helps a bit vs. debug (should have thought of that myself). – Guy Garty Apr 06 '16 at 19:18
  • It might be the processing algorithm itself where it spends most of the time. What size are the images? What does the algorithm look like? – Dan Mašek Apr 06 '16 at 19:26
  • Images are 5.5 MP, 16-bit, coming from an sCMOS camera. – Guy Garty Apr 06 '16 at 19:28
  • The processing is scaling them down to 8 bit for display. The VS profiler claims that the display part is ~50%, the processing ~30% and the image acquisition ~15% (but that depends on exposure). So at best I could probably get a 2x gain. The thing is that the software that comes with the camera supports video rates in "live view mode", which I am trying to replicate, so I know a solution exists; I just can't get to it. – Guy Garty Apr 06 '16 at 19:44
  • Do you resize the image somehow? Is the window autosized, or is the size different than the image size? Can you put your code in the question, or maybe on github, or pastebin? – Dan Mašek Apr 06 '16 at 20:11

1 Answer

Without seeing what you have, here is the approach I would take to achieve what you want.

  1. Have a dedicated thread for frame acquisition from the camera. Insert the acquired frames into a synchronized queue, that is consumed by:

  2. Image processing thread. Takes frames from the queue and processes them into images suitable for display. It updates a synchronized output image and notifies the GUI about it.

  3. Main (GUI) thread is only dedicated to display. When it is notified of an image update, it swaps the synchronized output image with its current working image. (To avoid copying and extra allocations, we just reuse those two image buffers.) Then it invalidates the window. In a WM_PAINT handler, it then displays the image using BitBlt.
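
The hand-off between steps 2 and 3 could be sketched roughly as below. This is only an illustration under assumptions: `Frame` and `SharedOutput` are hypothetical names, `Frame` stands in for an 8-bit grayscale image (for example a `cv::Mat` prepared for display), and the processing thread here keeps a third buffer of its own and swaps it in, rather than writing into the shared image directly. The actual GUI notification (e.g. posting a window message) and the BitBlt call are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a display-ready 8-bit grayscale image.
struct Frame {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> pixels;
};

// Shared slot between the processing thread and the GUI thread.
// Buffers are exchanged by swapping, so no pixel data is copied
// while the lock is held.
class SharedOutput {
public:
    // Processing thread: swap a finished frame into the shared slot.
    // Afterwards `finished` holds the previous slot contents, whose
    // storage can be reused for the next frame.
    void publish(Frame& finished) {
        // Sanity check: one byte per pixel (assumed format).
        assert(finished.pixels.size() ==
               static_cast<std::size_t>(finished.width) *
                   static_cast<std::size_t>(finished.height));
        std::lock_guard<std::mutex> lock(mutex_);
        std::swap(slot_, finished);
        fresh_ = true;
    }

    // GUI thread: if a new frame was published since the last call,
    // swap it into `working` and return true; otherwise leave
    // `working` untouched and return false.
    bool fetch(Frame& working) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!fresh_) {
            return false;
        }
        std::swap(slot_, working);
        fresh_ = false;
        return true;
    }

private:
    std::mutex mutex_;
    Frame slot_;
    bool fresh_ = false;
};
```

In this sketch the GUI thread would call `fetch` when notified, invalidate the window if it returns true, and paint from its working frame in the WM_PAINT handler.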

Some notes:

  • Minimize allocation/deallocation of buffers. For acquisition, you could have a pre-allocated pool of buffers to cycle through.
  • Prepare the output images in format and size that suit display.
  • Keep track of the number of frames in the queue and set some upper limit. Define an algorithm for dropping excess frames, so that you don't run out of memory and don't lag too much.
  • If you just want to ditch the sleep in waitKey and want something simpler, have a look at this question
  • Instrument your code: add timing of the crucial parts using a high-resolution timer. Log the timings and/or keep statistics and history.
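
Two of the notes above (a bounded queue with a drop policy, and timing) might look something like the following. It is a minimal sketch, not a definitive implementation: `BoundedFrameQueue`, `RawFrame` and `time_ms` are hypothetical names, "drop the oldest frame" is just one possible policy, and `RawFrame` is a plain vector standing in for a 16-bit camera frame. A real version would probably return dropped buffers to a pre-allocated pool instead of freeing them.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a raw 16-bit camera frame.
using RawFrame = std::vector<std::uint16_t>;

// Synchronized queue with an upper limit on queued frames.
class BoundedFrameQueue {
public:
    explicit BoundedFrameQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity_ > 0);
    }

    // Acquisition thread: enqueue a frame. If the queue is full,
    // the oldest frame is discarded so display latency stays bounded.
    void push(RawFrame frame) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (queue_.size() == capacity_) {
                queue_.pop_front();
                ++dropped_;
            }
            queue_.push_back(std::move(frame));
        }
        ready_.notify_one();
    }

    // Processing thread: block until a frame is available, then take it.
    RawFrame pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty(); });
        RawFrame frame = std::move(queue_.front());
        queue_.pop_front();
        return frame;
    }

    // Number of frames discarded so far (useful for statistics).
    std::size_t dropped() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return dropped_;
    }

private:
    const std::size_t capacity_;
    mutable std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<RawFrame> queue_;
    std::size_t dropped_ = 0;
};

// Runs `fn` once and returns the elapsed wall time in milliseconds,
// measured with a monotonic clock.
template <typename Fn>
double time_ms(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(fn)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Wrapping acquisition, processing and display in `time_ms` separately and logging the results, together with `dropped()`, should show which stage actually limits the frame rate.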
Dan Mašek
  • I have a camera API which requires a callback function. The callback has the format void* callback(uint16_t*), so the memory allocation happens on the manufacturer's side. How can I implement a buffer of my own to work with? Would a memcpy to another buffer waste CPU time? – Harsh M May 16 '23 at 12:34
  • @HarshM You should ask a new question about that, and provide details about the camera (brand and model) and the API you're using. – Dan Mašek May 16 '23 at 14:09
  • https://stackoverflow.com/questions/76265414/how-to-get-fast-frame-rate-from-a-camera-with-a-callback-api – Harsh M May 16 '23 at 16:47