
I have a WPF application that acquires images from a camera, processes these images, and displays them. The processing part has become burdensome for the CPU, so I've looked at moving this processing to the GPU and running custom CUDA kernels against them. The basic process is as follows:

1) acquire image from camera
2) load image onto GPU
3) call CUDA kernel to process image
4) display processed image

What I'm trying to figure out is a WPF-to-CUDA-to-display-control strategy. It seems natural that once the image is loaded onto the GPU, it should not have to be copied back to the host in order to be displayed. I've read that this can be done with OpenGL, but do I really need to learn OpenGL and include it in my project in order to do a fast display of a CUDA-processed image?

I understand (I think) the issues of calling CUDA kernels from C#. My plan is to either build an unmanaged library around my CUDA calls, which I later wrap for C# -- OR -- try to decide on which one of the managed wrappers (managedCUDA, Cudafy, etc.) to try. I worry about using one of the prebuilt wrappers because they all appear to be lightly supported...but maybe I have the wrong impression.
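The first option (an unmanaged library with a flat C interface that C# can call) might look roughly like the sketch below. Everything here is an assumption made for illustration: the function name `imgproc_invert_gray8`, its parameters, and the invert operation are hypothetical, and a plain CPU loop stands in for the place where the CUDA copy and kernel launch would go.

```cpp
#include <cstddef>
#include <cstdint>

// Export with C linkage so the symbol name is not mangled and can be
// bound from C# via P/Invoke. (Macro name is hypothetical.)
#if defined(_WIN32)
#define IMGPROC_API extern "C" __declspec(dllexport)
#else
#define IMGPROC_API extern "C"
#endif

// Hypothetical entry point: inverts an 8-bit grayscale image in place.
// Returns 0 on success and -1 on invalid arguments.
// ASSUMPTION: in a real build this body would copy the buffer to the
// device, launch a CUDA kernel, and copy the result back. The CPU loop
// below is only a placeholder so the interface can be exercised.
IMGPROC_API int imgproc_invert_gray8(std::uint8_t* pixels,
                                     int width,
                                     int height,
                                     int strideBytes)
{
    if (pixels == nullptr || width <= 0 || height <= 0 || strideBytes < width) {
        return -1;
    }
    for (int y = 0; y < height; ++y) {
        // Rows may be padded, so step by the stride rather than the width.
        std::uint8_t* row = pixels + static_cast<std::size_t>(y) * strideBytes;
        for (int x = 0; x < width; ++x) {
            row[x] = static_cast<std::uint8_t>(255 - row[x]);
        }
    }
    return 0;
}
```

On the C# side this would be bound with a `DllImport` declaration. Keeping the interface to plain pointers and integers means the managed code never has to know anything about CUDA.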

Anyway, I'm feeling a bit overwhelmed after days of researching the possible options. Any advice would be greatly appreciated.

Bryan Greenway

2 Answers


The process of taking a result of CUDA computation and using it directly on the device for a graphics activity is called "interop". There is OpenGL "interop" and there is DirectX "interop". There are plenty of CUDA sample codes demonstrating how to interact with computed images.

To go directly from computed data on the device, to display, without a trip to the host, you will need to use one of these 2 APIs (OpenGL or DirectX).

You mentioned two of the managed interfaces I've heard of, so it seems like you're aware of the options there.

If the processing time is significant compared to (much larger than) the time taken to transfer the image from host to device, you might consider starting out by just transferring the image from host to device, processing it, and then transferring it back, where you can then use the same plumbing you have been using to display it. You can then decide if the additional effort for interop is worth it.

If you can profile your code to figure out how long the image processing takes on the host, and then prototype something on the device to find out how much faster it is, that will be instructive.
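A minimal sketch of the host-side half of that measurement, using `std::chrono`. The `blur_rows` function is only a placeholder workload (an assumption, not the asker's actual processing), and `mean_ms` is a hypothetical helper name.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Placeholder workload (assumption): 3-tap horizontal box blur on a
// tightly packed 8-bit grayscale image, with edges clamped.
void blur_rows(const std::vector<std::uint8_t>& src,
               std::vector<std::uint8_t>& dst,
               int width,
               int height)
{
    for (int y = 0; y < height; ++y) {
        const std::size_t base = static_cast<std::size_t>(y) * width;
        for (int x = 0; x < width; ++x) {
            const int left = (x > 0) ? x - 1 : x;
            const int right = (x < width - 1) ? x + 1 : x;
            const int sum = src[base + left] + src[base + x] + src[base + right];
            dst[base + x] = static_cast<std::uint8_t>(sum / 3);
        }
    }
}

// Runs `work` once as a warm-up, then returns the mean wall-clock time
// per call in milliseconds over `iterations` timed runs.
template <typename F>
double mean_ms(F&& work, int iterations)
{
    assert(iterations > 0);
    work();  // warm-up run, not timed
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / iterations;
}
```

Calling `mean_ms([&] { blur_rows(src, dst, width, height); }, 100)` gives a per-frame host figure. On the device side, CUDA events (`cudaEventRecord`, `cudaEventElapsedTime`) are the usual way to get the matching numbers for the copies and the kernel.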

You may find that the processing time on the host is so long that you still benefit even with the double-copy arrangement. Or you may find the processing time is so short on the host (compared to just the cost to transfer to the device) that the CUDA acceleration would not be useful.
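Once the timings are measured, the comparison itself is simple arithmetic. The sketch below just makes it explicit; the struct and function names are hypothetical, and all four inputs are assumed to be per-frame times in milliseconds that you measured yourself.

```cpp
#include <cassert>

// Measured per-frame timings in milliseconds (all fields are inputs
// you supply from your own profiling; nothing here is measured for you).
struct FrameTimings {
    double hostProcessMs;  // processing entirely on the CPU
    double uploadMs;       // host-to-device copy
    double kernelMs;       // CUDA kernel execution
    double downloadMs;     // device-to-host copy
};

// Total cost of the "double-copy" path: upload, process, download.
double gpu_round_trip_ms(const FrameTimings& t)
{
    return t.uploadMs + t.kernelMs + t.downloadMs;
}

// Ratio of CPU time to GPU round-trip time; above 1.0 the GPU path wins.
double speedup(const FrameTimings& t)
{
    const double gpuMs = gpu_round_trip_ms(t);
    assert(gpuMs > 0.0);
    return t.hostProcessMs / gpuMs;
}

// True when the GPU path is faster even after paying for both copies.
bool double_copy_pays_off(const FrameTimings& t)
{
    return gpu_round_trip_ms(t) < t.hostProcessMs;
}
```

If `double_copy_pays_off` is already true, interop becomes an optimization that removes `downloadMs` rather than a prerequisite. If it is false because the copies dominate, interop can only recover the download half of that cost.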

Robert Crovella
    Thanks! You've educated me regarding interop. Now just need to decide on a way to get to DirectX from WPF. Looks like another open source toolset decision. These things make me nervous. Candidates appear to be SlimDX and SharpDX. – Bryan Greenway Mar 06 '14 at 23:00

WPF has a control named D3DImage to directly show DirectX content on screen, and in the managedCuda samples package you can find a version of the original fluids sample from the Cuda Toolkit that uses it (together with SlimDX). You don’t have to use managedCuda to use Cuda from C#, but you can look at it to see how things can be done: managedCuda samples

kunzmi
  • good advice from both you and Robert Crovella. I started looking at DirectX (thinking that this would have better integration with Windows compared to OpenGL) and find that there is another decision to make regarding how to get WPF and DirectX working together. You mentioned SlimDX. I also found SharpDX (a spinoff from SlimDX). Both seem to have moderate to low activity. Just very frustrating trying to make a wise development decision...where to invest my time and effort. Ugh!!!! – Bryan Greenway Mar 06 '14 at 22:58
  • As managedCuda is my project, I can tell you that a wrapper library simply can’t have much more activity than the wrapped API itself. Once the code is written there’s not much more to do if you want to keep things simple. The Cuda core library is so small that you can easily build your own wrapper from the existing code. DirectX, by contrast, is a huge API, but it’s also quite old: DirectX 9 is more than 10 years old and the wrapper code is written. There’s nothing more to add to SlimDX/SharpDX, which is why I wouldn’t worry about the activity of these projects: they are both mature and used in many projects. – kunzmi Mar 07 '14 at 00:21
  • I appreciate the wisdom of your comments and I hope that you took no offense at the naivety of mine. I have in fact started following up on your suggestions related to the D3DImage control. To me, it seems like such overkill to adopt Direct3D when all I need is to display a CUDA-manipulated image. From your suggestion, I have been able to drop a D3DImage control on a WPF form (albeit I haven't done anything with it yet). If that turns out to be enough to display my images and act as an interop "receiver" for me, then maybe managedCUDA will get me the rest of the way? – Bryan Greenway Mar 07 '14 at 04:53