I have a general question about how to design my application. I have read the CUDA documentation, but I still don't know what I should look into. I would really appreciate it if someone could shed some light on it.
I want to do some real-time analytics on stocks, say 100 of them. I have a real-time market data feed that streams updated market prices. What I want to do is:
1. Pre-allocate a memory block for each stock on the CUDA card, and keep that memory allocated for the whole trading day.
2. When new data comes in, directly update the corresponding memory on the CUDA card.
3. After the update, issue a signal or trigger an event to start the analytical calculation.
4. When the calculation is done, write the result back to CPU memory.
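To make the steps above concrete, here is a minimal sketch of what I have in mind. Everything specific in it is a placeholder I made up for illustration: the names `Feed`, `feed_init`, `on_tick` and `mean_kernel`, the 64-price window per stock, and the mean standing in for the real analytics. It assumes pinned host memory and a single stream are a reasonable way to do this, which is part of what I am unsure about.

```cuda
#include <cuda_runtime.h>

#define NUM_STOCKS 100  // from the description above
#define HISTORY    64   // assumed: prices kept per stock (ring buffer)

// Placeholder analytic: one thread per stock averages that stock's window.
__global__ void mean_kernel(const float* prices, float* result)
{
    int s = blockIdx.x * blockDim.x + threadIdx.x;
    if (s >= NUM_STOCKS) return;
    float sum = 0.0f;
    for (int i = 0; i < HISTORY; ++i)
        sum += prices[s * HISTORY + i];
    result[s] = sum / HISTORY;
}

struct Feed {
    float*       d_prices;            // device: NUM_STOCKS x HISTORY, kept all day
    float*       d_result;            // device: one value per stock
    float*       h_tick;              // pinned host staging slot for one price
    float*       h_result;            // pinned host copy of the results
    int          cursor[NUM_STOCKS];  // next ring-buffer slot per stock
    cudaStream_t stream;
};

// Step 1: allocate everything once. Returns 0 on success, -1 on failure.
int feed_init(Feed* f)
{
    size_t price_bytes  = NUM_STOCKS * HISTORY * sizeof(float);
    size_t result_bytes = NUM_STOCKS * sizeof(float);
    if (cudaMalloc((void**)&f->d_prices, price_bytes) != cudaSuccess) return -1;
    if (cudaMalloc((void**)&f->d_result, result_bytes) != cudaSuccess) return -1;
    if (cudaHostAlloc((void**)&f->h_tick, sizeof(float),
                      cudaHostAllocDefault) != cudaSuccess) return -1;
    if (cudaHostAlloc((void**)&f->h_result, result_bytes,
                      cudaHostAllocDefault) != cudaSuccess) return -1;
    if (cudaStreamCreate(&f->stream) != cudaSuccess) return -1;
    if (cudaMemset(f->d_prices, 0, price_bytes) != cudaSuccess) return -1;
    for (int s = 0; s < NUM_STOCKS; ++s) f->cursor[s] = 0;
    return 0;
}

// Steps 2 to 4 for one incoming price. Returns 0 on success, -1 on failure.
int on_tick(Feed* f, int stock, float price, float* mean_out)
{
    if (stock < 0 || stock >= NUM_STOCKS) return -1;
    int slot = f->cursor[stock];
    f->cursor[stock] = (slot + 1) % HISTORY;
    *f->h_tick = price;

    // Step 2: copy only the one changed price, not a full snapshot.
    float* dst = f->d_prices + stock * HISTORY + slot;
    if (cudaMemcpyAsync(dst, f->h_tick, sizeof(float),
                        cudaMemcpyHostToDevice, f->stream) != cudaSuccess) return -1;

    // Step 3: the "trigger" here is just a kernel launch queued behind the copy.
    int threads = 128;
    int blocks  = (NUM_STOCKS + threads - 1) / threads;
    mean_kernel<<<blocks, threads, 0, f->stream>>>(f->d_prices, f->d_result);
    if (cudaGetLastError() != cudaSuccess) return -1;

    // Step 4: bring the results back and wait for the stream to drain.
    if (cudaMemcpyAsync(f->h_result, f->d_result, NUM_STOCKS * sizeof(float),
                        cudaMemcpyDeviceToHost, f->stream) != cudaSuccess) return -1;
    if (cudaStreamSynchronize(f->stream) != cudaSuccess) return -1;
    *mean_out = f->h_result[stock];
    return 0;
}
```

The idea would be to call `feed_init` once at startup and `on_tick` from the market data handler for every update. Note that the synchronize at the end blocks the CPU on every tick, which may well be the wrong thing for a real-time feed.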
Here are my questions:
1. What is the most efficient way to stream data from CPU memory to GPU memory? I need this in real time, so copying a snapshot of the memory from the CPU to the GPU every second is not acceptable.
2. I may need to allocate memory blocks for 100 stocks on both the CPU and the GPU. How do I map each CPU memory cell to the corresponding GPU memory cell?
3. How do I trigger the analytics calculation when new data arrives on the CUDA card?
I am using a Tesla C1060 with CUDA 3.2 on Windows XP.
Thank you very much for any suggestions.