There is a large amount of data waiting to be processed with a machine learning algorithm on the CUDA device. However, I have some concerns about the device's memory, so I am trying to use float values instead of double (I assume this is a reasonable approach unless someone suggests something better). Is there any way to keep double precision in results computed from float inputs? I guess not, and maybe this is a somewhat silly question. So what is the correct way to handle a huge data set on the device?
I read this through five times and I still don't understand what you are asking. OK, you have a lot of floating point data and you have demoted it to single precision to save memory on the GPU. But then what are you trying to ask? – talonmies Jun 21 '13 at 12:43
2 Answers
No, there's no way to keep double precision in the results if you process the data as float. Handle it as double. If memory size is a problem, the usual approach is to handle the data in chunks: copy a chunk to the GPU, start the GPU processing, and while the processing is going on, copy more data to the GPU and copy some of the results back. This is a standard approach to handling problems that "don't fit" in the GPU memory size.
This is called overlapping copy and compute, and you use CUDA streams to accomplish it. The CUDA samples (such as the simple multi-copy and compute example) include a variety of codes that demonstrate how to use streams.
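As a rough illustration, here is a minimal sketch of that chunked, double-buffered pattern. The kernel `process`, the chunk size, and the data sizes are placeholders I made up for the example, not anything from your code, and error checking is omitted for brevity. The key ingredients are pinned host memory (required for asynchronous copies) and issuing each chunk's copy-in, kernel, and copy-out into alternating streams so transfers for one chunk can overlap computation on another.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Stand-in for the real ML computation on one chunk of data.
__global__ void process(const float* in, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = in[i] * 2.0f;
}

int main()
{
    const int totalN   = 1 << 24;          // total elements (assumed too big to process at once)
    const int chunkN   = 1 << 20;          // elements per chunk
    const int nChunks  = totalN / chunkN;
    const int nStreams = 2;                // two streams -> double buffering

    // Pinned host buffers are needed for cudaMemcpyAsync to actually be asynchronous.
    float *h_in, *h_out;
    cudaMallocHost(&h_in,  totalN * sizeof(float));
    cudaMallocHost(&h_out, totalN * sizeof(float));
    for (int i = 0; i < totalN; ++i) h_in[i] = (float)i;

    // One set of device buffers per stream, so chunks in flight don't collide.
    float *d_in[nStreams], *d_out[nStreams];
    cudaStream_t stream[nStreams];
    for (int s = 0; s < nStreams; ++s) {
        cudaMalloc(&d_in[s],  chunkN * sizeof(float));
        cudaMalloc(&d_out[s], chunkN * sizeof(float));
        cudaStreamCreate(&stream[s]);
    }

    // Issue copy-in, kernel, copy-out for each chunk into alternating streams.
    for (int c = 0; c < nChunks; ++c) {
        int s = c % nStreams;
        size_t offset = (size_t)c * chunkN;
        cudaMemcpyAsync(d_in[s], h_in + offset, chunkN * sizeof(float),
                        cudaMemcpyHostToDevice, stream[s]);
        process<<<(chunkN + 255) / 256, 256, 0, stream[s]>>>(d_in[s], d_out[s], chunkN);
        cudaMemcpyAsync(h_out + offset, d_out[s], chunkN * sizeof(float),
                        cudaMemcpyDeviceToHost, stream[s]);
    }
    cudaDeviceSynchronize();

    printf("h_out[100] = %f\n", h_out[100]);

    for (int s = 0; s < nStreams; ++s) {
        cudaFree(d_in[s]);
        cudaFree(d_out[s]);
        cudaStreamDestroy(stream[s]);
    }
    cudaFreeHost(h_in);
    cudaFreeHost(h_out);
    return 0;
}
```

With only two streams the scheme keeps one chunk transferring while another computes; more streams or chunks can be used, but the memory cost per stream is one pair of chunk-sized device buffers.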

You can indeed compute double precision results from floating point data. At any point in your calculation you can cast a float value to double, and according to standard C type-promotion rules, from then on all calculations involving that value will be performed in double precision. This holds as long as you use double precision variables to store the results and don't cast them back to a narrower type. Beware of implicit casts in function calls.
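A short sketch of what this looks like in practice: the input stays in float to save device memory, but each value is explicitly cast to double before the arithmetic, so the computation and the stored result are double precision. The kernel name, the scale factor, and the sizes below are illustrative only, and error checking is omitted.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Input is float (half the memory), but the arithmetic and output are double.
__global__ void scale_to_double(const float* in, double* out, int n, double factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = (double)in[i] * factor;   // the cast promotes the whole expression to double
}

int main()
{
    const int n = 1024;
    float  h_in[n];
    double h_out[n];
    for (int i = 0; i < n; ++i) h_in[i] = 0.1f * i;

    float  *d_in;
    double *d_out;
    cudaMalloc(&d_in,  n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(double));
    cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);

    scale_to_double<<<(n + 255) / 256, 256>>>(d_in, d_out, n, 3.0);

    cudaMemcpy(h_out, d_out, n * sizeof(double), cudaMemcpyDeviceToHost);
    printf("h_out[10] = %.15f\n", h_out[10]);

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

Note that the result can only be as accurate as the float inputs allow; the promotion avoids losing further precision during the computation, but it does not recover precision already lost when the data was stored as float.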
