
I am trying to enhance a small C++ project with CUDA. My project uses a custom library's classes and functions, for example `Matrix3d`, `Vector3d`, `Plane2d`, etc. They are mostly geometric objects.

When I try to use my code on the device (either in `__host__ __device__` functions or in a kernel), all the library functions/objects are treated as host code and I get multiple warnings and errors, for example: `error: identifier "Plane3d::~Plane3d" is undefined in device code`

Is there a way to use my library on the device as well? How is it done? I don't have experience with CUDA and C++ (I have only used CUDA with simple C code, without classes), so I don't understand the strategy very well.

Is there a way to avoid changing the library's source code? It is possible to change it, but I would rather avoid that.

Thanks a lot.

George Aprilis
  • If the library is compiled for CPU usage, then there is no way to use it without making some changes to the source code of the library and rebuilding the library. A library represents code in compiled object form. The compiled object consists of CPU machine code. The GPU cannot use or understand CPU machine code. If you recode the library with appropriate `__device__ __host__` decorations and perhaps other changes, and pass it through nvcc, that would be the starting point to being able to use it on the device. If the library is common, someone else may have done this already. – Robert Crovella Dec 04 '12 at 16:21
  • No the library is internal and I can change it and compile it. It's just that I will have to add the `__device__ __host__` keyword almost everywhere. – George Aprilis Dec 04 '12 at 16:28

2 Answers


There is no particular problem with using C++ classes in CUDA. The object model is only slightly different from standard C++.

Any structure or class data members are automatically defined in whichever memory space (host or device) the class or structure is instantiated in. What is not automatic is the code generation for member functions and operators within classes and structures. The programmer must explicitly define and compile those for whichever memory space the object will be instantiated in. This means you must have both `__device__` and `__host__` definitions of each function you call on the object, including the constructor and destructor; the latter is the error you show in your question.

talonmies
  • So this means that I can't avoid adding extra definitions to every library class? – George Aprilis Dec 05 '12 at 10:49
  • @George Aprilis: Basically yes. If the structures/classes have member functions, there has to be an explicitly defined GPU implementation of those member functions in order for the code to compile for the GPU. – talonmies Dec 05 '12 at 12:11
  • I will try to change the library the way it is described [here](http://stackoverflow.com/a/6978720/1608616) then. Thank you for helping. – George Aprilis Dec 05 '12 at 12:33

You don't need to change the source code - what you need is to write an adapter.

CUDA kernels work with low-level structures, e.g. `double*`, `double**`, `double***` or `float*`, `float**`, `float***`, as well as with the built-in CUDA types.

CUDA cannot work directly on memory allocated outside CUDA anyway (only on memory allocated on the graphics card, not regular RAM), so you will have to copy your data into graphics memory.

If you provide methods that expose the buffers used by your types, you can copy them to the graphics card using the CUDA memory-copy functions (in one pass if your types are laid out contiguously, or in chunks if not), and then process them with kernels as `double***` using simple indexing.

Danny Varod
  • CUDA kernels also work with C++ and objects. If you want to have an object representation that is usable on the device, you will need to provide an object definition of some sort (i.e. not just double/float/built-in types). You could draw a distinction between the host and the device, but this doesn't obviate the need to have a suitably decorated device representation of the objects *somewhere*. You could write an adapter instead of re-factoring the library, but you'll be doing duplication that way. If you don't want to use objects on the device, then low level data types will do. – Robert Crovella Dec 04 '12 at 16:45
  • The thing is that because they are basic geometric elements many of my classes use them as components, and I also use various functions (for example there are functions like `Matrix3x3::rotate()` etc). They are not only data. Or maybe I didn't understand well what you mean. – George Aprilis Dec 04 '12 at 16:47
  • @GeorgeAprilis You are correct. If you want to have access to object methods, simply copying data does not work. You need an object representation that has been compiled to use on the device. – Robert Crovella Dec 04 '12 at 17:00
  • @RobertCrovella I'd delete my answer, but it would be a shame to lose your comments. – Danny Varod Dec 04 '12 at 18:18