I am trying to use the CUDA Runtime API from Go via cgo on Windows. I have been at this for a few days and am stuck: the build fails with an undefined reference to my kernel wrapper.

I have separated my kernel and its wrapper into the following files:

FILE: cGo.cuh

typedef unsigned long int ktype;
typedef unsigned char glob;

/*
function Prototypes
*/

extern "C" void kernel_kValid(int, int, ktype *, glob *);

__global__ void kValid(ktype *, glob *);

FILE: cGo.cu

#include "cGo.cuh"
#include "device_launch_parameters.h"
#include "cuda.h"
#include "cuda_runtime.h"

//function Definitions

/*
kernel_kValid is a wrapper function for the CUDA Kernel to be called from Go
*/
extern "C" void kernel_kValid(int blocks, int threads, ktype *kInfo, glob *values) {
    kValid<<<blocks, threads>>>(kInfo, values);//execute the kernel
}


/*
kValid is the CUDA Kernel which is to be executed
*/
__global__ void kValid(ktype *kInfo, glob *values) {
    //lots of code
}

I compile my CUDA source code into a shared library like so:

nvcc -shared -o myLib.so cGo.cu

Then I created a header file to include from cgo:

FILE: cGo.h

typedef unsigned long int ktype;
typedef unsigned char glob;

/*
function Declarations
*/

void kernel_kValid(int , int , ktype *, glob *);

Then, in the Go package, I use cgo to call my kernel wrapper:

package cuda
/*
#cgo LDFLAGS: -LC:/Storage/Cuda/lib/x64 -lcudart //this is the Cuda library
#cgo LDFLAGS: -L${SRCDIR}/lib -lmyLib //this is my shared library
#cgo CPPFLAGS: -IC:/Storage/Cuda/include //this contains cuda headers
#cgo CPPFLAGS: -I${SRCDIR}/include //this contains cGo.h

#include <cuda_runtime.h>
#include <stdlib.h>
#include "cGo.h"
*/
import "C"

func useKernel() {
	//other code
	C.kernel_kValid(C.int(B), C.int(T), unsafe.Pointer(storageDevice), unsafe.Pointer(globDevice))
	cudaErr, err = C.cudaDeviceSynchronize()
	//rest of the code
}

None of the calls to the CUDA runtime API produce errors; only my kernel wrapper does. This is the output when I build the cuda package with go:

C:\Users\user\Documents\Repos\go\cuda_wrapper>go build cuda_wrapper\cuda
# cuda_wrapper/cuda
In file included from C:/Storage/Cuda/include/host_defines.h:50:0,
                 from C:/Storage/Cuda/include/device_types.h:53,
                 from C:/Storage/Cuda/include/builtin_types.h:56,
                 from C:/Storage/Cuda/include/cuda_runtime.h:86,
                 from C:\Go\workspace\src\cuda_wrapper\cuda\cuda.go:12:
C:/Storage/Cuda/include/crt/host_defines.h:84:0: warning: "__cdecl" redefined
 #define __cdecl

<built-in>: note: this is the location of the previous definition
# cuda_wrapper/cuda
C:\Users\user\AppData\Local\Temp\go-build038297194\cuda_wrapper\cuda\_obj\cuda.cgo2.o: In function `_cgo_440ebb0a3e25_Cfunc_kernel_kValid':
/tmp/go-build\cuda_wrapper\cuda\_obj/cgo-gcc-prolog:306: undefined reference to `kernel_kValid'
collect2.exe: error: ld returned 1 exit status

This is where I am not sure what is wrong. I have looked at other questions about undefined references with cgo, but nothing I found has solved my issue. I have also looked into whether the CUDA runtime API being written in C++ affects how cgo compiles this, but again I haven't found anything conclusive. At this point I think I have confused myself more than anything else, so I'm hoping someone more knowledgeable can point me in the right direction.

  • This looks like a symbol mangling problem. The CUDA code is compiled with C++ linkage by default. What does golang expect? – talonmies Mar 01 '18 at 06:58
  • Could you elaborate on what you mean by a symbol mangling problem? To answer your question, though: if I understand [the docs](https://golang.org/cmd/cgo/) correctly, then cgo expects either C or C++ depending on the included files and my flags. I have updated my post to reflect using C++ flags, but I get the same error. Unfortunately cgo can't directly compile CUDA source files, so I am trying to implement this [solution on here](https://stackoverflow.com/questions/32589153/how-to-compile-cuda-source-with-go-languages-cgo). – SilverEnsign99 Mar 01 '18 at 08:38
  • http://fitzgeraldnick.com/2017/02/22/cpp-demangle.html . – talonmies Mar 01 '18 at 10:28
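
To illustrate the term "symbol mangling" raised in the comments above: a C++ compiler encodes parameter types into each function's linker symbol, which is what lets overloads share a name, whereas extern "C" asks for the plain name that a C compiler (and therefore cgo) looks for. A minimal sketch in plain C++; the names `describe` and `describe_c` are hypothetical and not from the question:

```cpp
#include <string>

// C++ linkage: both overloads can coexist because the compiler encodes the
// parameter types into each linker symbol (the exact encoding depends on
// the compiler's ABI).
std::string describe(int) { return "int overload"; }
std::string describe(double) { return "double overload"; }

// C linkage: the symbol is based on the bare name `describe_c`, so a C
// caller can link against it, but it cannot be overloaded.
extern "C" int describe_c(int x) { return x + 1; }
```

A C caller that declares `describe` would not find either overload at link time, because neither symbol carries the plain name; it could only link against `describe_c`.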

1 Answer

Good catch on the name mangling.

Here's a solution we used for gorgonia:

#include <math.h>

#ifdef __cplusplus
extern "C" {
#endif


__global__ void sigmoid32(float* A, int size)
{
    int blockId = blockIdx.x + blockIdx.y * gridDim.x + gridDim.x * gridDim.y * blockIdx.z;
    int idx = blockId * (blockDim.x * blockDim.y * blockDim.z) + (threadIdx.z * (blockDim.x * blockDim.y)) + (threadIdx.y * blockDim.x) + threadIdx.x;
    if (idx >= size) {
        return;
    }
    A[idx] = 1 / (1 + powf((float)(M_E), (-1 * A[idx])));
}

#ifdef __cplusplus
}
#endif

So... just wrap your kernel wrapper function in extern "C"
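
Applied to the question's files, the same guard would go around the wrapper's declaration, so that one header can be shared by nvcc (C++) and cgo (C). The sketch below is plain C++ so that it builds without nvcc: `kValid_host` is a hypothetical CPU stand-in for the `kValid<<<blocks, threads>>>` launch, and its body is invented purely for illustration.

```cpp
typedef unsigned long int ktype;
typedef unsigned char glob;

// Contents that cGo.h could carry: the guard is skipped by a C compiler
// (cgo) and active under a C++ compiler (nvcc), so both sides agree on one
// unmangled symbol name.
#ifdef __cplusplus
extern "C" {
#endif
void kernel_kValid(int blocks, int threads, ktype *kInfo, glob *values);
#ifdef __cplusplus
}
#endif

// Hypothetical stand-in for the __global__ kernel; the real one would be
// launched as kValid<<<blocks, threads>>>(kInfo, values).
static void kValid_host(int idx, ktype *kInfo, glob *values) {
    values[idx] = (kInfo[idx] % 2 == 0) ? 1 : 0;  // placeholder logic
}

// The definition inherits C linkage from the guarded declaration above.
void kernel_kValid(int blocks, int threads, ktype *kInfo, glob *values) {
    for (int i = 0; i < blocks * threads; ++i) {
        kValid_host(i, kInfo, values);
    }
}
```

Keeping the guard in the shared header, rather than only on the definition, means the C and C++ sides see the same declaration and cannot drift apart.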

talonmies
Chewxy
  • Note it isn't a kernel in the original question, it is a host wrapper for a kernel function. But the solution is the same. – talonmies Mar 01 '18 at 13:45
  • I am failing to understand something. As @talonmies points out, nvcc "compiles with C++ linkage by default", while cgo compiles with gcc by default; I can see this when calling `go build -x`. gcc is called twice: the first time with the -I flags, the second time with the -L flags. On the second gcc call I get: ...\_obj\cuda.cgo2.o: In function `_cgo_1d856146b359_Cfunc_kernel_kValid': ...\_obj/cgo-gcc-prolog:306: undefined reference to `kernel_kValid'. In both the header and the .cu file I include the extern "C" ifdef. – SilverEnsign99 Mar 01 '18 at 18:21