The objective is to call a device function defined in another file. When I compile the global kernel, I get the following error: *External calls are not supported (found non-inlined call to _Z6GoldenSectionCUDA)*.

Problematic code (not the full code, just the part where the problem arises):

cat norm.h

# ifndef NORM_H_
# define NORM_H_
# include<stdio.h>

__device__ double invcdf(double prob, double mean, double stddev);

#endif

cat norm.cu

# include <norm.h>

__device__ double invcdf(double prob, double mean, double stddev) {
    return (mean + stddev*normcdfinv(prob));
}

cat test.cu

# include <norm.h>
# include <curand.h>
# include <curand_kernel.h>

__global__ void phase2Kernel(double* out_profit, struct strategyHolder* strategy) {
    curandState seedValue;
    curand_init(threadIdx.x, 0, 0, &seedValue);
    double randomD = invcdf(curand_uniform_double( &seedValue ), 300, 80);
}

nvcc -c norm.cu -o norm.o -I"."
nvcc -c test.cu -o test.o -I"."

Itachi
  • Please post an example to reproduce the problem (see [here](http://sscce.org) for guidance), your code works for me. – Tom Jan 02 '14 at 14:29
  • In the real code (as opposed to what you have shown here), are you explicitly calling a constructor or are you relying on the default constructor of the class? – talonmies Jan 02 '14 at 14:48
  • Could you please specify the nvcc arguments you are using? – geek Jan 02 '14 at 17:16
  • Posted the real code. I am using a Makefile to compile; the command I used is `nvcc -c file.cu -o file.o` – Itachi Jan 03 '14 at 06:36
  • @Bala: The code you just posted compiles without error. Vote to close. If you can't provide code which reproduces the problem, we cannot help you. – talonmies Jan 03 '14 at 06:45
  • What you just edited also compiles for me after I add the correct include for curand_kernel.h. Nothing I have tried can make any of the code you have posted fail to compile.... Are you really saying that if you cut and paste this code into a new file and compile it, that it fails with the error you say it does? – talonmies Jan 03 '14 at 07:02
  • @Tom It compiles fine when I put all the functions in the same file (say test.cu), but it throws the above error when I put them in different files (say kernel.cu, norm.h) and include their headers to build the final executable – Itachi Jan 03 '14 at 07:16
  • @Bala I cannot figure out why putting part of the above code in a separate `.h` file and then including it in the `.cu` file should lead to compilation errors. Could you please edit your question with the exact file partition reproducing the error? – Vitality Jan 03 '14 at 08:23
  • @Jack Added the exact code. I also tried compiling with -arch=sm_20 as suggested [here](http://stackoverflow.com/questions/5994005/cuda-external-calls-not-supported?rq=1), but without success – Itachi Jan 03 '14 at 09:17
  • @Bala Your code still compiles fine for me. From your compilation line, you are not using the `-rdc=true` option to generate relocatable code. Please note that you are in the separate compilation framework. – Vitality Jan 03 '14 at 09:35
  • OK so now we have a *completely* different question than there was before... – talonmies Jan 03 '14 at 11:03

1 Answer

You're trying to do separate compilation, which needs some special command-line options. See the NVCC manual for details, but here's how to get your example to compile. I've targeted sm_20, but you can target sm_20 or later depending on which GPU you have. Separate compilation is not possible on older devices (sm_1x).

  • You don't need to declare the `__device__` function as `extern` in your header file, but if you have any statically defined device variables, they will need to be declared as `extern`.
  • Generate relocatable code for the device by compiling as shown below (`-dc` is the device equivalent of `-c`; see the manual for more information):

    nvcc -arch=sm_20 -dc norm.cu -o norm.o -I.
    nvcc -arch=sm_20 -dc test.cu -o test.o -I.
    
  • Link the device parts of the code by calling `nvlink` before the final host link:

    nvlink -arch=sm_20 norm.o test.o -o final.o
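
As a rough sketch of the steps above, the device link can also be driven through `nvcc -dlink` rather than by calling `nvlink` directly. The helper name `build_device_objects`, the output name `device_link.o`, and the `NVCC`/`ARCH` variables are assumptions made for illustration only. `sm_20` simply mirrors the commands above; recent CUDA toolkits no longer accept it, so substitute the architecture of your own GPU.

```shell
# Assumed defaults; override from the environment, e.g. ARCH=sm_70.
NVCC="${NVCC:-nvcc}"
ARCH="${ARCH:-sm_20}"

# Hypothetical helper: expects norm.cu, norm.h and test.cu in the current directory.
build_device_objects() {
    # -dc compiles to relocatable device code (same as -rdc=true -c).
    $NVCC -arch="$ARCH" -dc norm.cu -o norm.o -I. || return 1
    $NVCC -arch="$ARCH" -dc test.cu -o test.o -I. || return 1
    # -dlink resolves cross-file device calls such as invcdf into one object.
    $NVCC -arch="$ARCH" -dlink norm.o test.o -o device_link.o
}
```

After calling `build_device_objects`, `norm.o`, `test.o` and `device_link.o` all go to the host linker together with the CUDA runtime library (`-lcudart`).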
    
Tom
  • Is it necessary to declare the `__device__` function as `extern`? I'm compiling the above code with the `-rdc=true` option without `extern`. Is there perhaps something I don't see? – Vitality Jan 03 '14 at 10:14
  • @Tom `extern` is not really needed, I guess. It compiles fine with the directions pointed out in the link you provided. – Itachi Jan 03 '14 at 10:39
  • `extern` is definitely required for statically defined symbols to prevent duplication errors during linking, but a forward declaration should be sufficient for a function. – talonmies Jan 03 '14 at 11:07
  • As talonmies said, you don't need `extern` on the function declaration, I was too hasty. I've modified the answer to reflect this for future visitors. – Tom Jan 03 '14 at 12:13