Questions tagged [nvcc]

"nvcc" is NVIDIA's LLVM-based C/C++ compiler for targeting GPUs with CUDA.

This tag refers to NVIDIA’s compiler toolchain nvcc for the parallel computing architecture (CUDA). Documentation for nvcc is included with the CUDA Toolkit.

You should ask questions about CUDA here on Stack Overflow, but if you have bugs to report you should discuss them on the CUDA forums or report them via the registered developer portal. You may want to cross-link to any discussion here on Stack Overflow.

688 questions
3
votes
1 answer

How to add an alternative compiler to Xcode 5

I am attempting to parallelize simulation code I am using for my thesis with CUDA/Thrust. CUDA/Thrust require use of the nvcc compiler. The C++ code the simulation is written in is kept in an Xcode project, and my research group uses Xcode feature…
LucidLunatic
  • 153
  • 6
3
votes
0 answers

Nested templated class: unable to match function definition to an existing declaration

Using MSVC++ 2010 (NVCC (CUDA) compiler), defining a nested templated class member of a non-templated parent class outside its declaration block: class cls { public: template class nest { public: template
mchen
  • 9,808
  • 17
  • 72
  • 125
3
votes
1 answer

CUDA 5.0: CUBIN and CUBLAS_device, compute capability 3.5

I'm trying to compile a kernel that uses dynamic parallelism to run CUBLAS to a cubin file. When I try to compile the code using the command nvcc -cubin -m64 -lcudadevrt -lcublas_device -gencode arch=compute_35,code=sm_35 -o test.cubin -c test.cu I…
Soren
  • 105
  • 1
  • 6
3
votes
2 answers

Template parameter as function specifier and compiler optimization

I have found this very useful post and I'd like to clarify something about the compiler optimizations. Let's say we have this function (the same as in the original post): template __global__ void kernel() { switch(action) { case…
stuhlo
  • 1,479
  • 9
  • 17
3
votes
1 answer

How are registers assigned in CUDA compilation

It is said that the number of registers per kernel is important for CUDA optimization, and the upper bound of this number can be set by "-maxrregcount=N" in nvcc. I could not understand this, because I thought that the number of registers can…
Hailiang Zhang
  • 17,604
  • 23
  • 71
  • 117
3
votes
2 answers

How do we use cuPrintf()?

What do we have to do to use cuPrintf()? (device compute capability 1.2, Ubuntu 12) I couldn't find "cuPrintf.cu" and "cudaPrintf.cuh", so I downloaded their code and included them: #include "cuPrintf.cuh" #include "cuPrintf.cu" By the way this is…
Max
  • 441
  • 2
  • 7
  • 14
3
votes
1 answer

Difference on creating a CUDA context

I've a program that uses three kernels. In order to get the speedups, I was doing a dummy memory copy to create a context as follows: __global__ void warmStart(int* f) { *f = 0; } which is launched before the kernels I want to time as…
pQB
  • 3,077
  • 3
  • 23
  • 49
3
votes
1 answer

CUDA nvcc slow host code

I have a problem using the nvcc compiler. I found out that host code compiled using nvcc 4.2 runs about 5 times slower than the same code compiled using g++ 4.4.6. I am using the NVIDIA SDK Makefile template to compile the code in release…
ECHO001
  • 31
  • 2
3
votes
1 answer

How can I compile a CUDA program for sm_1X AND sm_2X when I have a surface declaration

I am writing a library that uses a surface (to re-sample and write to a texture) for a performance gain: ... surface my_surf2D; //allows writing to a texture ... The target platform GPU has compute capability 2.0 and I can compile my code…
FizxMike
  • 971
  • 1
  • 10
  • 16
2
votes
1 answer

CUDA mixed C project linking

I have a large project in C and I'm trying to integrate some CUDA kernels into it. I'm compiling my C files with "gcc -c main.c" and my .cu files with "nvcc -c cuda_GMRES.cu" and then I try to link the 2 object files with nvcc: "nvcc -o main.o…
tim_chemeng
  • 119
  • 2
  • 7
2
votes
1 answer

nvcc unknown option -no_pie

After updating CUDA on my Mac (Snow Leopard), NVIDIA's nvcc compiler is acting strangely. When compiling this: nvcc batched_gemm.cu I get the following compile error, and I have no idea how to fix the problem. ld: unknown option: -no_pie collect2: ld…
Martin Kristiansen
  • 9,875
  • 10
  • 51
  • 83
2
votes
1 answer

Does nvcc use cl.exe to compile both .cpp and .cu files in Windows?

I know nvcc only compiles .cu files, and passes .c or .cpp files to designated compilers like gcc, g++, clang, clang++, etc. Here's my problem. Thrust headers in a .h file on Windows, which uses cl.exe, compiled just fine. But the same code on Linux with g++…
S.Y. Kim
  • 73
  • 6
2
votes
0 answers

CUDA NVCC compiles very slowly without "-G" debug flag

This command works fine (with the -G option): nvcc test.cu -o test -Iinclude -I../boost_1_79_0 -std=c++20 --expt-relaxed-constexpr -DNDEBUG -arch=sm_86 -G But if I remove the -G option, the nvcc compiler will take a few minutes to compile. I also…
Karbo Lei
  • 21
  • 1
2
votes
0 answers

Prevent false positives for thread sanitizer in extended lambda implementation

Note: Crossposted. I am trying to use gcc's thread sanitizer (-fsanitize=thread) to check for data races in my application. Unfortunately, the output is flooded with what I think are false positives caused by the implementation of the extended…
Lukas Lang
  • 400
  • 2
  • 11
2
votes
1 answer

How to compile correctly with nvcc in Visual Studio?

Problem description: weird things appeared again when I use Visual Studio for CUDA programming (really hate VS qwq). I remember it compiling and running fine previously, but this time it failed even with an extremely simple program. From the error…
Xlucidator
  • 41
  • 5