Questions tagged [nvcc]

"nvcc" is NVIDIA's LLVM-based C/C++ compiler for targeting GPUs with CUDA.

This tag refers to NVIDIA’s compiler toolchain nvcc for the parallel computing architecture (CUDA). Documentation for nvcc is included with the CUDA Toolkit.

You should ask questions about CUDA here on Stack Overflow, but if you have bugs to report you should discuss them on the CUDA forums or report them via the registered developer portal. You may want to cross-link to any discussion here on Stack Overflow.

688 questions
2
votes
2 answers

Armadillo sizeof(arma::Mat) gives different results between GCC and NVCC

#include #include using namespace std; int main() { arma::Mat a; cout << sizeof(a) << "\n"; return 0; } The above code gives different results when I use NVCC for CUDA. $ g++ -o main test.cu.cpp -O3…
Huy Le
  • 1,439
  • 4
  • 19
2
votes
0 answers

Getting "nvcc preprocessing ... failed error" - Pycuda

I am trying to use PyCuda right now. I followed the tutorial on the official page and this code is working perfectly in my environment. import pycuda.driver as cuda import pycuda.autoinit from pycuda.compiler import SourceModule import numpy a =…
Tuna Yüce
  • 141
  • 4
2
votes
1 answer

How to call “Cuda C” device routine from “Cuda Fortran Kernel/device routine”…?

I am trying to call a device routine in Cuda C from a Kernel in Cuda Fortran, but I am getting a linking error. Can someone please help me resolve the issue…? File1:SortbyKey.f95 MODULE TEST USE CUDAFOR CONTAINS ATTRIBUTES(GLOBAL)…
Nethaji
  • 43
  • 8
2
votes
1 answer

NVCC can't handle nested quotes in MSVC compiler options

I'm using the following configuration: CUDA 11.4, CMake 3.21.3, Visual Studio 16.11.5, Windows 10. I've used CMake to successfully generate a VS2019 solution. Here's the CMakeLists.txt file in its entirety (also found…
Justin
  • 1,881
  • 4
  • 20
  • 40
2
votes
1 answer

What does it mean when a variable "has been demoted" in the PTX?

In the function body of my CUDA kernel, I have a few __shared__ array variables, of a fixed size. When I look at the compiled PTX code (SM 7.5) for one of these arrays, I see a comment saying: // my_kernel(t1 p1, t2 p2)::my_variable has been…
einpoklum
  • 118,144
  • 57
  • 340
  • 684
2
votes
1 answer

How to check the NVCC version with CMake 3.15?

How does one check the version of NVCC, with CMake 3.15 (not 3.17 or later)? I suppose I could write my own module to run it with --version, but is there an easier way to do it?
einpoklum
  • 118,144
  • 57
  • 340
  • 684
2
votes
0 answers

NVCC Compile Error when calling Eigen function in kernel code

I am using Eigen 3.3.9 for my project running on Ubuntu 18.04.4. Everything worked well before I tried to modify my project for CUDA support. I've tried cuda/10.0 and cuda/10.2. My problem occurs when I try to call determinant() for a Eigen Matrix.…
Sorevan
  • 106
  • 1
  • 6
2
votes
1 answer

Link object file to project with gprbuild

Using nvcc I created an object file from my project with the following bash script: nvcc -Xcompiler -std=c99 -dc src/interface.cu src/functions.cu nvcc -dlink interface.o functions.o -o obj/link.o In my obj folder I get a link.o file. I need to…
Louis Etienne
  • 1,302
  • 3
  • 20
  • 37
2
votes
0 answers

I got NVCC error while starting StyleFlow. Does anyone know what NVCC error is?

I was trying to use StyleFlow, based on StyleGAN. I am curruntly stuck with NVCC error, and I have no idea. This is the error message I got: RuntimeError: NVCC returned an error. See below for full command line and output log: …
Kokumi
  • 81
  • 1
  • 1
  • 4
2
votes
1 answer

CUDA half float operations without explicit intrinsics

I am using CUDA 11.2 and I use the __half type to do operations on 16 bit floating point values. I am surprised that the nvcc compiler will not properly invoke fused multiply add instructions when I do: __half a,b,c; ... __half x = a * b +…
Bram
  • 7,440
  • 3
  • 52
  • 94
2
votes
1 answer

How to make CMake use clang for CUDA to support c++17

According to this question, it is possible to use c++17 with cuda by using clang. However, I couldn't find how to setup CMakeLists.txt to accomplish this. I enable c++17 with add_compile_options(-std=c++17) Out of the box with the following …
Rufus
  • 5,111
  • 4
  • 28
  • 45
2
votes
0 answers

Why does NVCC not like default-initialization of std::arrays?

Consider the following program: #include int main() { struct A { bool valid = true; ~A() { valid = false; } }; const std::array a; const A& aa = a.at(0); return…
einpoklum
  • 118,144
  • 57
  • 340
  • 684
2
votes
1 answer

Calling Fortran OpenACC from CUDA file. How to compile with PGI?

I have a CUDA code in which I would like to include external code that consists of Fortran with OpenACC kernels. I have two files with the following content inspired on a discussion on the NVIDIA website. File main.cu is the following: #include…
Chiel
  • 6,006
  • 2
  • 32
  • 57
2
votes
2 answers

Registers and shared memory depending on compiling compute capability?

when I compile with nvcc -arch=sm_13 I get: ptxas info : Used 29 registers, 28+16 bytes smem, 7200 bytes cmem[0], 8 bytes cmem[1] when I use nvcc -arch=sm_20 I get: ptxas info : Used 34 registers, 60 bytes cmem[0], 7200 bytes cmem[2], 4…
tim
  • 9,896
  • 20
  • 81
  • 137
2
votes
2 answers

Can I determine at compile time whether --use_fast_math was set?

I'm writing some CUDA code, and I want it to behave differently based on whether or not --use_fast_math was set or not. And - I want to make that decision at compile time, not at run time. It seems that NVCC does not add or change a preprocessor…
einpoklum
  • 118,144
  • 57
  • 340
  • 684