Questions tagged [nvrtc]

NVIDIA's run-time compilation library for CUDA source code, which produces PTX intermediate-language code

The CUDA platform supports run-time compilation (similar to that of OpenCL): your application can load program source code from a file (or generate it dynamically) and compile it into the PTX intermediate format. The resulting PTX can then be linked into GPU-executable binary code using the CUDA driver API.

A more in-depth description and complete examples can be found in the NVIDIA documentation for NVRTC.

20 questions
0 votes, 1 answer

Serializing a CUfunction object

Is it possible to serialize a CUfunction object generated by NVRTC and save it to non-volatile storage (disk, SSD, etc.) so that it can be used again later without having to go through the JIT compilation process?
Farzad
0 votes, 0 answers

Is there a list of headers that can be used in a string to compile with NVRTC?

(Using the NVRTC run-time compiler) There is a string containing a CUDA function: R"( extern "C" __global__ void test1(float * a, float * b, float *c) { int id= blockIdx.x * blockDim.x + threadIdx.x; c[id]=a[id]+b[id]; …
huseyin tugrul buyukisik
0 votes, 1 answer

Is NVRTC unavailable for Win32?

I'm running Python27 x32 and getting this error: Could not load "nvrtc64_75.dll": %1 is not a valid Win32 application. I've also tried with cuda8. As I realized, NVRTC docs list x64 as a requirement: NVRTC requires the following system…
n611x007
0 votes, 1 answer

cuModuleGetFunction returns not found

I want to compile CUDA kernels with the NVRTC JIT compiler to improve the performance of my application (so I have an increased number of instruction fetches but I am saving multiple array accesses). The function looks e.g. like this and is…
Jens
-1 votes, 0 answers

operators for BF16 floating-point values & who defines __CUDA_NO_BFLOAT16_OPERATORS__?

I'm experiencing inconsistent behavior w.r.t. the availability of bfloat16 operators when compiling kernel code with NVRTC on different machines, but with the same CUDA version, 11.2 (when including cuda_bf16.h). On one machine, this operator from…
einpoklum