Questions tagged [pycuda]

PyCUDA is a Python module that provides a comprehensive, Pythonic interface to the NVIDIA CUDA GPU computing environment, exposing the CUDA driver API to Python code.

The module interoperates with NumPy and includes comprehensive metaprogramming facilities for dynamically generating and JIT-compiling CUDA code from Python.
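As a rough illustration of what generating CUDA code from Python and handing NumPy data to it can look like, here is a minimal sketch. The kernel name `scale`, the template, and the helpers `make_kernel_source` and `scale_on_gpu` are hypothetical names chosen for this example; only `SourceModule`, `get_function`, and `pycuda.driver.InOut` are PyCUDA calls. The GPU part assumes a working CUDA toolkit and device and is kept inside a function so the source generation can be tried without one.

```python
from string import Template

# Hypothetical kernel template: the scale factor is substituted into the
# CUDA source at generation time instead of being passed as an argument.
KERNEL_TEMPLATE = Template("""
__global__ void scale(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= ${factor}f;
}
""")


def make_kernel_source(factor):
    """Return CUDA source for a kernel that multiplies each element by factor."""
    return KERNEL_TEMPLATE.substitute(factor=repr(float(factor)))


def scale_on_gpu(host_array, factor):
    """Compile the generated kernel with PyCUDA and apply it to a NumPy array.

    Assumes a CUDA-capable GPU, the CUDA toolkit (nvcc), NumPy and PyCUDA
    are installed, and that the array is non-empty.
    """
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates a context on import)
    import pycuda.driver as cuda
    from pycuda.compiler import SourceModule

    # Work on a flat, contiguous float32 copy so the kernel sees a plain float*.
    data = np.ascontiguousarray(host_array, dtype=np.float32).ravel().copy()

    mod = SourceModule(make_kernel_source(factor))  # JIT-compiled via nvcc
    scale = mod.get_function("scale")

    threads = 256
    blocks = (data.size + threads - 1) // threads  # round up to cover all elements
    # InOut copies the array to the device, runs the kernel, and copies it back.
    scale(cuda.InOut(data), np.int32(data.size),
          block=(threads, 1, 1), grid=(blocks, 1))
    return data.reshape(np.shape(host_array))
```

Calling `make_kernel_source(2)` only builds the source string; `scale_on_gpu(a, 2)` is the step that needs a GPU.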

417 questions
3
votes
1 answer

How do I pass a 2-dimensional array into a kernel in pycuda?

I found an answer here, but it is not clear whether I should reshape the array. Do I need to reshape the 2D array into 1D before passing it to a PyCUDA kernel?
Pippi
  • 2,451
  • 8
  • 39
  • 59
3
votes
1 answer

Cuda/PyCuda - Large matrix traversal and block/grid size

I am working on something that has highlighted the fact that I don't have a firm grasp of how blocks and grids work in CUDA. I have a 1000x10 matrix that I would like to traverse, filling in each element with a value. The kernel is like this: __global__…
user1489497
  • 127
  • 9
3
votes
2 answers

Efficient method to check for matrix stability in CUDA

A number of algorithms iterate until a certain convergence criterion is reached (e.g. stability of a particular matrix). In many cases, one CUDA kernel must be launched per iteration. My question is: how then does one efficiently and accurately…
user2398029
  • 6,699
  • 8
  • 48
  • 80
3
votes
1 answer

How do I feed a 2-dimensional array into a kernel with pycuda?

I have created a numpy array of float32s with shape (64, 128), and I want to send it to the GPU. How do I do that? What arguments should my kernel function accept? float** myArray? I have tried directly sending the array as it is to the GPU, but…
dangerChihuahua007
  • 20,299
  • 35
  • 117
  • 206
3
votes
1 answer

How should I interpret this CUDA error?

I am teaching myself CUDA with PyCUDA. In this exercise, I want to send a simple array of 1024 floats to the GPU and store it in shared memory. As I specify below in my arguments, I run this kernel on just a single block with 1024…
dangerChihuahua007
  • 20,299
  • 35
  • 117
  • 206
3
votes
1 answer

Operator overloading in CUDA

I successfully created an operator+ between two float4 values by doing: __device__ float4 operator+(float4 a, float4 b) { // ... } However, if I also want an operator+ for uchar4 and do the same thing with uchar4, I get the following…
nbonneel
  • 3,286
  • 4
  • 29
  • 39
2
votes
1 answer

How to tell PyCUDA to reuse the memory from an earlier kernel?

My program has two kernels, and the second kernel should use the already uploaded input data and the results from the first kernel, so that I can save the memory transfers. How would I achieve this? This is how I launch my kernels: result =…
Framester
  • 33,341
  • 51
  • 130
  • 192
2
votes
1 answer

Testing combinations of multiple arrays with CUDA

I have the code below written in PHP and have been reading up on CUDA to utilize the GPU processing power of my old GeForce 8800 Ultra. How do I convert this nested combinations test to CUDA parallel processing code (if that is even possible)? The…
teknikol
  • 85
  • 5
2
votes
1 answer

PyCUDA, CUDA: some questions and simple code that gives me the error "identifier "N" is undefined"

I am trying to learn PyCUDA and I have a few questions that I am trying to understand. I think my main question is how to communicate between PyCUDA and a function inside a CUDA file. So, if I have a C++ file (CUDA file) and in there I have some…
George
  • 5,808
  • 15
  • 83
  • 160
2
votes
3 answers

Cuda demoting double to float error despite no doubles in code

I'm writing a kernel using PyCUDA. My GPU device only supports compute capability 1.1 (arch sm_11) and so I can only use floats in my code. I've taken great effort to ensure I'm doing everything with floats, but despite that, there is a particular…
ely
  • 74,674
  • 34
  • 147
  • 228
2
votes
0 answers

Exchange GPU data from Python (pycuda gpuarray) to OpenCV (cv::cuda::GpuMat) and vice versa (duplicate)

Since nobody answered this question, I'm trying again. Is it possible to exchange data between PyCUDA and the OpenCV CUDA module? PyCUDA has its own class, PyCUDA GPUArray, and OpenCV has its own Gpu_Mat. The plan is to perform some kind of action on the…
DomagojM
  • 101
  • 1
  • 3
2
votes
0 answers

Getting "nvcc preprocessing ... failed error" - Pycuda

I am trying to use PyCuda right now. I followed the tutorial on the official page and this code is working perfectly in my environment. import pycuda.driver as cuda import pycuda.autoinit from pycuda.compiler import SourceModule import numpy a =…
Tuna Yüce
  • 141
  • 4
2
votes
1 answer

How to create a program from a binary in CUDA?

I had code in OpenCL where I used clCreateProgramWithBinary() to create the program from a binary. I am porting this application to CUDA and I can't find any similar function. Can someone help me with how I can create the program from a binary or…
2
votes
1 answer

Release memory for Pycuda

How do I release memory after a PyCUDA function call? For example, in the code below, how do I release the memory used by a_gpu so that there is enough memory to assign to b_gpu, instead of getting the error below? I tried importing from pycuda.tools…
Henry
  • 57
  • 6
2
votes
2 answers

How to profile PyCuda code with the Visual Profiler?

When I create a new session and tell the Visual Profiler to launch my Python/PyCUDA scripts, I get the following error message: Execution run #1 of program '' failed, exit code: 255 These are my preferences: Launch: python…
Framester
  • 33,341
  • 51
  • 130
  • 192