Questions tagged [pycuda]

PyCUDA is a Python module that provides a comprehensive, Pythonic interface to the NVIDIA CUDA GPU computing environment.

It gives Python code access to the NVIDIA CUDA driver API.

The module interoperates with NumPy and includes comprehensive metaprogramming facilities for dynamically generating and JIT-compiling CUDA code from Python.
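As a rough illustration of what "dynamically generating CUDA code" can look like, here is a minimal sketch that builds kernel source from a Python template. It is an assumption-laden example, not taken from the PyCUDA documentation: the kernel body and the names `KERNEL_TEMPLATE`, `render_kernel`, and `scale_by_two` are hypothetical, and the standard-library `string.Template` is used here simply because its `$` placeholders do not clash with C syntax. The sketch only produces a string; in a real program that string would typically be handed to `pycuda.compiler.SourceModule` for compilation, which needs a CUDA-capable GPU and is left out here.

```python
from string import Template

# Hypothetical kernel template: the kernel name and the scale factor are
# filled in from Python before the source is compiled.
KERNEL_TEMPLATE = Template("""
__global__ void ${name}(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= ${factor}f;
    }
}
""")


def render_kernel(name: str, factor: float) -> str:
    """Return CUDA C source with the kernel name and scale factor baked in."""
    return KERNEL_TEMPLATE.substitute(name=name, factor=repr(float(factor)))


# Generate the source for one specialised kernel and show it.
source = render_kernel("scale_by_two", 2.0)
print(source)
```

Baking constants into the source this way lets the CUDA compiler treat them as literals. Whether that pays off depends on the kernel, so it is best seen as one possible design choice.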

417 questions
-1
votes
1 answer

Why an empty cuda kernel takes more time than a opencv operation on CPU?

I have read the argument that implementing things with CUDA on the GPU sometimes takes more time than doing it on the CPU because of: the time to allocate device memory, and the time to transfer data to and from that memory. Alright, so I have written a…
KansaiRobot
-1
votes
1 answer

Why the thread is the same with multiple threads in PyCUDA

I have the following program import pycuda.driver as cuda import pycuda.autoinit from pycuda.compiler import SourceModule mod = SourceModule(""" #include __global__ void myfirst_kernel() { printf("I am in block…
KansaiRobot
-1
votes
2 answers

Any suggestions when it shows " TypeError: not enough arguments for format string " in Python?

I get this error when I try to run a matrix multiplication example with pycuda. kernel_code_template = """ __global__ void MatrixMulKernel(float *a,float *b,float *c){ int tx = threadIdx.x; int ty = threadIdx.y; float Pvalue = 0; for(int i=0;…
Evelyn
-1
votes
1 answer

Why PyCUDA is faster than C CUDA in this example

I am exploring a move from OpenCL to CUDA and ran a few tests to benchmark the speed of CUDA in various implementations. To my surprise, in the examples below, the PyCUDA implementation is about 20% faster than the C CUDA example. I read many…
w.tian
-1
votes
1 answer

Understanding in details the algorithm for inversion of a high number of 3x3 matrixes

I am following up on this original post: PyCuda code to invert a high number of 3x3 matrixes. The code suggested as an answer is: $ cat t14.py import numpy as np import pycuda.driver as cuda from pycuda.compiler import SourceModule import…
user1773603
-1
votes
1 answer

Adapt existing code and Kernel code to perform a high number of 3x3 matrix inversion

Following a previous question (Performing high number of 4x4 matrix inversion - PyCuda), which concerns the inversion of 4x4 matrices, I would like to do the same with 3x3 matrices. As @Robert Crovella said, this change implies a complete…
user1773603
-1
votes
1 answer

Inconsistent results in cuda GPU accelerated code

I was trying to compute Local Binary Patterns for an image on my GPU, using the cuda module in Python. But running a similar algorithm on the CPU and on the GPU produces different results. Can you help me figure out…
JVJ
-1
votes
1 answer

Using Pycuda Multiple Threads

I'm trying to run multiple threads on GPUs using the Pycuda example MultipleThreads. When I run my python file, I get the following error message: (/root/anaconda3/) root@109c7b117fd7:~/pycuda# python multiplethreads.py Exception in thread…
Zhangsheng
-1
votes
1 answer

Pycuda CompileError with Anaconda on Windows

I've just started looking into Cuda and especially PyCuda. I'm currently using Anaconda on Windows 7. I have installed Pycuda using the Anaconda Prompt and tried the following code, which I copied directly from the PyCuda documentation web page.…
Moritz90
-1
votes
2 answers

Rank of each element in a matrix row using CUDA

Is there any way to find the rank of each element in a matrix row separately using CUDA, or are there any functions for this provided by NVIDIA?
-1
votes
1 answer

CUDA histogram2d not working

Due to a seeming lack of a decent 2D histogram for CUDA (that I can find... pointers welcome), I'm trying to implement it myself with pyCUDA. Here's what the histogram should look like (using Numpy): Here's what I've got so far: code =…
scnerd
-1
votes
1 answer

Image processing in pycuda

I have designed a web site using Struts2. Now I have to call a function where image processing will be done. For that I have chosen to use pycuda. Can anyone tell me the steps and dependencies for installing pycuda? (I have to call this code from an…
Aadya
-1
votes
1 answer

PyCUDA demo example error on OSX 10.9.2 + CUDA 5.5 + EDP 2.7.3

I'm getting a pycuda runtime error (very similar to the one at https://stackoverflow.com/questions/20078191/opencv-2-4-7-mac-osx-10-9-python-2-7-6-cuda-5-5) as below. The error when executing the example is cordelia:examples xxx$ python…
jtlz2
-1
votes
1 answer

PyCUDA strange error cuLaunchKernel failed: invalid value

When I try to use the script below to get the data back to the CPU, there is an error. I don't get an error if I just put some values in "ref", such as: ref[1] = 255; ref[0] = 255; ref[2] = 255; but if I do something like…
-2
votes
0 answers

CUDA GPU memory allocation - is there a way of dynamically allocating memory within a loop?

I am refactoring a piece of code for a project; the aim is to reduce the runtime as much as possible, and I am using PyCuda to run a big loop on a GPU. The kernel needs to follow this basic logic: for pixel in pixels: (this is the input array and…
LimeFire