Questions tagged [pycuda]

PyCUDA is a Python module that provides a comprehensive Pythonic interface to the NVIDIA CUDA GPU computing environment.

PyCUDA provides a Python module for accessing the NVIDIA CUDA driver API from within Python code.

The module includes interoperability with NumPy and comprehensive metaprogramming facilities for dynamically generating and JIT-compiling CUDA code from Python.
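
As a rough illustration of the metaprogramming idea described above, the sketch below builds CUDA source text in Python and, only if a GPU and PyCUDA are available, JIT-compiles it with `pycuda.compiler.SourceModule`. The names `KERNEL_TEMPLATE`, `make_scale_source`, and `scale_on_gpu` are hypothetical helpers made up for this example, and plain `string.Template` substitution is an assumed stand-in for whatever templating approach a real project would choose.

```python
from string import Template

# CUDA source with placeholders that are filled in from Python at run time.
KERNEL_TEMPLATE = Template("""
__global__ void scale(${ctype} *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] = data[i] * (${ctype})${factor};
    }
}
""")


def make_scale_source(ctype="float", factor=2):
    """Return CUDA source for a kernel that multiplies each element by `factor`."""
    return KERNEL_TEMPLATE.substitute(ctype=ctype, factor=factor)


def scale_on_gpu(host_array, factor=2):
    """Compile the generated kernel and run it (needs PyCUDA and a CUDA GPU)."""
    # Imported lazily so the source generator above works without a GPU.
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates a context on the default device)
    import pycuda.gpuarray as gpuarray
    from pycuda.compiler import SourceModule

    data = np.ascontiguousarray(host_array, dtype=np.float32)
    module = SourceModule(make_scale_source("float", factor))  # JIT compile
    scale = module.get_function("scale")

    data_gpu = gpuarray.to_gpu(data)
    threads = 256
    blocks = (data.size + threads - 1) // threads
    # Scalar arguments are passed as sized NumPy types such as np.int32.
    scale(data_gpu.gpudata, np.int32(data.size),
          block=(threads, 1, 1), grid=(blocks, 1))
    return data_gpu.get()


if __name__ == "__main__":
    print(make_scale_source("float", 3))
```

Calling `scale_on_gpu([1, 2, 3], factor=3)` on a machine with a working PyCUDA install would compile the kernel at run time; the source generator itself is ordinary Python and can be inspected without a GPU.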

417 questions

2 answers

Generating single random number in pyCuda kernel

I have seen many ways to generate an array of random numbers, but I want to generate a single random number. Is there a function like rand() in C++? I don't want a series of random numbers; I just need to generate a random number inside the kernel.…

asked Jun 29 '21 at 09:21

Saddam

1 answer

_driver.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN5boost6detail12set_tss_dataEPKvPFvPFvPvES3_ES5_S3_b

I tried to run NVIDIA TensorRT's Python samples, but got an error importing pycuda: ImportError: .../pycuda-2020.1-py3.6-linux-x86_64.egg/pycuda/_driver.cpython-36m-x86_64-linux-gnu.so: undefined symbol:…

python linux pycuda

asked May 05 '21 at 05:49

Alex Fu

1 answer

Python multiprocessing with TensorRT

I am trying to use a TensorRT engine for inference in a Python class that inherits from multiprocessing. The engine works in a standalone Python script on my system, but now while integrating it into the codebase, the multiprocessing used in the…

python multiprocessing python-multiprocessing pycuda tensorrt

asked Apr 06 '21 at 14:27

a-doering

1 answer

PyCUDA: GPUArray.get() returns inaccessible array

I am trying to sum up an array in the GPU and then obtain it back on the host. For this, I am using the pycuda.gpuarray.sum() function.
    import pycuda.gpuarray
    a = np.array([1,2,3,4,5])
    b = gpuarray.to_gpu(a)
    c = gpuarray.sum(b)
    c = c.get()
    print(c) …

python cuda pycuda

asked Dec 10 '20 at 12:09

m0bi5

2 answers

CUDA/PyCUDA: Which GPU is running X11?

In a Linux system with multiple GPUs, how can you determine which GPU is running X11 and which is completely free to run CUDA kernels? In a system that has a low-powered GPU to run X11 and a higher-powered GPU to run kernels, this can be determined…

linux cuda x11 pycuda

asked Jun 21 '11 at 16:45

dwelch91

1 answer

Can't install pycuda with pip

I am trying to install the PyCUDA module to run a Python script I downloaded, but installing it with pip doesn't work. I run pip install pycuda on the command line. At first, I get this: Collecting pycuda Using cached…

python pip pycuda

asked Oct 09 '20 at 19:24

Julio974

1 answer

TensorRT multiple Threads

I am trying to use TensorRT through the Python API. I want to use it in multiple threads, where the CUDA context is used by all the threads (everything works fine in a single thread). I am using Docker with the tensorrt:20.06-py3 image, and an onnx…

multithreading cuda pycuda tensorrt nvidia-docker

asked Jul 03 '20 at 16:20

Walid Hanafy

1 answer

Can int variables be transferred from host to device in PyCUDA?

    import pycuda.driver as cuda
    import pycuda.autoinit
    from pycuda.compiler import SourceModule
    import numpy as np
    dims=img_in.shape
    rows=dims[0]
    columns=dims[1]
    channels=dims[2]
    #To be used in CUDA Device …

python cuda pycuda

asked Jan 21 '20 at 10:35

Saswat K. Levin

2 answers

PyCUDA GPUArray slice-based operations

The PyCUDA documentation is a bit light on examples for those of us in the 'Non-Guru' class, but I'm wondering about the operations available for array operations on gpuarrays, i.e. if I wanted to gpuarray this…

python multidimensional-array cuda gpu pycuda

asked Apr 18 '11 at 20:11

Bolster

2 answers

pycuda shared memory up to device hard limit

This is an extension of the discussion here: pycuda shared memory error "pycuda._driver.LogicError: cuLaunchKernel failed: invalid value" Is there a method in pycuda that is equivalent to the following C++ API call? #define SHARED_SIZE 0x18000 // 96…

pycuda

asked Jun 24 '19 at 10:29

dag

1 answer

Using CUDA types in pyCUDA

Let us consider the CUDA code at CUDA's Mersenne Twister for an arbitrary number of threads and suppose that I want to convert it to a pyCUDA application. I know that I can use ctypes and CDLL, namely, cudart =…

python cuda pycuda

asked Apr 02 '19 at 07:06

Vitality

0 answers

What is the source code for Keras function model.fit()?

I'm building a simple neural network in Python using TensorFlow and Keras. I need to implement this code to work on a GPU, using PyCUDA. I plan on parallelizing learning all the epochs, however since Keras is very minimalistic, all epoch training…

python tensorflow keras pycuda

asked Jan 19 '19 at 12:40

jonny

2 answers

pycuda.driver module not found

I have installed Python 3.7.2 along with the following libraries: jupyter, pandas, numpy, pytools and pycuda. I'm working with Visual Studio Code. I'm trying to run the standard PyCUDA example: # --- PyCuda initialization import pycuda.driver as…

python cuda pycuda

asked Jan 17 '19 at 15:29

Vitality

1 answer

How to profile PyCuda code in Linux?

I have a simple (tested) PyCUDA app and am trying to profile it. I've tried NVIDIA's Compute Visual Profiler, which runs the program 11 times, then emits this error: NV_Warning: Ignoring the invalid profiler config option:…

python profiling cuda gpgpu pycuda

asked Mar 15 '11 at 20:32

Jeff Guy

1 answer

PyCuda mem_alloc initialization error

in desaturate_image
    redarray_gpu = cuda.mem_alloc(self.redarray.nbytes)
pycuda._driver.LogicError: cuMemAlloc failed: initialization error
I get the above error on this line:
    redarray_gpu = cuda.mem_alloc(self.redarray.nbytes)
What could be…

python arrays numpy pycharm pycuda

asked Jun 25 '18 at 19:05

Sachin Titus
