Questions tagged [pycuda]

PyCUDA is a Python module that provides a comprehensive Pythonic interface to the NVIDIA CUDA GPU computing environment.

PyCUDA provides a Python module for accessing the NVIDIA CUDA driver API from within Python code.

The module includes interoperability with NumPy and comprehensive metaprogramming facilities for dynamically generating and JIT-compiling CUDA code using Python.
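As a rough illustration of the "dynamically generating CUDA code" idea mentioned above, the sketch below builds CUDA C source from a Python string template using only the standard library. The template, the `render_kernel` helper, and the kernel name `multiply_float` are hypothetical names made up for this example; they are not part of PyCUDA. The final compile step appears only as a comment because it assumes PyCUDA is installed and a CUDA device is available.

```python
from string import Template

# Hypothetical kernel template for this sketch: $name and $real are
# placeholders that are filled in from Python before compilation.
KERNEL_TEMPLATE = Template("""
__global__ void $name($real *dest, const $real *a, const $real *b, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        dest[i] = a[i] * b[i];
    }
}
""")


def render_kernel(name: str, real: str) -> str:
    """Return CUDA C source with the kernel name and element type filled in."""
    return KERNEL_TEMPLATE.substitute(name=name, real=real)


source = render_kernel("multiply_float", "float")
print(source)

# Assumption: with PyCUDA installed and a CUDA device present, the generated
# string could then be compiled at run time, for example:
#   import pycuda.autoinit
#   from pycuda.compiler import SourceModule
#   kernel = SourceModule(source).get_function("multiply_float")
```

Rendering the same template with `real="double"` would give a double-precision variant of the kernel without duplicating the CUDA source by hand.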

417 questions
7
votes
1 answer

Could pycuda and tensorflow work together?

Once TensorFlow is active, it makes every CUDA code crash, even if I use sess.close()... The error message is: pycuda._driver.LogicError: cuFuncSetBlockShape failed: invalid resource handle. The following code is a simple example of CUDA code run by…
Chi-Fang Hsieh
  • 235
  • 1
  • 3
  • 13
7
votes
1 answer

Passing a C++/CUDA class to PyCUDA's SourceModule

I have a class written in C++ that also uses some definitions from cuda_runtime.h. It is part of an open-source project named ADOL-C; you can have a look here! This works when I'm using CUDA-C, but I want somehow to import this class in PyCUDA,…
Banana
  • 1,276
  • 2
  • 16
  • 19
6
votes
2 answers

How do I diagnose a CUDA launch failure due to being out of resources?

I'm getting an out-of-resources error when trying to launch a CUDA kernel (through PyCUDA), and I'm wondering if it's possible to get the system to tell me which resource it is that I'm short on. Obviously the system knows what resource has been…
Eli Stevens
  • 1,447
  • 1
  • 12
  • 21
6
votes
2 answers

Using Pycuda with PySpark - nvcc not found

My environment: I'm using Hortonworks HDP 2.4 with Spark 1.6.1 on a small AWS EC2 cluster of 4 g2.2xlarge instances with Ubuntu 14.04. Each instance has CUDA 7.5, Anaconda Python 3.5, and PyCUDA 2016.1.1. In /etc/bash.bashrc I've…
zenlc2000
  • 451
  • 4
  • 9
6
votes
1 answer

pyopencl - pycuda performance difference

Comparing multiple matrix multiplication calculations with pyopencl and pycuda shows differences in performance. System: Ubuntu 14.04 with GeForce 920m. Pyopencl code: #-*- coding: utf-8 -*- import pyopencl as cl import pyopencl.array from jinja2…
Jesse
  • 370
  • 2
  • 12
6
votes
2 answers

Pycuda Blocks and Grids to work with big datas

I need help determining the size of my blocks and grids. I'm building a Python app to perform metric calculations based on scipy, such as Euclidean distance, Manhattan, Pearson, and Cosine, among others. The project is PycudaDistances. It seems to work very well…
Vinnicyus Gracindo
  • 162
  • 1
  • 2
  • 6
6
votes
1 answer

Disappointing results in pyCUDA benchmark for distance computing between N points

The following script was set up for benchmark purposes. It computes the distance between N points using a Euclidean L2 norm. Three different routines are implemented: High-level solution using the scipy.spatial.distance.pdist function. Fairly…
Rakulan S.
  • 303
  • 1
  • 2
  • 9
6
votes
1 answer

pycuda seems nondeterministic

I've got a strange problem with CUDA. In the snippet below, #include #define OUTPUT_SIZE 26 typedef $PRECISION REAL; extern "C" { __global__ void test_coeff ( REAL* results ) { int id = blockDim.x *…
user1726633
  • 356
  • 2
  • 10
5
votes
1 answer

`Out of resources` error while doing loop unrolling

When I increase the unrolling from 8 to 9 loops in my kernel, it breaks with an out of resources error. I read in How do I diagnose a CUDA launch failure due to being out of resources? that a mismatch of parameters and an overuse of registers could…
Framester
  • 33,341
  • 51
  • 130
  • 192
5
votes
2 answers

Pycuda messing up numpy matrix transpose

Why does the transposed matrix look different when converted to a pycuda.gpuarray? Can you reproduce this? What could cause this? Am I using the wrong approach? Example code from pycuda import gpuarray import pycuda.autoinit import numpy data =…
Framester
  • 33,341
  • 51
  • 130
  • 192
5
votes
0 answers

Why do I get an illegal memory access when I'm calling a kernel in pycuda?

I'm trying to implement a neuron model with Hodgkin and Huxley formalism on my RTX 2080 Ti with PyCUDA. The code is quite large so I won't put all of it here. The first part of my class is to set the number of neurons, create all variables in the GPU…
ymmx
  • 4,769
  • 5
  • 32
  • 64
5
votes
1 answer

PyCUDA Passing variable by value to kernel

Should be simple enough; I literally want to send an int to a SourceModule kernel declaration, where the C function __global__......(int value,.....) with the value being declared and called... value = 256 ... ... func(value,...) But I'm…
Bolster
  • 7,460
  • 13
  • 61
  • 96
5
votes
1 answer

PyCUDA: Pow within device code tries to use std::pow, fails

Question more or less says it all: calling a host function("std::pow ") from a __device__/__global__ function("_calc_psd") is not allowed. From my understanding, this should be using the CUDA pow function instead, but it isn't.
Bolster
  • 7,460
  • 13
  • 61
  • 96
5
votes
1 answer

PyCUDA: C/C++ includes?

Something that isn't really mentioned anywhere (at least that I can see) is what library functions are exposed to inline CUDA kernels. Specifically I'm doing small / stupid matrix multiplications that don't deserve to be individually offloaded to…
Bolster
  • 7,460
  • 13
  • 61
  • 96
5
votes
0 answers

Anaconda install pycuda

I am trying to install pycuda on a computer running 64-bit Windows 10. I installed the GPU Toolkit 9.1 and Anaconda 4.2 with 64-bit Python 3.5. I installed pycuda using the precompiled package: pycuda‑2017.1.1+cuda9185‑cp35‑cp35m‑win_amd64.whl The…
Mauricio Ruiz
  • 322
  • 2
  • 10