Questions tagged [unified-memory]
34 questions
19
votes
2 answers
Spark execution memory monitoring
I want to monitor Spark execution memory, as opposed to the storage memory shown in the Spark UI. To be clear, I mean execution memory, NOT executor memory.
By execution memory I mean:
This region is used for buffering intermediate data when…

astro_asz
5
votes
1 answer
Can we copy a "normal" GPU memory to a "unified" memory?
We have two GPU memory allocations: one is allocated with cuMalloc as normal device memory, the other with cuMallocManaged as unified memory. Is it possible to copy between them? And if we use the driver API, which direction should I use?
float*…

Xiang Zhang
4
votes
1 answer
Behavior and performance of unified memory vs pinned host memory
I am a student currently working on a project that consists of writing a program in CUDA. I believe the subject of this program is irrelevant to the question, but I have to mention that my professor suggested that I use unified…

PatrykB
3
votes
1 answer
CUDA: why does just reading (zero writes) from unified memory cause the next kernel to become slower?
#include
#include
#include
#include
#include
using namespace std;
class MyTimer {
std::chrono::time_point start;
public:
void startCounter() {
start =…

Huy Le
3
votes
1 answer
cuda unified memory leak
I was writing a program that does some basic object detection with CUDA.
I ran into a problem where I allocate unified memory with cudaMallocManaged, do some processing with it, and then free it with cudaFree. Even though cudaFree never returned…

thebear8
3
votes
1 answer
CUDA unified memory and Windows 10
While using cudaMallocManaged() to allocate an array of structs with arrays inside, I'm getting the error "out of memory" even though I have enough free memory. Here's some code that replicates my problem:
#include
#include…

Julian
3
votes
1 answer
Do I need to provide a GPU context when creating unified memory?
Question 1)
When I call the CUDA driver API, I usually first need to push the context (which represents a GPU runtime) onto the current thread. For a normal cuMalloc, the memory will be allocated on the GPU specified by that context. But if I try to call…

Xiang Zhang
2
votes
3 answers
Overcoming the copy overhead in CUDA
I want to parallelize an image operation on the GPU using CUDA, with one thread for each pixel (or group of pixels) of the image. The operation is quite simple: each pixel is multiplied by a value.
However, if I understand it correctly, in order to…

Sean
2
votes
1 answer
Can CUDA unified memory be written to by another CPU thread?
I am writing a program that retrieves images from a camera and processes them with CUDA. In order to gain the best performance, I'm passing a CUDA unified memory buffer to the image acquisition library, which writes to the buffer in another…

Elektito
2
votes
1 answer
CUDA Unified Memory vs cudaMalloc
I am trying to do some benchmarking to ensure that using CUDA's Unified Memory (UM) approach will not hurt our performance.
I am performing an FFT. One way I use UM; the other way I use cudaMalloc.
I compare the results afterwards and they all match up…

AAG
2
votes
1 answer
GPU memory oversubscription with mapped memory, Unified Virtual Addressing and Unified Memory
I'm considering ways to process data on a GPU when the data is too big for the GPU memory, and I have a few questions.
If I understand correctly, with mapped memory the data resides in main memory and is transferred to the GPU only when…

lawful_neutral
1
vote
1 answer
CUDA unified memory: program gets a different result when using a pointer or a non-pointer object as a class member
Recently, I learned how to code using CUDA unified memory.
What is weird is that the kernel reports a different result when I replace the pointer object with the non-pointer one.
Please refer to Core.cuh and main.cu.
The ClassManaged.h is base…

Mangoccc
1
vote
1 answer
Unexpected read access violation error in CUDA when working with unified memory
I have an object, say d_obj, that has some members in unified memory and some members explicitly in device memory. I then call a CUDA kernel that takes the object and works with it. I would like to immediately have the CPU do some stuff with…

If_You_Say_So
1
vote
1 answer
cudaMallocManaged (unified memory) with cuBLAS
I am trying to use Unified Memory, via cudaMallocManaged(), with the cuBLAS library. As a simple example, I am performing a matrix-vector multiplication and storing the result in an array, results. However, when printing the results array, I…

mrwhynot243
1
vote
1 answer
OpenACC: Deep Copy and Unified Memory
I would like to clearly understand a situation I often face when accelerating an application with OpenACC. Let's say I have this loop:
#pragma acc parallel loop collapse(4)
for (k = KBEG; k <= KEND; k++){
for (j = JBEG; j <= JEND; j++){
for (i = IBEG;…

Steve