Questions tagged [cuda-streams]

CUDA streams are hardware-supported queues on CUDA GPUs through which work (kernel launches, memory transfers, etc.) is scheduled.

78 questions
1
vote
2 answers

CUDA 4.0 RC - many host threads per one GPU - cudaStreamQuery and cudaStreamSynchronize behaviour

I wrote code which uses many host (OpenMP) threads per one GPU. Each thread has its own CUDA stream to order its requests. It looks very similar to the code below: #pragma omp parallel for num_threads(STREAM_NUMBER) for (int sid = 0; sid <…
kokosing
  • 5,251
  • 5
  • 37
  • 50
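The pattern described in the question above can be sketched as follows. This is a minimal, hypothetical reconstruction: the names `STREAM_NUMBER`, `work_kernel`, and `N` are illustrative, not from the original post.

```cuda
#include <cuda_runtime.h>
#include <omp.h>

#define STREAM_NUMBER 4
#define N (1 << 20)

__global__ void work_kernel(float *d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] *= 2.0f;
}

int main() {
    cudaStream_t streams[STREAM_NUMBER];
    float *dev[STREAM_NUMBER];
    for (int s = 0; s < STREAM_NUMBER; ++s) {
        cudaStreamCreate(&streams[s]);
        cudaMalloc(&dev[s], N * sizeof(float));
    }
    // Each OpenMP host thread drives the GPU through its own stream.
    #pragma omp parallel for num_threads(STREAM_NUMBER)
    for (int sid = 0; sid < STREAM_NUMBER; ++sid) {
        work_kernel<<<(N + 255) / 256, 256, 0, streams[sid]>>>(dev[sid], N);
        // cudaStreamQuery returns cudaErrorNotReady while work is pending;
        // cudaStreamSynchronize blocks only the calling host thread.
        cudaStreamSynchronize(streams[sid]);
    }
    for (int s = 0; s < STREAM_NUMBER; ++s) {
        cudaFree(dev[s]);
        cudaStreamDestroy(streams[s]);
    }
    return 0;
}
```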
1
vote
1 answer

How big is a cudaStream_t?

I have inherited some code that basically does stuff like this: void *stream; cudaStreamCreate((cudaStream_t *)&stream); Looking at targets/x86_64-linux/driver_types.h for CUDA 8, I see: typedef __device_builtin__ struct CUStream_st…
Ken Y-N
  • 14,644
  • 21
  • 71
  • 114
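The short answer to the question above: `cudaStream_t` is a typedef for `struct CUstream_st *`, an opaque handle, so it is pointer-sized, and the inherited `void *` cast round-trips safely. A minimal check:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // cudaStream_t is `struct CUstream_st *` -- an opaque pointer type,
    // so it is pointer-sized (8 bytes on x86_64).
    printf("sizeof(cudaStream_t) = %zu, sizeof(void*) = %zu\n",
           sizeof(cudaStream_t), sizeof(void *));

    void *stream = nullptr;
    cudaStreamCreate((cudaStream_t *)&stream);   // the pattern from the question
    cudaStreamDestroy((cudaStream_t)stream);
    return 0;
}
```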
1
vote
2 answers

Why do cudaMemcpyAsync and kernel launches block even with an asynchronous stream?

Consider the following program for enqueueing some work on a non-blocking GPU stream: #include using clock_value_t = long long; __device__ void gpu_sleep(clock_value_t sleep_cycles) { clock_value_t start = clock64(); …
einpoklum
  • 118,144
  • 57
  • 340
  • 684
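A key factor behind behavior like that described above: `cudaMemcpyAsync` can only return immediately when the host buffer is pinned (page-locked); with pageable memory the copy may be synchronous with respect to the host regardless of the stream's flags. A minimal sketch of the pinned-memory version:

```cuda
#include <cstring>
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 1 << 24;
    cudaStream_t stream;
    cudaStreamCreateWithFlags(&stream, cudaStreamNonBlocking);

    float *h_pinned, *d_buf;
    // Pinned host memory is what lets cudaMemcpyAsync return immediately;
    // with pageable memory the call can block the host anyway.
    cudaMallocHost(&h_pinned, bytes);
    cudaMalloc(&d_buf, bytes);
    memset(h_pinned, 0, bytes);

    cudaMemcpyAsync(d_buf, h_pinned, bytes, cudaMemcpyHostToDevice, stream);
    // The host is free to do other work here while the copy is in flight.
    cudaStreamSynchronize(stream);

    cudaFree(d_buf);
    cudaFreeHost(h_pinned);
    cudaStreamDestroy(stream);
    return 0;
}
```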
1
vote
1 answer

CUDA streams are blocking despite Async

I'm working on a video stream in real time that I try to process with a GeForce GTX 960M. (Windows 10, VS 2013, CUDA 8.0) Each frame has to be captured, lightly blurred, and whenever I can, I need to do some hard-work calculations on the 10 latest…
Charlie Echo
  • 87
  • 1
  • 5
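One common cause of unexpected serialization in pipelines like the one above is using streams that synchronize with the legacy default stream. A hypothetical sketch (kernel names are illustrative) using `cudaStreamNonBlocking` so the per-frame work and the heavy batch work can overlap:

```cuda
#include <cuda_runtime.h>

__global__ void blur_frame() { /* light per-frame work */ }
__global__ void heavy_analysis() { /* occasional hard-work kernel */ }

int main() {
    // Streams created with cudaStreamNonBlocking do not synchronize with
    // the legacy default stream, so work issued to them is not serialized
    // behind incidental launches on stream 0.
    cudaStream_t frame_stream, batch_stream;
    cudaStreamCreateWithFlags(&frame_stream, cudaStreamNonBlocking);
    cudaStreamCreateWithFlags(&batch_stream, cudaStreamNonBlocking);

    blur_frame<<<64, 256, 0, frame_stream>>>();
    heavy_analysis<<<64, 256, 0, batch_stream>>>();  // may run concurrently

    cudaDeviceSynchronize();
    cudaStreamDestroy(frame_stream);
    cudaStreamDestroy(batch_stream);
    return 0;
}
```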
1
vote
1 answer

CUDA FFT plan reuse across multiple 'overlapped' CUDA Stream launches

I'm trying to improve the performance of my code using asynchronous memory transfer overlapped with GPU computation. Formerly I had code where I created an FFT plan and then made use of it multiple times. In such a situation the time invested…
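One caveat relevant to the plan-reuse question above: a `cufftHandle` owns a single work area, so sharing one plan across concurrently executing streams is unsafe. A common pattern, sketched here with illustrative sizes, is one plan per stream, each bound with `cufftSetStream`:

```cuda
#include <cuda_runtime.h>
#include <cufft.h>

#define NSTREAMS 2
#define NX 1024
#define BATCH 16

int main() {
    cudaStream_t streams[NSTREAMS];
    cufftHandle plans[NSTREAMS];
    cufftComplex *data[NSTREAMS];

    for (int i = 0; i < NSTREAMS; ++i) {
        cudaStreamCreate(&streams[i]);
        cufftPlan1d(&plans[i], NX, CUFFT_C2C, BATCH);
        cufftSetStream(plans[i], streams[i]);  // execs now enqueue on streams[i]
        cudaMalloc(&data[i], sizeof(cufftComplex) * NX * BATCH);
    }
    // The transforms below can overlap with each other (and with copies
    // in other streams) because each plan targets its own stream.
    for (int i = 0; i < NSTREAMS; ++i)
        cufftExecC2C(plans[i], data[i], data[i], CUFFT_FORWARD);

    cudaDeviceSynchronize();
    for (int i = 0; i < NSTREAMS; ++i) {
        cufftDestroy(plans[i]);
        cudaFree(data[i]);
        cudaStreamDestroy(streams[i]);
    }
    return 0;
}
```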
1
vote
1 answer

The behavior of stream 0 (default) and other streams

In CUDA, how is stream 0 related to other streams? Does stream 0 (default stream) execute concurrently with other streams in a context or not? Considering the following example: cudaMemcpy(Dst, Src, sizeof(float)*datasize,…
user2188453
  • 1,105
  • 1
  • 12
  • 26
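The behavior asked about above can be illustrated briefly. By default, the legacy stream 0 synchronizes with all "blocking" streams (those created without special flags), while `cudaStreamNonBlocking` streams opt out; compiling with `nvcc --default-stream per-thread` changes these semantics. A hypothetical sketch:

```cuda
#include <cuda_runtime.h>

__global__ void k() { }

int main() {
    cudaStream_t blocking, non_blocking;
    cudaStreamCreate(&blocking);                   // default (blocking) flags
    cudaStreamCreateWithFlags(&non_blocking, cudaStreamNonBlocking);

    k<<<1, 1, 0, blocking>>>();      // (1)
    k<<<1, 1>>>();                   // (2) legacy stream 0: waits for (1)
    k<<<1, 1, 0, blocking>>>();      // (3) waits for (2) -- stream 0 acts
                                     //     as a barrier for blocking streams
    k<<<1, 1, 0, non_blocking>>>();  // free to overlap with stream 0 work

    cudaDeviceSynchronize();
    cudaStreamDestroy(blocking);
    cudaStreamDestroy(non_blocking);
    return 0;
}
```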
1
vote
1 answer

Global Memory and CUDA streams

I'm working on CUDA and I have a question about global memory and CUDA streams. Let: __device__ float Aux[32]; __global__ void kernel1(...) { [...] Aux[threadIdx.y] = 0; [...] } So, if I run this kernel on different GPU streams, is Aux the…
userCUDA
  • 11
  • 2
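The answer to the question above hinges on the fact that a `__device__` variable lives in global memory once per device (per CUDA context), not once per stream, so kernels launched in different streams all touch the same `Aux` and can race. A hypothetical sketch of the per-stream-slice fix:

```cuda
#include <cuda_runtime.h>

#define NSTREAMS 4

// One row per stream instead of a single shared Aux[32]; this avoids
// cross-stream races on the device global.
__device__ float Aux[NSTREAMS][32];

__global__ void kernel1(int slot) {
    Aux[slot][threadIdx.y] = 0.0f;   // each stream writes only its own row
}

int main() {
    cudaStream_t streams[NSTREAMS];
    for (int i = 0; i < NSTREAMS; ++i) {
        cudaStreamCreate(&streams[i]);
        kernel1<<<1, dim3(1, 32), 0, streams[i]>>>(i);
    }
    cudaDeviceSynchronize();
    for (int i = 0; i < NSTREAMS; ++i)
        cudaStreamDestroy(streams[i]);
    return 0;
}
```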
1
vote
1 answer

Stream scheduling order

The way I see it, Process One & Process Two (below) are equivalent in that they take the same amount of time. Am I wrong? allOfData_A= data_A1 + data_A2 allOfData_B= data_B1 + data_B2 allOFData_C= data_C1 + data_C2 Data_C is the output of the…
Doug
  • 2,783
  • 6
  • 33
  • 37
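Issue order can matter for questions like the one above: on devices with a single copy queue per direction, breadth-first issue (all uploads, then all kernels, then all downloads) often pipelines better than depth-first issue (copy, kernel, copy per stream before moving on). A hypothetical sketch of the breadth-first pattern, with illustrative names:

```cuda
#include <cuda_runtime.h>

__global__ void stage(float *d) { /* per-chunk computation */ }

int main() {
    const size_t bytes = 1 << 20;
    cudaStream_t s[2];
    float *h[2], *d[2];
    for (int i = 0; i < 2; ++i) {
        cudaStreamCreate(&s[i]);
        cudaMallocHost(&h[i], bytes);   // pinned, so copies can be async
        cudaMalloc(&d[i], bytes);
    }
    for (int i = 0; i < 2; ++i)   // breadth-first: all uploads first,
        cudaMemcpyAsync(d[i], h[i], bytes, cudaMemcpyHostToDevice, s[i]);
    for (int i = 0; i < 2; ++i)   // then all kernels,
        stage<<<256, 256, 0, s[i]>>>(d[i]);
    for (int i = 0; i < 2; ++i)   // then all downloads
        cudaMemcpyAsync(h[i], d[i], bytes, cudaMemcpyDeviceToHost, s[i]);
    cudaDeviceSynchronize();
    for (int i = 0; i < 2; ++i) {
        cudaFreeHost(h[i]);
        cudaFree(d[i]);
        cudaStreamDestroy(s[i]);
    }
    return 0;
}
```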
0
votes
0 answers

The cudaMemcpyAsync interaction with pageable host memory

I am beginning to learn CUDA programming. While learning about streams and the async/sync features, I have encountered some problems. As stated in the NVIDIA docs and many sources, cudaMemcpyAsync can be used to overlap data transfer…
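With a pageable source buffer, `cudaMemcpyAsync` stages the transfer through an internal pinned buffer and behaves synchronously with respect to the host. When the allocation already exists and cannot be replaced with `cudaMallocHost`, `cudaHostRegister` can pin it in place. A minimal sketch:

```cuda
#include <cstdlib>
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 1 << 24;
    float *h_buf = (float *)malloc(bytes);   // pageable allocation
    float *d_buf;
    cudaMalloc(&d_buf, bytes);
    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Pin the existing pageable buffer in place; after this, the
    // cudaMemcpyAsync below can be truly asynchronous w.r.t. the host.
    cudaHostRegister(h_buf, bytes, cudaHostRegisterDefault);
    cudaMemcpyAsync(d_buf, h_buf, bytes, cudaMemcpyHostToDevice, stream);
    cudaStreamSynchronize(stream);

    cudaHostUnregister(h_buf);
    cudaFree(d_buf);
    free(h_buf);
    cudaStreamDestroy(stream);
    return 0;
}
```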
0
votes
0 answers

Gstreamer create custom CUDA plugin

I want to implement a custom plugin which processes only GPU frames (memory:CUDAMemory) and also updates the frame (consider creating an overlay on the video). $./gst-launch-1.0 videotestsrc ! cudaupload ! 'video/x-raw(memory:CUDAMemory)' !…
0
votes
1 answer

What does CU_MEMPOOL_ATTR_REUSE_ALLOW_OPPORTUNISTIC actually allow?

One of the attributes of CUDA memory pools is CU_MEMPOOL_ATTR_REUSE_ALLOW_OPPORTUNISTIC, described in the doxygen as follows: Allow reuse of already completed frees when there is no dependency between the free and allocation. If a free (a…
einpoklum
  • 118,144
  • 57
  • 340
  • 684
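For the attribute question above, a sketch using the runtime-API equivalent (`cudaMemPoolReuseAllowOpportunistic`) may help. With the attribute enabled, a stream-ordered allocation may reuse memory freed in another stream once the driver observes the free has actually completed, even without an event dependency linking the two streams:

```cuda
#include <cuda_runtime.h>

int main() {
    int device = 0;
    cudaMemPool_t pool;
    cudaDeviceGetDefaultMemPool(&pool, device);

    // Enable opportunistic reuse of completed frees across streams.
    int allow = 1;
    cudaMemPoolSetAttribute(pool, cudaMemPoolReuseAllowOpportunistic, &allow);

    cudaStream_t s1, s2;
    cudaStreamCreate(&s1);
    cudaStreamCreate(&s2);
    void *p1, *p2;
    cudaMallocAsync(&p1, 1 << 20, s1);
    cudaFreeAsync(p1, s1);
    // May opportunistically reuse p1's memory if the free in s1 has
    // already completed by the time this allocation is serviced.
    cudaMallocAsync(&p2, 1 << 20, s2);
    cudaFreeAsync(p2, s2);

    cudaDeviceSynchronize();
    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
    return 0;
}
```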
0
votes
1 answer

Is it possible to execute more than one CUDA graph's host execution node in different streams concurrently?

Investigating possible solutions for this problem, I thought about using CUDA graphs' host execution nodes (cudaGraphAddHostNode). I was hoping to have the option to block and unblock streams on the host side instead of the device side with the wait…
surabax
  • 15
  • 5
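For context on the graph host nodes mentioned above, here is a minimal, hedged sketch of `cudaGraphAddHostNode`. Note that host nodes run on a driver thread and, per the programming guide, must not call CUDA APIs themselves; the 5-argument `cudaGraphInstantiate` shown here is the pre-CUDA-12 signature.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Host function executed in stream order when the graph runs.
static void CUDART_CB host_fn(void *userData) {
    printf("host node ran: %s\n", (const char *)userData);
}

int main() {
    cudaGraph_t graph;
    cudaGraphCreate(&graph, 0);

    cudaHostNodeParams params = {};
    params.fn = host_fn;
    params.userData = (void *)"hello";

    cudaGraphNode_t node;
    cudaGraphAddHostNode(&node, graph, nullptr, 0, &params);

    cudaGraphExec_t exec;
    cudaGraphInstantiate(&exec, graph, nullptr, nullptr, 0);

    cudaStream_t stream;
    cudaStreamCreate(&stream);
    cudaGraphLaunch(exec, stream);    // host_fn executes in stream order
    cudaStreamSynchronize(stream);

    cudaGraphExecDestroy(exec);
    cudaGraphDestroy(graph);
    cudaStreamDestroy(stream);
    return 0;
}
```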
0
votes
1 answer

What is cuEventRecord guaranteed to do if it gets the default-stream's handle?

Suppose I call cuEventRecord(my_event_handle, 0). cuEventRecord() requires the stream and the event to belong to the same context. Now, one can interpret the 0 as "the default stream in the appropriate context" - the requirements are satisfied and…
einpoklum
  • 118,144
  • 57
  • 340
  • 684
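The driver-API call in question can be sketched as follows. Passing 0 as the `CUstream` argument means the default stream of the current context, so the event captures the work preceding it there:

```cuda
#include <cuda.h>

int main() {
    cuInit(0);
    CUdevice dev;
    CUcontext ctx;
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);

    CUevent my_event_handle;
    cuEventCreate(&my_event_handle, CU_EVENT_DEFAULT);
    // Signature is cuEventRecord(CUevent, CUstream); 0 here denotes the
    // default stream of the current (i.e. the event's) context.
    cuEventRecord(my_event_handle, 0);
    cuEventSynchronize(my_event_handle);

    cuEventDestroy(my_event_handle);
    cuCtxDestroy(ctx);
    return 0;
}
```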
0
votes
1 answer

How can I make sure two kernels in two streams are sent to the GPU at the same time to run?

I am a beginner in CUDA. I am using an NVIDIA GeForce GTX 1070, CUDA toolkit 11.3 and Ubuntu 18.04. As shown in the code below, I use two CPU threads to send two kernels in the form of two streams to a GPU. I want exactly these two kernels to be sent…
mehran
  • 191
  • 10
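One point worth noting for the question above: host-side launches are always sequential, so "at the same time" can only mean both kernels are eligible or resident on the GPU concurrently. Issuing both back-to-back from one thread into two non-blocking streams avoids OS thread-scheduling jitter between the launches. A hypothetical sketch:

```cuda
#include <cuda_runtime.h>

// Busy-waits on the device so the two launches have time to overlap.
__global__ void spin(long long cycles) {
    long long start = clock64();
    while (clock64() - start < cycles) { }
}

int main() {
    cudaStream_t s1, s2;
    cudaStreamCreateWithFlags(&s1, cudaStreamNonBlocking);
    cudaStreamCreateWithFlags(&s2, cudaStreamNonBlocking);

    spin<<<1, 32, 0, s1>>>(100000000LL);
    spin<<<1, 32, 0, s2>>>(100000000LL);  // can overlap with the first
                                          // if SM resources allow

    cudaDeviceSynchronize();
    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
    return 0;
}
```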
0
votes
1 answer

Reusing cudaEvent to serialize multiple streams

Suppose I have a struct: typedef enum {ON_CPU,ON_GPU,ON_BOTH} memLocation; typedef struct foo *foo; struct foo { cudaEvent_t event; float *deviceArray; float *hostArray; memLocation arrayLocation; }; a function: void…
Jacob Faib
  • 1,062
  • 7
  • 22
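The serialization idiom underlying the question above is record-then-wait: record an event in the producer stream, then make the consumer stream wait on it. A `cudaEvent_t` can be re-recorded, and each `cudaStreamWaitEvent` call snapshots the most recent record at the time the wait is issued, so record/wait pairs can be reused safely. A minimal sketch with illustrative kernel names:

```cuda
#include <cuda_runtime.h>

__global__ void produce(float *d) { /* fill buffer */ }
__global__ void consume(const float *d) { /* read buffer */ }

int main() {
    float *buf;
    cudaMalloc(&buf, 1024 * sizeof(float));
    cudaStream_t producer, consumer;
    cudaStreamCreate(&producer);
    cudaStreamCreate(&consumer);
    cudaEvent_t ready;
    cudaEventCreateWithFlags(&ready, cudaEventDisableTiming);

    for (int iter = 0; iter < 3; ++iter) {        // event reused each iteration
        produce<<<4, 256, 0, producer>>>(buf);
        cudaEventRecord(ready, producer);
        cudaStreamWaitEvent(consumer, ready, 0);  // consumer waits for producer
        consume<<<4, 256, 0, consumer>>>(buf);
    }
    cudaDeviceSynchronize();
    cudaEventDestroy(ready);
    cudaFree(buf);
    cudaStreamDestroy(producer);
    cudaStreamDestroy(consumer);
    return 0;
}
```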