PyTorch DataLoader pipelining

Asked Jan 15 '20 at 11:30

Active Jan 16 '20 at 16:11

I would expect DataLoader to load batches concurrently with the main process, refilling the buffer as soon as a batch is consumed from it. However, when I track GPU utilization and the order of loading vs. execution, I see different behavior:

1. Loading of the whole buffer (expected)
2. Consuming the whole buffer by executing all the batches until the buffer is empty (not expected)
3. Loading the whole buffer again without parallel execution (not expected)
4. Go to step 2.

This results in dips in GPU utilization during step 3, because nothing executes while the buffer is being refilled.
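To make the expected pipelining concrete, here is a minimal sketch of a loader that refills a bounded buffer while the main loop consumes from it. It uses only the Python standard library and is not how DataLoader is implemented internally; the function name `run_pipeline`, the buffer size, and the sleep times are all made up for illustration.

```python
import queue
import threading
import time


def run_pipeline(num_batches=8, buffer_size=2, load_time=0.01, step_time=0.01):
    """Producer/consumer sketch: a loader thread refills a bounded buffer
    while the main thread consumes batches from it."""
    buffer = queue.Queue(maxsize=buffer_size)  # bounded prefetch buffer
    events = []                                # ("load", i) / ("exec", i), in completion order
    lock = threading.Lock()

    def loader():
        for i in range(num_batches):
            time.sleep(load_time)              # stand-in for reading/decoding a batch
            with lock:
                events.append(("load", i))
            buffer.put(i)                      # blocks only while the buffer is full
        buffer.put(None)                       # sentinel: no more batches

    worker = threading.Thread(target=loader, daemon=True)
    worker.start()

    consumed = []
    while True:
        batch = buffer.get()                   # frees a slot, so the loader can refill right away
        if batch is None:
            break
        time.sleep(step_time)                  # stand-in for the forward/backward pass
        with lock:
            events.append(("exec", batch))
        consumed.append(batch)

    worker.join()
    return consumed, events
```

Calling `consumed, events = run_pipeline()` returns the batches in order plus a log of events. With pipelining working as expected, "load" and "exec" entries should interleave in that log; the behavior described above would instead show up as a run of loads followed by a run of execs. The exact interleaving depends on timing, so it is not shown here.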

I set:

num_workers >= 1

pin_memory = True/False (doesn't influence the described behavior)

Has anyone experienced the same? What could be the issue?

edited Jan 16 '20 at 16:11

dlsf

I found the reason for that behavior and found a solution at: https://discuss.pytorch.org/t/pytorch-dataloader-pipelining-not-working/66954 – dlsf Feb 05 '20 at 15:07

0 Answers