
I would expect DataLoader to load batches concurrently with the main process, refilling the prefetch buffer as soon as a batch is consumed from it. However, when I track GPU utilization and the order of loading vs. execution, I see different behavior:

  1. the whole buffer is loaded (expected)
  2. the whole buffer is consumed by executing all the batches until the buffer is empty (not expected)
  3. the whole buffer is loaded again, with no execution happening in parallel (not expected)
  4. goto 2.
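To make the observed pattern concrete, here is a minimal stdlib-only simulation (no PyTorch; `n_batches` and `buffer_size` are arbitrary illustrative values) that records the order of load/compute events when the buffer is filled and drained in lock-step instead of being pipelined:

```python
def lockstep(n_batches, buffer_size):
    # Simulates the observed behavior: loading and compute never overlap.
    events, buf = [], []
    remaining = n_batches
    while remaining or buf:
        # steps 1/3: fill the whole buffer before any execution happens
        while remaining and len(buf) < buffer_size:
            buf.append(n_batches - remaining)
            events.append(("load", buf[-1]))
            remaining -= 1
        # step 2: drain the whole buffer before any loading resumes
        while buf:
            events.append(("compute", buf.pop(0)))
    return events

events = lockstep(n_batches=4, buffer_size=2)
print(events)
# → [('load', 0), ('load', 1), ('compute', 0), ('compute', 1),
#    ('load', 2), ('load', 3), ('compute', 2), ('compute', 3)]
```

During each "load" run the GPU would sit idle, which matches the utilization dips described below.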

This obviously results in dips in GPU utilization during step 3.

I set:

num_workers >= 1

pin_memory = True/False (neither value changes the described behavior)
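With num_workers >= 1 I expected producer/consumer overlap: consuming a batch frees a slot that the loader refills immediately. A stdlib sketch of that expectation (queue.Queue standing in for the DataLoader's bounded prefetch buffer; buffer and batch counts are placeholders):

```python
import queue
import threading

BUFFER_SIZE = 4   # stands in for the DataLoader prefetch depth
N_BATCHES = 10

def producer(buf, n_batches):
    # Simulates a loader worker: keeps the bounded buffer topped up.
    for i in range(n_batches):
        buf.put(i)    # blocks only when the buffer is full
    buf.put(None)     # sentinel: no more batches

buf = queue.Queue(maxsize=BUFFER_SIZE)
threading.Thread(target=producer, args=(buf, N_BATCHES), daemon=True).start()

consumed = []
while True:
    batch = buf.get()  # taking a batch frees a slot, so loading overlaps compute
    if batch is None:
        break
    consumed.append(batch)

print(consumed)  # → [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Under this pattern loading proceeds while earlier batches are executed, so the buffer should never drain completely except at the very end of the epoch.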

Has anyone experienced the same? What could be the issue?

    I found the reason for that behavior and found a solution at: https://discuss.pytorch.org/t/pytorch-dataloader-pipelining-not-working/66954 – dlsf Feb 05 '20 at 15:07

0 Answers