I have used the shuffle option of the PyTorch DataLoader many times, but I am wondering when the shuffling happens and whether it is performed dynamically during iteration. Take the following code as an example:
from torch.utils.data import DataLoader

namesDataset = NamesDataset()  # custom Dataset, defined elsewhere
namesTrainLoader = DataLoader(namesDataset, batch_size=16, shuffle=True)
for batch_data in namesTrainLoader:
    print(batch_data)
When we define "namesTrainLoader", has the shuffling already been done, so that the following iteration is based on a fixed order of the data? Or is there still randomness in the for loop after namesTrainLoader is defined?
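One way to check this empirically is to iterate over the same loader twice and compare the order of the samples. The sketch below is only an illustration: it assumes a toy TensorDataset (hypothetical, standing in for NamesDataset) whose samples are simply their own indices, and the helper name epoch_order is made up for this example. The PyTorch documentation describes shuffle=True as reshuffling the data at every epoch, so the two passes would be expected to contain the same samples in a different order.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-in for NamesDataset (assumption): 32 samples, each equal to its index.
toy_dataset = TensorDataset(torch.arange(32))
loader = DataLoader(toy_dataset, batch_size=16, shuffle=True)


def epoch_order(data_loader):
    """Collect the sample order seen during one full pass over the loader."""
    order = []
    for (batch,) in data_loader:
        order.extend(batch.tolist())
    return order


torch.manual_seed(0)  # only for reproducibility of this experiment
first_epoch = epoch_order(loader)   # a new iterator is created here ...
second_epoch = epoch_order(loader)  # ... and again here

print("epoch 1:", first_epoch)
print("epoch 2:", second_epoch)
print("same samples:", sorted(first_epoch) == sorted(second_epoch))
print("same order:  ", first_epoch == second_epoch)
```

If the two printed orders differ, the permutation is not fixed when the DataLoader is constructed but drawn again each time the for loop starts a new pass.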
I was trying to replace half of "batch_data" with some special value:
for batch_data in namesTrainLoader:
    batch_data[:8] = special_val  # overwrite the first half of the batch
    pre = model(batch_data)
Suppose we train for an infinite number of epochs. Will "model" eventually see all the data in "namesTrainLoader", or is half of the data effectively lost to "model"?
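A small experiment can give a hint here as well. The following is a minimal sketch, assuming a toy TensorDataset of 64 index-valued samples in place of the real data and a cap of 100 epochs (both arbitrary choices for illustration). It does not run a model; it only records which samples ever land in the second half of a batch, that is, in the positions that are not overwritten by special_val.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)  # only for reproducibility of this experiment

num_samples = 64  # assumption: toy dataset size
toy_dataset = TensorDataset(torch.arange(num_samples))
loader = DataLoader(toy_dataset, batch_size=16, shuffle=True)

seen_unmasked = set()  # samples that reached the "model" without being overwritten
epochs_needed = None

for epoch in range(1, 101):
    for (batch,) in loader:
        # Positions 0..7 would be replaced by special_val,
        # so only positions 8..15 count as seen by the model.
        seen_unmasked.update(batch[8:].tolist())
    if len(seen_unmasked) == num_samples:
        epochs_needed = epoch
        break

print(f"samples seen unmasked: {len(seen_unmasked)} / {num_samples}")
print(f"epochs needed: {epochs_needed}")
```

If the order is reshuffled on every pass, each sample has roughly a one-in-two chance per epoch of falling into the untouched half, so the set of unmasked samples should grow to cover the whole dataset after a handful of epochs rather than staying stuck at half.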