
Assume I have train/valid/test datasets with the usual batch_size and shuffling.

During train/valid/test, I want to draw a certain number (call it memory_size) of additional samples from the entire dataset for each sample.

For example, I set batch_size to 256, keep the dataset shuffled, and set memory_size to 80. In every forward step, besides each sample from the batch, I also want to draw memory_size samples from the entire original dataset and use them inside forward. Call these extra samples the Memory (yes, I want to adopt the idea from Memory Networks). Memories may overlap between samples in the train set.
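
Here is a minimal sketch of what I mean, assuming the dataset returns (x, y) tuples and that one shared memory batch per forward step is enough; sample_memory and compute_loss are placeholder names of mine, not existing APIs:

```python
import torch
import pytorch_lightning as pl
from torch.utils.data.dataloader import default_collate


def sample_memory(dataset, memory_size):
    """Draw memory_size random items from the full dataset and collate them
    into a single batch. Indices are drawn with replacement, so memories
    can overlap across steps."""
    indices = torch.randint(len(dataset), (memory_size,)).tolist()
    return default_collate([dataset[i] for i in indices])


class MemoryModel(pl.LightningModule):
    def __init__(self, train_set, memory_size=80):
        super().__init__()
        self.train_set = train_set      # full training dataset kept around
        self.memory_size = memory_size

    def training_step(self, batch, batch_idx):
        # One shared memory batch for this forward step.
        x_mem, y_mem = sample_memory(self.train_set, self.memory_size)
        x_mem = x_mem.to(self.device)
        loss = self.compute_loss(batch, x_mem)  # hypothetical loss using Memory
        return loss
```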

I'm using PyTorch and PyTorch-Lightning. Can I create a separate memory dataloader alongside each of train_dataloader, val_dataloader, and test_dataloader and load it together with the original dataloader? Or is there a better way to achieve what I want?
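
For the dataloader idea, this is roughly what I had in mind, assuming a Lightning version that supports returning multiple training dataloaders as a dict (the "main"/"memory" keys and the memory_size attribute are my own naming). RandomSampler with replacement=True lets memory batches overlap, and num_samples is chosen so both loaders yield the same number of batches:

```python
from torch.utils.data import DataLoader, RandomSampler


def train_dataloader(self):
    main = DataLoader(self.train_set, batch_size=256, shuffle=True)
    # Independent sampler over the same dataset; num_samples matches the
    # number of main batches so the two loaders stay in step.
    mem_sampler = RandomSampler(self.train_set, replacement=True,
                                num_samples=len(main) * self.memory_size)
    memory = DataLoader(self.train_set, batch_size=self.memory_size,
                        sampler=mem_sampler)
    return {"main": main, "memory": memory}
```

Then training_step would receive batch["main"] and batch["memory"] together. I'm not sure the same dict approach works for val_dataloader and test_dataloader, since multiple eval dataloaders seem to be iterated sequentially rather than zipped.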

