Questions tagged [pytorch-dataloader]

445 questions
24
votes
3 answers

AttributeError: '_MultiProcessingDataLoaderIter' object has no attribute 'next'

I am trying to load the dataset using Torch Dataset and DataLoader, but I got the following error: AttributeError: '_MultiProcessingDataLoaderIter' object has no attribute 'next' the code I use is: class WineDataset(Dataset): def…
Adham Enaya
  • 780
  • 1
  • 11
  • 29
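
A minimal sketch of the usual cause of the error in the question above, assuming the code calls `.next()` on the iterator returned by `iter(loader)`: Python 3 iterators are advanced with the built-in `next()`. `ToyDataset` is a hypothetical stand-in, not the asker's `WineDataset`, and `num_workers=0` is used only to keep the sketch simple (the same call works with worker processes).

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyDataset(Dataset):
    """Hypothetical stand-in dataset with 6 samples of 2 features each."""

    def __init__(self):
        self.x = torch.arange(12, dtype=torch.float32).reshape(6, 2)
        self.y = torch.arange(6)

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]


loader = DataLoader(ToyDataset(), batch_size=4, shuffle=False, num_workers=0)

data_iter = iter(loader)
# Use the built-in next() instead of data_iter.next()
features, labels = next(data_iter)
```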
11
votes
1 answer

PyTorch: while loading batched data using Dataloader, how to transfer the data to GPU automatically

If we use a combination of the Dataset and Dataloader classes (as shown below), I have to explicitly load the data onto the GPU using .to() or .cuda(). Is there a way to instruct the dataloader to do it automatically/implicitly? Code to…
anurag
  • 1,715
  • 1
  • 8
  • 28
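
One possible workaround for the question above, sketched under the assumption that each batch is a tuple of tensors: wrap the loader in a small iterable that moves every batch to the target device. `DeviceLoader` is a hypothetical helper name, not part of PyTorch.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


class DeviceLoader:
    """Hypothetical wrapper that yields batches already moved to `device`."""

    def __init__(self, loader, device):
        self.loader = loader
        self.device = device

    def __iter__(self):
        for batch in self.loader:
            # Assumes each batch is a tuple/list of tensors
            yield tuple(t.to(self.device, non_blocking=True) for t in batch)

    def __len__(self):
        return len(self.loader)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
dataset = TensorDataset(torch.randn(8, 3), torch.arange(8))
loader = DeviceLoader(DataLoader(dataset, batch_size=4), device)

xb, yb = next(iter(loader))
```

The training loop can then iterate over `loader` without calling `.to()` or `.cuda()` on each batch.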
9
votes
2 answers

Weighted random sampler - oversample or undersample?

Problem I am training a deep learning model in PyTorch for binary classification, and I have a dataset containing unbalanced class proportions. My minority class makes up about 10% of the given observations. To avoid the model learning to just…
clueless
  • 211
  • 2
  • 3
  • 7
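
A minimal sketch of `WeightedRandomSampler` on synthetic data with a 10% minority class, as described in the question above. The data, the inverse-frequency weights, and the batch size are assumptions for illustration. With `replacement=True` and `num_samples` equal to the dataset size, the minority class is drawn more often than it occurs (oversampled) and the majority class less often (undersampled).

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

torch.manual_seed(0)

# Synthetic imbalanced labels: 900 majority (0), 100 minority (1)
labels = torch.cat([torch.zeros(900, dtype=torch.long),
                    torch.ones(100, dtype=torch.long)])
features = torch.randn(1000, 4)

# Weight each sample by the inverse frequency of its class
class_counts = torch.bincount(labels)
sample_weights = (1.0 / class_counts.float())[labels]

sampler = WeightedRandomSampler(sample_weights,
                                num_samples=len(labels),
                                replacement=True)
loader = DataLoader(TensorDataset(features, labels),
                    batch_size=100, sampler=sampler)

# Fraction of minority samples seen over one pass through the loader
minority_fraction = torch.cat([y for _, y in loader]).float().mean().item()
```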
8
votes
4 answers

pytorch torchvision.datasets.ImageFolder FileNotFoundError: Found no valid file for the classes .ipynb_checkpoints

Tried to load training data with pytorch torchvision.datasets.ImageFolder in Colab. transform = transforms.Compose([transforms.Resize(400), transforms.ToTensor()]) dataset_path = 'ss/' dataset =…
8
votes
2 answers

How does PyTorch DataLoader interact with a PyTorch dataset to transform batches?

I'm creating a custom dataset for NLP-related tasks. In the PyTorch custom dataset tutorial, we see that the __getitem__() method leaves room for a transform before it returns a sample: def __getitem__(self, idx): if torch.is_tensor(idx): …
rocksNwaves
  • 5,331
  • 4
  • 38
  • 77
7
votes
0 answers

num_workers and prefetch_factor in PyTorch DataLoader do not scale

It seems like serialization and deserialization associated with python's multiprocessing limit the benefits of processing data in parallel. In the following example, I create a custom iterable that returns a numpy array. As the size of the numpy…
kyc12
  • 349
  • 2
  • 15
6
votes
1 answer

pytorch dataloader - RuntimeError: stack expects each tensor to be equal size, but got [157] at entry 0 and [154] at entry 1

I am a beginner with pytorch. I am trying to do an aspect based sentiment analysis. I am facing the error mentioned in the subject. My code is as follows: I request help to resolve this error. Thanks in advance. I will share the entire code and the…
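
The error in the title above typically appears when the default collate function tries to stack sequences of different lengths. A common remedy, sketched here under the assumption that each sample is a `(token_ids, label)` pair, is a custom `collate_fn` that pads every batch to its longest sequence. `pad_collate` and the toy data are hypothetical; the lengths 157 and 154 are taken from the error message.

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader

# Toy variable-length samples matching the sizes in the error message
sequences = [torch.ones(157, dtype=torch.long), torch.ones(154, dtype=torch.long)]
targets = [0, 1]
dataset = list(zip(sequences, targets))


def pad_collate(batch):
    """Pad the sequences of one batch to a common length."""
    seqs, labs = zip(*batch)
    lengths = torch.tensor([len(s) for s in seqs])
    padded = pad_sequence(list(seqs), batch_first=True, padding_value=0)
    return padded, torch.tensor(labs), lengths


loader = DataLoader(dataset, batch_size=2, collate_fn=pad_collate)
padded, labs, lengths = next(iter(loader))
```

Returning `lengths` alongside the padded tensor lets the model build an attention or padding mask later.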
6
votes
3 answers

Is there a PyTorch transform to go from 1 Channel data to 3 Channels?

I am using the emnist data set via the PyTorch datasets together with a neural network that expects a 3 Channel input. I would like to use PyTorch transforms to copy my 1D greyscale into 3D so I can use the same net for 1D and 3D data. Which…
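
A minimal sketch of one way to do this with plain tensor operations, assuming the image is already a `1 x H x W` tensor (for example after `ToTensor()`). `to_three_channels` is a hypothetical helper; it could be wrapped in `transforms.Lambda`, and torchvision's `transforms.Grayscale(num_output_channels=3)` is an alternative.

```python
import torch


def to_three_channels(img):
    """Copy a single-channel (1, H, W) tensor into three identical channels."""
    if img.ndim != 3 or img.shape[0] != 1:
        raise ValueError("expected a tensor of shape (1, H, W)")
    return img.repeat(3, 1, 1)


gray = torch.rand(1, 28, 28)  # stand-in for one greyscale image
rgb_like = to_three_channels(gray)
```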
6
votes
0 answers

Optimize pytorch data loader for reading small patches in full HD images

I'm training my neural network using PyTorch framework. The data is full HD images (1920x1080). But in each iteration, I just need to crop out a random 256x256 patch from these images. My network is relatively small (5 conv layers), and hence the…
Nagabhushan S N
  • 6,407
  • 8
  • 44
  • 87
5
votes
2 answers

Efficiently sample batches from only one class at each iteration with PyTorch

I want to train a classifier on ImageNet dataset (1000 classes) and I need each batch to contain 64 images from the same class and consecutive batches from different classes. So far based on @shai's suggestion and this post I have import…
Thoth
  • 993
  • 12
  • 36
5
votes
0 answers

Using PyTorch's dataloader on an "mps" device won't correctly load images

I'm trying out PyTorch's DCGAN Tutorial, using a Mac with an M1 chip. For some reason, when loading images with the Dataloader, the resulting samples are corrupted when using: device = torch.device("mps"). But when using the device =…
5
votes
1 answer

Custom Sampler correct use in Pytorch

I have a map-style dataset, which is used for instance segmentation tasks. The dataset is very imbalanced, in the sense that some images have only 10 objects while others have up to 1200. How can I limit the number of objects per batch? A minimal…
5
votes
1 answer

Getting image path through a torchvision dataloader using local images

I want to use a dataloader in my script. Normally the default function call would be like this. dataset = ImageFolderWithPaths( data_dir, transforms.Compose([ transforms.ColorJitter(0.1, 0.1, 0.1, 0.1), …
asdf qwer
  • 49
  • 3
5
votes
1 answer

torch.nn.CrossEntropyLoss over Multiple Batches

I am currently working with torch.nn.CrossEntropyLoss. As far as I know, it is common to compute the loss batch-wise. However, is there a possibility to compute the loss over multiple batches? More concretely, assume we are given the data import…
Vivian
  • 335
  • 2
  • 10
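
One way to read the question above, sketched on random data with assumed shapes: accumulate the loss with `reduction="sum"` across batches and divide by the total number of samples. This matches the mean loss over the concatenated batches even when the batch sizes differ.

```python
import torch

torch.manual_seed(0)
num_classes = 3

# Two batches of different sizes (random stand-in data)
logits_a, targets_a = torch.randn(4, num_classes), torch.randint(0, num_classes, (4,))
logits_b, targets_b = torch.randn(6, num_classes), torch.randint(0, num_classes, (6,))

# Sum the per-sample losses of each batch, then normalise once at the end
criterion = torch.nn.CrossEntropyLoss(reduction="sum")
total = criterion(logits_a, targets_a) + criterion(logits_b, targets_b)
multi_batch_loss = total / (len(targets_a) + len(targets_b))

# Reference: default mean-reduced loss on the concatenated batches
reference = torch.nn.CrossEntropyLoss()(
    torch.cat([logits_a, logits_b]), torch.cat([targets_a, targets_b])
)
```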
5
votes
0 answers

Efficient way of using numpy memmap when training neural network with pytorch

I'm training a neural network on a database of images. My images are of full HD (1920 x 1080) resolution, but for training, I use random crops of size 256x256. Since reading the full image and then cropping is not efficient, I'm using numpy memmap…
Nagabhushan S N
  • 6,407
  • 8
  • 44
  • 87