Questions tagged [torchdata]

7 questions
2
votes
0 answers

How to create a custom parallel corpus for machine translation with recent versions of pytorch and torchtext?

I am trying to train a model for NMT on a custom dataset. I found this great tutorial on YouTube along with the accompanying repo, but it uses an old version of PyTorch and torchtext. More recent versions of torchtext have removed the Field and…
1
vote
1 answer

How to add custom labels to a torchdata datapipe?

I am trying to load image data for model training from self-hosted S3 storage (MinIO). PyTorch provides new datapipes with this functionality in the torchdata library. So within my function to create the datapipe, I have these lines: dp_s3 =…
Roland Deschain
1
vote
1 answer

PyTorch Datapipes and how does overwriting the datapipe classes work?

PyTorch DataPipes are new in-place dataset loaders for large data that can be streamed into PyTorch models. For reference: Official doc: https://pytorch.org/data/main/tutorial.html A crash-course post explaining the usage…
alvas
0
votes
1 answer

Repeat batched elements in-epoch during training

I am training a (Siamese) neural network with PyTorch on a very big dataset. Loading data is the biggest bottleneck, and my dataset doesn't fit in RAM, so I can't keep it in memory to speed loading up. What I would like to do is basically cache part of the data, and repeat it inside…
rmeertens
0
votes
1 answer

How to properly make a train/test split using `torchdata`?

I've been using the torchdata library (v0.6.0) to construct datapipes for my machine learning model, but I can't seem to figure out how torchdata expects its users to make a train/test split. Supposing I have a datapipe dp, my first attempt was to…
user3002473
0
votes
1 answer

Exception: Unable to add DataPipe function name sharding_filter as it is already taken

torchdata.datapipes is not working in Google Colab. Even after installing the torchdata library, it raises an exception when datapipes functions are imported. I installed the dependencies !pip install torchdata or !pip install --pre torchdata -f…
SSK