Questions tagged [torchtext]

torchtext is the PyTorch text library, which provides data loaders and abstractions for natural language processing.

torchtext

The PyTorch torchtext package consists of data processing utilities and popular datasets for natural language.

158 questions
24
votes
4 answers

Torchtext 0.7 shows Field is being deprecated. What is the alternative?

Looks like the previous paradigm of declaring Fields, Examples and using BucketIterator is deprecated and will move to legacy in 0.8. However, I don't seem to be able to find an example of the new paradigm for custom datasets (as in, not the ones…
Paco
  • 443
  • 3
  • 10
10
votes
1 answer

How to create a torchtext.data.TabularDataset directly from a list or dict

torchtext.data.TabularDataset can be created from a TSV/JSON/CSV file and then it can be used for building the vocabulary from Glove, FastText or any other embeddings. But my requirement is to create a torchtext.data.TabularDataset directly, either…
Arjun Sankarlal
  • 2,655
  • 1
  • 9
  • 18
9
votes
5 answers

torchtext ImportError in colab

I am trying to run this tutorial in colab. However, when I try to import a bunch of modules: import io import torch from torchtext.utils import download_from_url from torchtext.data.utils import get_tokenizer from torchtext.vocab import…
9
votes
2 answers

Dataframe as datasource in torchtext

I have a dataframe with two columns (review and sentiment). I am using the PyTorch and torchtext libraries to preprocess the data. Is it possible to use a dataframe as the data source in torchtext? I am looking for something similar to, but…
Newbie
  • 530
  • 1
  • 10
  • 21
8
votes
1 answer

OverflowError: Python int too large to convert to C long torchtext.datasets.text_classification.DATASETS['AG_NEWS']()

I have a 64-bit Windows 10 OS. I have installed Python 3.6.8. I have installed torch and torchtext using pip. The torch version is 1.2.0. I am trying to load the AG_NEWS dataset using the code below: import torch import torchtext from torchtext.datasets import…
Pramod Patil
  • 757
  • 2
  • 10
  • 26
8
votes
1 answer

BucketIterator throws 'Field' object has no attribute 'vocab'

This is not a new question, but none of the references I found (first and second) offered a solution that worked for me. I'm a newbie to PyTorch, facing AttributeError: 'Field' object has no attribute 'vocab' while creating batches of the text data in PyTorch using…
Asif Ali
  • 1,422
  • 2
  • 12
  • 28
7
votes
5 answers

How to install torchtext 0.4.0 on conda

The torchtext 0.4.0 library exists (it can be downloaded through pip), but conda install torchtext=0.4.0 will not work. How can I download torchtext to an Anaconda environment?
gust
  • 878
  • 9
  • 23
6
votes
3 answers

Pruning model doesn't improve inference speed or reduce model size

I'm trying to prune my model in PyTorch with torch.nn.utils.prune, which provides 2 tensors: one is the original weight and the other is a mask containing 0s and 1s that helps us close certain connections in the network. I have tried both of the…
manaclan
  • 816
  • 9
  • 20
5
votes
2 answers

AttributeError: module 'torchtext' has no attribute 'legacy'

I am trying to use torchtext to process test data; however, I get the error "AttributeError: module 'torchtext' has no attribute 'legacy'" when I run the following code. Can anyone please tell me what the issue is here? I am using Python 3.10.4.…
Emrul
  • 51
  • 1
  • 4
5
votes
3 answers

TorchText Vocab TypeError: Vocab.__init__() got an unexpected keyword argument 'min_freq'

I am working on a CNN Sentiment analysis machine learning model which uses the IMDb dataset provided by the Torchtext library. On one of my lines of code vocab = Vocab(counter, min_freq = 1, specials=('\', '\', '\', '\')) I…
James B
  • 53
  • 1
  • 4
5
votes
1 answer

AttributeError:module 'torchtext.data' has no attribute 'TabularDataset'

I want to create a dataset from a TSV file with PyTorch. I was thinking of using torchtext.data.TabularDataset.splits, but I'm getting an error message. AttributeError:module 'torchtext.data' has no attribute 'TabularDataset'
ryotoitoi
  • 51
  • 1
  • 3
5
votes
2 answers

Torchtext AttributeError: 'Example' object has no attribute 'text_content'

I'm working with an RNN and using PyTorch & Torchtext. I've got a problem with building the vocab in my RNN. My code is as follows: TEXT = Field(tokenize=tokenizer, lower=True) LABEL = LabelField(dtype=torch.float) trainds = TabularDataset( …
5
votes
4 answers

how to save torchtext Dataset?

I'm working with text and use torchtext.data.Dataset. Creating the dataset takes a considerable amount of time. For just running the program this is still acceptable. But I would like to debug the torch code for the neural network. And if python is…
lhk
  • 27,458
  • 30
  • 122
  • 201
5
votes
1 answer

Iterating over Torchtext.data.BucketIterator object throws AttributeError 'Field' object has no attribute 'vocab'

When I try to look into a batch, by printing the next iteration of the BucketIterator object, the AttributeError is thrown. tv_datafields=[("Tweet",TEXT), ("Anger",LABEL), ("Fear",LABEL), ("Joy",LABEL), ("Sadness",LABEL)] train, vld =…
EinAeffchen
  • 90
  • 1
  • 7
4
votes
2 answers

Unable to build vocab for a torchtext text classification

I'm trying to prepare a custom dataset loaded from a csv file in order to use in a torchtext text binary classification problem. It's a basic dataset with news headlines and a market sentiment label assigned "positive" or "negative". I've been…
suiprocs1
  • 53
  • 5