The huggingface tag can be used for all libraries made by Hugging Face. Please ALWAYS use the more specific tags (huggingface-transformers, huggingface-tokenizers, huggingface-datasets) if your question concerns one of those libraries.
Questions tagged [huggingface]
606 questions
0
votes
0 answers
What should be the vocab of the tokenizer?
I am trying to use a tokenizer from huggingface. However, I do not have the vocab.
from tokenizers import BertWordPieceTokenizer, CharBPETokenizer, ByteLevelBPETokenizer
from tokenizers import Tokenizer
text = 'the quick brown fox jumped over the…

Palash Jhamb
0
votes
1 answer
Summarization with Huggingface: How to generate one word at a time?
I am using DistilBART for abstractive summarization. The generate() method is very straightforward to use. However, it returns complete, finished summaries. What I want is, at each step, to access the logits and then get the list of next-word…

Diego Miguel
0
votes
2 answers
Adding a pretrained model outside of AllenNLP to the AllenNLP demo
I am working on the interpretability of models. I want to use the AllenAI demo to check the saliency maps and adversarial attack methods (implemented in this demo) on some other models. I followed the tutorial here and ran the demo on my local machine. Now…

Roshanak
0
votes
0 answers
Huggingface distilbert-base-uncased-finetuned-sst-2-english runs out of ram with only a few kb?
My dataset is only 10 thousand sentences. I run it in batches of 100 and clear the memory on each run. I manually slice the sentences to only 50 characters. After running for 32 minutes, it crashes on Google Colab with 25 GB of RAM.
I must be…
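The preprocessing described in this question (cut each sentence to 50 characters, then process 100 sentences at a time) can be sketched in plain Python as below. The helper name `iter_batches` and the placeholder sentences are assumptions made for illustration; the model call is left out, so this shows only the batching step and does not address the memory problem itself.

```python
from typing import Iterator, List


def iter_batches(sentences: List[str], batch_size: int = 100,
                 max_chars: int = 50) -> Iterator[List[str]]:
    """Yield lists of at most `batch_size` sentences, each cut to `max_chars`."""
    for start in range(0, len(sentences), batch_size):
        batch = sentences[start:start + batch_size]
        yield [s[:max_chars] for s in batch]


# Placeholder data standing in for the 10 thousand sentences in the question.
sentences = [f"sentence number {i} " * 5 for i in range(10_000)]
batches = list(iter_batches(sentences))
print(len(batches), len(batches[0]), max(len(s) for s in batches[0]))
```

Because `iter_batches` is a generator, only one batch of truncated strings needs to be held at a time if the caller iterates over it directly instead of building a list.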
0
votes
0 answers
How to predict results from 20 million records using Hugging Face Model in minimum time
I am trying to predict sentiment for 20 million records using the model available in Hugging Face.
https://huggingface.co/finiteautomata/beto-sentiment-analysis
This model takes 1 hour and 20 minutes to predict 70000 records.
The model is saved…
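Taking the figures quoted above at face value (70000 records in 1 hour and 20 minutes) and assuming throughput stays constant, the time needed for 20 million records can be roughly estimated as follows. The function name `estimate_minutes` is hypothetical, and the result is only a linear extrapolation, not a measurement.

```python
def estimate_minutes(total_records: int, sample_records: int,
                     sample_minutes: float) -> float:
    """Linearly extrapolate run time from a measured sample."""
    records_per_minute = sample_records / sample_minutes
    return total_records / records_per_minute


# 1 hour 20 minutes = 80 minutes for 70000 records, as stated in the question.
minutes = estimate_minutes(20_000_000, 70_000, 80)
print(f"{minutes:.0f} minutes, about {minutes / 60 / 24:.1f} days")
```

At the measured rate this comes to roughly 381 hours, or just under 16 days.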

Kumar
0
votes
1 answer
Detect inquiry sentence in Wav2Vec 2.0 result
I am studying ASR (Automatic Speech Recognition) using Wav2Vec 2.0.
When I run Wav2Vec 2.0, I get the result without a period ("."), question mark ("?"), etc., so the result comes out as one whole sentence.
I know that I removed regex while making…

Giseok Ryu
0
votes
1 answer
Which Hugging Face model is best for taking a sentence as input and producing a word from that sentence as output?
What would be the best huggingface model to fine-tune for this type of task:
Example input 1:
If there's one person you don't want to interrupt in the middle of a sentence it's a judge.
Example output 1:
sentence
Example input 2:
A good baker will…

madhatter
0
votes
0 answers
ValueError: `Checkpoint` was expecting model to be a trackable object (an object derived from `Trackable`), got RobertaForSequenceClassification
I am training a text classification model using RoBERTa. (https://huggingface.co/siebert/sentiment-roberta-large-english)
I use Google Colab to run the code.
I am facing a ValueError. I have tried googling this error and have not gotten a…
0
votes
3 answers
How to enable header in text files of load_dataset in huggingface?
I am trying to load a text file using huggingface (https://huggingface.co/docs/datasets/v1.2.1/loading_datasets.html)
from datasets import load_dataset
dataset = load_dataset('text', data_files='my_file.txt')
This text file already contains…

Sachin
0
votes
1 answer
What is the return_dict parameter in BertModel used for?
I have something like this: model = BertModel.from_pretrained('bert-base-uncased', return_dict=True)
What exactly is this "return_dict" used for? What happens when it is True and what happens when it is False?

Alem
0
votes
1 answer
wandb getting logged without initiating
I do not want to use wandb. I don't even have an account. I am simply following this notebook for fine-tuning. I am not running the 2nd and 3rd cells because I do not want to push the model to the hub.
However, when I do trainer.train() I get the…

Kiera.K
0
votes
1 answer
Is there a way to load local csv.gz file in huggingface dataloader?
I'm using the datasets library by Hugging Face to load a CSV dataset stored locally. The problem is that the dataset is compressed and stored as a csv.gz file. Therefore, I'm not able to load it using the load_dataset('csv', '.csv') method in…
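Independent of the datasets library, a gzip-compressed CSV can be read with Python's standard library alone, which is one way to inspect or decompress such a file. This is a minimal sketch: the file name `data.csv.gz` and its columns are made up for the example, and it makes no claim about how `load_dataset` itself handles compression.

```python
import csv
import gzip
import os
import tempfile

# Create a small gzip-compressed CSV so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "data.csv.gz")  # hypothetical file name
with gzip.open(path, "wt", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["text", "label"])
    writer.writerow(["good movie", "1"])
    writer.writerow(["bad movie", "0"])

# Read it back: gzip.open in text mode yields decoded lines for csv.DictReader.
with gzip.open(path, "rt", newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))

print(rows)
```

Each row comes back as a dictionary keyed by the header line, so the decompressed records can be written to a plain .csv file or passed on to other tooling.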

Mahavir I
0
votes
1 answer
HuggingFace FinBert Model in Google Colab
When I run my FinBert model, it always crashes the RAM in Google Colab at outputs = model(**input)
from transformers.utils.dummy_pt_objects import HubertModel
import textwrap
# Reads all files at once but you will have to upload it again
import…

Vishrut Tiwari
0
votes
2 answers
Huggingface datasets ValueError
I am trying to load a dataset from huggingface organization, but I am getting the following error:
ValueError: Couldn't cast string
-- schema metadata --
pandas: '{"index_columns": [{"kind": "range", "name": null, "start": 0, "' + 686
to
{'text':…

TMN
-1
votes
0 answers
Huggingface's 'track-anything' returns "Found no NVIDIA driver on your system" even though I have a GeForce GTX 1080 Ti installed with latest driver
I'm trying to use the new track-anything app on Huggingface (https://huggingface.co/spaces/VIPLab/Track-Anything), but it fails to launch due to the error outlined in the title. This is the full log:
[notice] A new release of pip available: 22.3.1…

Sean Moran