The huggingface tag can be used for all libraries made by Hugging Face. Please ALWAYS use the more specific tags (huggingface-transformers, huggingface-tokenizers, huggingface-datasets) if your question concerns one of those libraries.
Questions tagged [huggingface]
606 questions
0
votes
0 answers
Loading data using Hugging Face gives an error
I am working on the Common Voice dataset, but when I try to execute the following code, it doesn't work.
from datasets import load_dataset, load_metric
common_voice_train = load_dataset("common_voice", "id", split="train+validation")
common_voice_test =…
0
votes
0 answers
How can I control data feeding order to model using Huggingface Trainer?
I want to train a model in the order in which the data are stored.
For example, if there are 100 examples, I want to feed the 1st and 2nd together (because I set batch_size=2 in code), then the 3rd and 4th, then the 5th and 6th together, and so…
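A minimal sketch of the ordering described in this question, written in plain Python rather than against the Trainer API: consecutive items are grouped into batches of two with no shuffling. The `sequential_batches` helper and the toy `data` list are hypothetical names used only for illustration. In PyTorch, a `DataLoader` created with `shuffle=False` yields batches in this same order; how to make `Trainer` use such an order depends on the library version and is not shown here.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def sequential_batches(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive, non-overlapping batches in storage order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        # Slicing keeps the original order; the last batch may be shorter.
        yield list(items[start:start + batch_size])


# Toy stand-in for a dataset of 100 examples (hypothetical data).
data = list(range(1, 101))
batches = list(sequential_batches(data, batch_size=2))
print(batches[:3])
```

The first three batches contain items 1 and 2, then 3 and 4, then 5 and 6, which is the feeding order the question asks for.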

DDANGEO
0
votes
1 answer
Can't use BloomAI locally
So I just finished installing the Bloom model from Hugging Face and tried to run it in my notebook.
Here's the code:
from transformers import AutoTokenizer, AutoModel
model_path = "D:/bloom"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model =…

duro
0
votes
0 answers
Huggingface progress bars shown despite disable_tqdm=True in Trainer
I'm running HuggingFace Trainer with TrainingArguments(disable_tqdm=True, ...) for fine-tuning the EleutherAI/gpt-j-6B model but there are still progress bars displayed (please see picture below). Does somebody know how to remove these progress…

BlackHawk
0
votes
0 answers
AWS lambda error when loading Transformers model
I made a post earlier about an error, which got fixed but led to another error. I wrote code in Python to summarize news articles, and the code works both on my laptop and after creating a Docker image for it.
try:
from bs4 import…

shkadov
0
votes
0 answers
Error: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
I'm running this code in an Azure Machine Learning notebook.
import os
import torch
import gradio as gr
from vilmedic import AutoModel
from vilmedic.blocks.scorers import RadGraph
import glob
model, processor =…

Mhmd Rokaimi
0
votes
1 answer
Is there a way to have the same ordering of labels for HuggingFace transformers pipeline?
This is the output I get after I append the outputs of a pipeline using facebook-bart-large-mnli model.
[{'labels': ['recreation', 'entertainment', 'travel', 'dining'],
'scores': [0.8873, 0.1528, 0.0002, 0.0001],
'sequence': 'laundromat'},
…
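One possible way to get a stable label order, sketched under assumptions: the output in the excerpt appears to list labels sorted by score, so the scores can be rearranged after the fact to follow an order chosen by the caller. The helper `scores_in_fixed_order` and the `label_order` list are hypothetical; only the `result` dictionary is taken from the excerpt.

```python
from typing import Any, Dict, List, Sequence


def scores_in_fixed_order(result: Dict[str, Any], label_order: Sequence[str]) -> List[float]:
    """Return the scores of one result, rearranged to follow label_order."""
    # Pair each label with its score so lookups no longer depend on position.
    by_label = dict(zip(result["labels"], result["scores"]))
    return [by_label[label] for label in label_order]


# Result copied from the excerpt above.
result = {
    "labels": ["recreation", "entertainment", "travel", "dining"],
    "scores": [0.8873, 0.1528, 0.0002, 0.0001],
    "sequence": "laundromat",
}

# Hypothetical fixed order chosen by the caller.
label_order = ["dining", "entertainment", "recreation", "travel"]
ordered = scores_in_fixed_order(result, label_order)
print(ordered)
```

Applying the same `label_order` to every appended result gives score lists whose columns always refer to the same labels.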

flighted
0
votes
1 answer
Cannot get DataCollator to prepare tf dataset
I’m trying to follow this tutorial to fine-tune BERT for a NER task using my own dataset: https://www.philschmid.de/huggingface-transformers-keras-tf. Below is my shortened code and the error caused by its last line. I’m new to all this,…

peetal
0
votes
0 answers
How to use gradio dataframe as output for an interface
So I have a scraper that collects data from different sources and then returns the result as a dataframe. I currently want to create a Gradio interface that, given a user input, would return the resulting dataframe as…

i'mgnome
0
votes
1 answer
sentiment140 dataset doesn't contain label 2 (i.e. neutral sentences) when loading it from Hugging Face
I want to work with the sentiment140 dataset for a sentiment analysis task, as I saw that it normally contains the following labels:
0 and 4 for negative and positive sentences
2 for neutral sentences
which I found when looking at the dataset on their website…
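A quick way to check which labels a loaded split actually contains is to count them, sketched here on toy rows. The column name `sentiment`, the `label_counts` helper, and the `toy_train` rows are assumptions for illustration; the label ids (0, 2, 4) are the ones listed in the question.

```python
from collections import Counter
from typing import Any, Dict, Iterable

# Label ids as listed in the question, with their usual reading.
LABEL_NAMES = {0: "negative", 2: "neutral", 4: "positive"}


def label_counts(rows: Iterable[Dict[str, Any]], label_key: str = "sentiment") -> Dict[str, int]:
    """Count rows per label name, reporting 0 for labels that never occur."""
    counts = Counter(row[label_key] for row in rows)
    return {name: counts.get(label, 0) for label, name in LABEL_NAMES.items()}


# Hypothetical rows standing in for one split of the dataset.
toy_train = [
    {"text": "a", "sentiment": 0},
    {"text": "b", "sentiment": 4},
    {"text": "c", "sentiment": 4},
]
train_counts = label_counts(toy_train)
print(train_counts)
```

Running the same count on each split separately shows whether label 2 is missing everywhere or only from the split being inspected.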

Khadija
0
votes
1 answer
without the encode_plus method in tokenizers, how to make a feature matrix
I am working on a low-resource language and need to make a classifier.
I used the tokenizers library to train the following tokenizers: WLV, BPE, UNI, WPC. I have saved the result of each into a json file.
I load each of the tokenizers using…
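A minimal sketch of building a fixed-width feature matrix by hand, assuming each text has already been turned into a list of token ids (for example via `tokenizer.encode(text).ids` in the tokenizers library). The helper `pad_to_matrix`, the padding id `0`, and the `encoded` lists are hypothetical choices for illustration, not values from the question.

```python
from typing import List, Sequence, Tuple


def pad_to_matrix(
    id_sequences: Sequence[Sequence[int]],
    max_length: int,
    pad_id: int = 0,
) -> Tuple[List[List[int]], List[List[int]]]:
    """Truncate or right-pad each id sequence to max_length.

    Returns (input_ids, attention_mask); the mask is 1 for real tokens
    and 0 for padding positions.
    """
    input_ids: List[List[int]] = []
    attention_mask: List[List[int]] = []
    for ids in id_sequences:
        kept = list(ids[:max_length])      # truncate long sequences
        padding = max_length - len(kept)   # how many pad tokens are needed
        input_ids.append(kept + [pad_id] * padding)
        attention_mask.append([1] * len(kept) + [0] * padding)
    return input_ids, attention_mask


# Hypothetical token ids for three texts of different lengths.
encoded = [[5, 17, 42], [8, 3], [9, 1, 7, 2, 6]]
matrix, mask = pad_to_matrix(encoded, max_length=4)
print(matrix)
print(mask)
```

The tokenizers library also offers `enable_padding` and `enable_truncation` on a tokenizer object, which may make a manual helper like this unnecessary.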

Areza
0
votes
1 answer
Unable to get Camel case tokens after tokenization in huggingface
I am trying to tokenize text by loading a vocab in huggingface.
vocab_path = '....' ## have a local vocab path
tokenizer = BertWordPieceTokenizer(os.path.join(vocab_path, "vocab.txt"), lowercase=False)
text = 'The Quick Brown fox'
output =…

Palash Jhamb
0
votes
1 answer
Getting KeyErrors when training Hugging Face Transformer
I am generally following this tutorial (https://huggingface.co/docs/transformers/training#:~:text=%F0%9F%A4%97%20Transformers%20provides%20access%20to,an%20incredibly%20powerful%20training%20technique.) to implement fine-tuning on a pretrained…

Ben O
0
votes
1 answer
Try to run an NLP model with an Electra instead of a BERT model
I want to run the wl-coref model with an Electra model instead of a Bert model. However, I get an error message with the Electra model and can't find a hint in the Huggingface documentation on how to fix it.
I tried different BERT models such as…

werdas34
0
votes
1 answer
Is it possible to perform local dev on a CPU-only machine on HF/sagemaker?
I'm trying to dev locally on sagemaker.huggingface.HuggingFace before moving to sagemaker for actual training. I set up a
HF_estimator = HuggingFace(entry_point='train.py', instance_type='local' ...)
And called HF_estimator.fit()
In train.py im…

plamb