Questions tagged [huggingface]

The huggingface tag can be used for all libraries made by Hugging Face. Please ALWAYS use the more specific tags (huggingface-transformers, huggingface-tokenizers, huggingface-datasets) if your question concerns one of those libraries.

606 questions
0
votes
0 answers

Loading Data using hugging face giving error

I am working on the Common Voice dataset, but when I try to execute the following code, it doesn't work. from datasets import load_dataset, load_metric common_voice_train = load_dataset("common_voice", "id", split="train+validation") common_voice_test =…
0
votes
0 answers

How can I control data feeding order to model using Huggingface Trainer?

I want to train the model in the order in which the data are stored. For example, if there are 100 samples, I want to feed the 1st and 2nd samples together (because I set batch_size=2 in code), then the 3rd and 4th, then the 5th and 6th, and so…
DDANGEO
  • 13
  • 5
0
votes
1 answer

Can't use BloomAI locally

So I just finished installing Bloom's model from Huggingface & I tried to run it in my notebook. Here's the code: from transformers import AutoTokenizer, AutoModel model_path = "D:/bloom" tokenizer = AutoTokenizer.from_pretrained(model_path) model =…
0
votes
0 answers

Huggingface progress bars shown despite disable_tqdm=True in Trainer

I'm running HuggingFace Trainer with TrainingArguments(disable_tqdm=True, ...) for fine-tuning the EleutherAI/gpt-j-6B model but there are still progress bars displayed (please see picture below). Does somebody know how to remove these progress…
BlackHawk
  • 719
  • 1
  • 6
  • 18
0
votes
0 answers

AWS lambda error when loading Transformers model

I made a post earlier about an error which got fixed but led to another error. So I created code in Python to summarize news articles, and the code works both on my laptop and after creating a Docker image for it. try: from bs4 import…
0
votes
0 answers

Error: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

I'm running this code on Azure machine learning notebook. import os import torch import gradio as gr from vilmedic import AutoModel from vilmedic.blocks.scorers import RadGraph import glob model, processor =…
0
votes
1 answer

Is there a way to have the same ordering of labels for HuggingFace transformers pipeline?

This is the output I get after I append the outputs of a pipeline using facebook-bart-large-mnli model. [{'labels': ['recreation', 'entertainment', 'travel', 'dining'], 'scores': [0.8873, 0.1528, 0.0002, 0.0001], 'sequence': 'laundromat'}, …
flighted
  • 11
  • 2
0
votes
1 answer

Cannot get DataCollator to prepare tf dataset

I’m trying to follow this tutorial to fine-tune bert for a NER task using my own dataset. https://www.philschmid.de/huggingface-transformers-keras-tf. Below is my shortened code, and the error due to the last line of the code. I’m new to all these,…
0
votes
0 answers

How to use gradio dataframe as output for an interface

So I have a scraper that collects data from different sources and then returns the result as a dataframe. I currently want to create a Gradio interface that, given a user input, would return the resulting dataframe as…
i'mgnome
  • 483
  • 1
  • 3
  • 17
0
votes
1 answer

sentiment140 dataset doesn't contain label 2, i.e. neutral sentences, when loading it from HuggingFace

I want to work with the sentiment140 dataset for a sentiment analysis task, as I saw that it normally contains the following labels: 0, 4 for pos and neg sentences, and 2 for neutral sentences, which I found when looking at the dataset on their website…
0
votes
1 answer

Without the encode_plus method in tokenizers, how to make a feature matrix

I am working on a low-resource language and need to make a classifier. I used the tokenizers library to train the following tokenizers: WLV, BPE, UNI, WPC. I have saved the result of each into a json file. I load each of the tokenizers using…
Areza
  • 5,623
  • 7
  • 48
  • 79
0
votes
1 answer

Unable to get Camel case tokens after tokenization in huggingface

I am trying to tokenize text by loading a vocab in huggingface. vocab_path = '....' ## have a local vocab path tokenizer = BertWordPieceTokenizer(os.path.join(vocab_path, "vocab.txt"), lowercase=False) text = 'The Quick Brown fox' output =…
0
votes
1 answer

Getting KeyErrors when training Hugging Face Transformer

I am generally following this tutorial (https://huggingface.co/docs/transformers/training#:~:text=%F0%9F%A4%97%20Transformers%20provides%20access%20to,an%20incredibly%20powerful%20training%20technique.) to implement fine-tuning on a pretrained…
Ben O
  • 228
  • 1
  • 15
0
votes
1 answer

Try to run an NLP model with an Electra instead of a BERT model

I want to run the wl-coref model with an Electra model instead of a Bert model. However, I get an error message with the Electra model and can't find a hint in the Huggingface documentation on how to fix it. I try different BERT models such like…
0
votes
1 answer

Is it possible to perform local dev on a CPU-only machine on HF/sagemaker?

I'm trying to dev locally on sagemaker.huggingface.HuggingFace before moving to sagemaker for actual training. I set up a HF_estimator = HuggingFace(entry_point='train.py', instance_type='local' ...) And called HF_estimator.fit() In train.py im…
plamb
  • 5,636
  • 1
  • 18
  • 31