Questions tagged [huggingface]

The huggingface tag can be used for all libraries made by Hugging Face. Please ALWAYS use the more specific tags (huggingface-transformers, huggingface-tokenizers, huggingface-datasets) if your question concerns one of those libraries.

606 questions
0
votes
0 answers

What should be the vocab of the tokenizer?

I am trying to use a tokenizer from huggingface. However, I do not have the vocab. from tokenizers import BertWordPieceTokenizer, CharBPETokenizer, ByteLevelBPETokenizer from tokenizers import Tokenizer text = 'the quick brown fox jumped over the…
0
votes
1 answer

Summarization with Huggingface: How to generate one word at a time?

I am using DistilBART for abstractive summarization. The generate() method is very straightforward to use. However, it returns complete, finished summaries. What I want is, at each step, to access the logits to then get the list of next-word…
0
votes
2 answers

Adding a pretrained model outside of AllenNLP to the AllenNLP demo

I am working on the interpretability of models. I want to use the AllenAI demo to check the saliency maps and adversarial attack methods (implemented in this demo) on some other models. I followed the tutorial here and ran the demo on my local machine. Now…
0
votes
0 answers

Huggingface distilbert-base-uncased-finetuned-sst-2-english runs out of RAM with only a few KB?

My dataset is only 10 thousand sentences. I run it in batches of 100, and clear the memory on each run. I manually slice the sentences to only 50 characters. After running for 32 minutes, it crashes... on Google Colab with 25 GB of RAM. I must be…
0
votes
0 answers

How to predict results from 20 million records using a Hugging Face model in minimum time

I am trying to predict sentiment for 20 million records using a model available on Hugging Face: https://huggingface.co/finiteautomata/beto-sentiment-analysis This model takes 1 hour and 20 minutes to predict 70,000 records. The model is saved…
0
votes
1 answer

Detect inquiry sentence in Wav2Vec 2.0 result

I am studying ASR (Automatic Speech Recognition) using Wav2Vec 2.0. When I run Wav2Vec 2.0, I get the result without a period ("."), question mark ("?"), etc. Therefore, the result comes out as one whole sentence. I know that I removed regex while making…
0
votes
1 answer

Which huggingface model is best for taking a sentence as input and returning a word from that sentence as output?

What would be the best huggingface model to fine-tune for this type of task: Example input 1: If there's one person you don't want to interrupt in the middle of a sentence it's a judge. Example output 1: sentence Example input 2: A good baker will…
madhatter • 15 • 3
0
votes
0 answers

ValueError: `Checkpoint` was expecting model to be a trackable object (an object derived from `Trackable`), got RobertaForSequenceClassification

I am training a text classification model using RoBERTa (https://huggingface.co/siebert/sentiment-roberta-large-english). I use Google Colab for running the code. I am facing a ValueError. I have tried googling this error and I have not gotten a…
0
votes
3 answers

How to enable header in text files of load_dataset in huggingface?

I am trying to load a text file using huggingface (https://huggingface.co/docs/datasets/v1.2.1/loading_datasets.html) from datasets import load_dataset dataset = load_dataset('text', data_files='my_file.txt') This text file already contains…
Sachin • 239 • 3 • 13
0
votes
1 answer

What is the return_dict parameter in BertModel used for?

I have something like this: model = BertModel.from_pretrained('bert-base-uncased', return_dict=True) What exactly is "return_dict" used for? What happens when it is True, and what happens when it is False?
Alem • 283 • 1 • 13
0
votes
1 answer

wandb logging starts without being initialized

I do not want to use wandb. I don't even have an account. I am simply following this notebook for fine-tuning. I am not running the 2nd and 3rd cells because I do not want to push the model to the hub. However, when I call trainer.train() I get the…
0
votes
1 answer

Is there a way to load a local csv.gz file in the huggingface dataloader?

I'm using the datasets library by huggingface to load a CSV dataset stored locally. The problem is that the dataset is compressed and stored as a csv.gz file. Therefore, I'm not able to load it using the load_dataset('csv', '.csv') method in…
Mahavir I • 93 • 6
0
votes
1 answer

HuggingFace FinBert Model in Google Colab

When I run my FinBert model, it always exhausts the RAM in Google Colab at outputs = model(**input) from transformers.utils.dummy_pt_objects import HubertModel import textwrap # Reads all files at once but you will have to upload it again import…
0
votes
2 answers

Huggingface datasets ValueError

I am trying to load a dataset from the huggingface organization, but I am getting the following error: ValueError: Couldn't cast string -- schema metadata -- pandas: '{"index_columns": [{"kind": "range", "name": null, "start": 0, "' + 686 to {'text':…
TMN • 63 • 1 • 2 • 10
-1
votes
0 answers

Huggingface's 'track-anything' returns "Found no NVIDIA driver on your system" even though I have a GeForce GTX 1080 Ti installed with the latest driver

I'm trying to use the new track-anything app on Huggingface (https://huggingface.co/spaces/VIPLab/Track-Anything), but it fails to launch due to the error outlined in the title. This is the full log: [notice] A new release of pip available: 22.3.1…