The huggingface tag can be used for all libraries made by Hugging Face. Please ALWAYS use the more specific tags (huggingface-transformers, huggingface-tokenizers, huggingface-datasets) if your question concerns one of those libraries.
Questions tagged [huggingface]
606 questions
2
votes
0 answers
Black images or memory issue with Hugging Face StableDiffusion pipeline, M1 Pro, PyTorch
So I'm making a project for a school that offers image generation using Stable Diffusion. It was working perfectly fine until I upgraded the PyTorch version for the "stabilityai/stable-diffusion-x4-upscaler" model. Since then all the other images…

itsDanial
- 105
- 1
- 10
2
votes
1 answer
XLNet or BERT Chinese for HuggingFace AutoModelForSeq2SeqLM Training
I want to use the pre-trained XLNet (xlnet-base-cased, whose model type is Text Generation) or BERT Chinese (bert-base-chinese, whose model type is Fill Mask) for Sequence to Sequence Language Model (Seq2SeqLM) training.
I can use…

Raptor
- 53,206
- 45
- 230
- 366
2
votes
1 answer
What is the correct way to create a feature extractor for a Hugging Face (HF) ViT model?
TL;DR: Is the correct way to extract features from a HF ViT model outputs.pooler_output or outputs.last_hidden_state[:, 0], where outputs is defined as outputs: BaseModelOutputWithPooling = self.model(pixel_values=batch_xs)?
Given only the ViT model it's not…

Charlie Parker
- 5,884
- 57
- 198
- 323
2
votes
0 answers
PyTorch with Transformer - finetune GPT2 throws index out of range Error
In my Jupyter notebook I have the following code. I cannot figure out why it throws an IndexError: index out of range in self error.
Here is the code:
!pip install torch
!pip install torchvision
!pip install transformers
import torch
from…

Peter Shaw
- 1,867
- 1
- 19
- 32
2
votes
1 answer
Not able to use map() or select(range()) with Hugging Face Datasets library, gives dill._dill has no attribute log
I'm not able to call dataset.map() or dataset.select(range(10)) with the Hugging Face Datasets library in Colab. It says dill._dill has no attribute log
I have tried with different dill versions, but no luck.
I tried with older versions of dill lib but…

user21401461
- 41
- 3
2
votes
1 answer
How to resolve "the size of tensor a (1024) must match the size of tensor b" in happytransformer
I have the following code. This code uses the GPT-2 language model from the Transformers library to generate text from a given input text. The input text is split into smaller chunks of 1024 tokens, and then the GPT-2 model is used to generate text…

littleworth
- 4,781
- 6
- 42
- 76
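The chunking step described in the excerpt above (splitting the input into pieces of at most 1024 tokens, which is GPT-2's context length) can be sketched in plain Python. This is a minimal illustration, not the asker's code: `chunk_token_ids` is a hypothetical helper, and the token IDs are assumed to have already been produced by a tokenizer.

```python
from typing import List


def chunk_token_ids(token_ids: List[int], max_len: int = 1024) -> List[List[int]]:
    """Split a flat list of token IDs into consecutive chunks of at most max_len."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    # Step through the list in strides of max_len; the last chunk may be shorter.
    return [token_ids[i:i + max_len] for i in range(0, len(token_ids), max_len)]


# Stand-in token IDs (hypothetical data, not real tokenizer output).
ids = list(range(2500))
chunks = chunk_token_ids(ids)
print([len(c) for c in chunks])  # [1024, 1024, 452]
```

Note that if the model also has to generate new tokens, the prompt chunk plus the generated tokens must fit within the 1024-position window together, so a smaller `max_len` would be needed in that case.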
2
votes
1 answer
How to calculate the image similarity of two given images using the OpenAI CLIP model - which method / AI model is best for calculating image similarity?
I have prepared a small code example, but it throws an error. I can't solve the problem because the code is supposed to work.
Also, do you think there are any better approaches to calculating image similarity? I want to find similar cloth images, e.g. I will…

Furkan Gözükara
- 22,964
- 77
- 205
- 342
2
votes
0 answers
How to ensure last token in sequence is end-of-sequence token?
I am using the gpt2 model from huggingface's transformers library. When tokenizing, I would like all sequences to end in the end-of-sequence (EOS) token. How can I do this?
An easy solution is to manually append the EOS token to each sequence in a…

BioBroo
- 613
- 1
- 7
- 21
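The manual approach mentioned in the excerpt above, appending the EOS token to each sequence, can be sketched on plain lists of token IDs. This is a hedged illustration rather than the asker's code: `ensure_eos` is a hypothetical helper, and 50256 is assumed as the EOS ID (the ID of `<|endoftext|>` in the standard GPT-2 vocabulary; in practice it would be read from `tokenizer.eos_token_id`).

```python
from typing import List

# Assumption: EOS ID of the standard GPT-2 vocabulary (<|endoftext|>).
GPT2_EOS_ID = 50256


def ensure_eos(token_ids: List[int], eos_id: int = GPT2_EOS_ID) -> List[int]:
    """Return a copy of token_ids that is guaranteed to end with eos_id."""
    if token_ids and token_ids[-1] == eos_id:
        # Already terminated: return an unchanged copy.
        return list(token_ids)
    return list(token_ids) + [eos_id]


# Illustrative token IDs only; the second sequence already ends with EOS.
batch = [[15496, 995], [40, 588, 50256]]
print([ensure_eos(seq) for seq in batch])
# [[15496, 995, 50256], [40, 588, 50256]]
```

If sequences are truncated to a maximum length, the EOS token would need to be appended after truncation so that it is not cut off.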
2
votes
0 answers
Is there a way to download only part of a dataset from Hugging Face?
I'm trying to load the People's Speech dataset, but it's way too big. Is there a way to download only part of it?
from datasets import load_dataset
train = load_dataset("MLCommons/peoples_speech",…

FOXASDF
- 43
- 3
2
votes
1 answer
How to define prompt weights for Hugging Face's diffusers.StableDiffusionInpaintPipeline?
I am tweaking a Python script that uses the diffusers inpainting pipeline for a custom video generation idea.
I would like to gradually shift the weights of certain words in the prompt.
As I understand it, the argument prompt_embeds is exactly what I need.
I…

Bálint Komjáti
- 33
- 4
2
votes
1 answer
Arrow-related error when pushing a dataset to the Hugging Face Hub
I have a problem with my dataset:
The (future) dataset is a pandas DataFrame that I loaded from a pickle file; the DataFrame behaves correctly. My code is:
dataset.from_pandas(df)
dataset.push_to_hub("username/my_dataset",…

Tsadoq
- 224
- 3
- 17
2
votes
0 answers
Generating text word by word for transformers
I’m currently using GPT-J for generating text as shown below. This works well but it takes up to 5 seconds to generate the 100 tokens.
Is it possible to do the generation word by word or sentence by sentence? Similar to what ChatGPT is doing…

BlackHawk
- 719
- 1
- 6
- 18
2
votes
0 answers
How can I hide my source code in HuggingFace spaces?
I've been trying to figure out how to hide my source code in public HuggingFace spaces, but I wasn't able to find a solution for this. Here is what I've read and tried:
Using Google Actions to rely on tokens, but the code is synced of course to the…

Esraa Abdelmaksoud
- 1,307
- 12
- 25
2
votes
1 answer
AttributeError: module 'dill._dill' has no attribute 'log'
I am using a python nlp module to train a dataset and ran into the following error:
File "/usr/local/lib/python3.9/site-packages/nlp/utils/py_utils.py", line 297, in save_code
dill._dill.log.info("Co: %s" % obj)
AttributeError: module…

Fernando
- 57
- 1
- 4
2
votes
0 answers
Reducing Latency for GPT-J
I'm using GPT-J locally on an NVIDIA RTX 3090 GPU. Currently, I'm using the model in the following way:
config = transformers.GPTJConfig.from_pretrained("EleutherAI/gpt-j-6B")
tokenizer =…

BlackHawk
- 719
- 1
- 6
- 18