Questions tagged [huggingface-tokenizers]
Use this tag for questions related to the tokenizers project from Hugging Face. GitHub: https://github.com/huggingface/tokenizers
451 questions
0 votes · 0 answers
Importing Simple Transformer
I am facing this error when trying to import simpletransformers:
from simpletransformers.classification import ClassificationModel, ClassificationArgs
Error:
cannot import name 'Unigram' from 'tokenizers.models'…

SK Singh · 153
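This import error typically means the installed tokenizers package predates the Unigram model class that the transformers release bundled with simpletransformers expects. A minimal check-and-fix sketch, assuming a notebook environment (the upgrade approach is an assumption, not taken from the question):

import tokenizers
print(tokenizers.__version__)  # check what is actually installed

# Reinstall a matched set so simpletransformers, transformers and tokenizers agree;
# run in a shell or notebook cell:
#   pip install --upgrade simpletransformers transformers tokenizers

from tokenizers.models import Unigram  # should now import cleanly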
0 votes · 1 answer
ValueError: logits and labels must have the same shape ((1, 21) vs (21, 1))
I am trying to reproduce this example using the huggingface TFBertModel to do a classification task.
My model is almost the same as the example's, but I'm performing multilabel classification. For this reason, I've performed the binarization of my labels…

revy · 647
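A mismatch of (1, 21) against (21, 1) usually means the labels arrive as a column vector while the model emits one row of 21 logits per example; reshaping the binarized labels to (batch_size, num_labels) before fitting normally resolves it. A minimal sketch with illustrative shapes:

import numpy as np

num_labels = 21
labels = np.zeros((num_labels, 1))       # column vector, as in the error message
labels = labels.reshape(-1, num_labels)  # now (1, 21), matching the logits
print(labels.shape)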
0 votes · 0 answers
How to get a single-text prediction from a customised BERT classification + PyTorch NLP model, with or without a DataLoader
I have used BERT with HuggingFace and PyTorch, with a DataLoader and serializer for training and evaluation. Below is the code for that:
! pip install transformers==3.5.1
from transformers import AutoModel, BertTokenizerFast
bert =…

Deshwal · 3,436
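For a single text the DataLoader can be skipped entirely: tokenize one string, run the model in eval mode with gradients disabled, and take the argmax. A minimal sketch, assuming a fine-tuned sequence-classification checkpoint (the model name is a stand-in):

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "bert-base-uncased"  # replace with the fine-tuned model's path
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
model.eval()

inputs = tokenizer("an example sentence", return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs)[0]       # indexing works for tuple and ModelOutput returns
print(logits.argmax(dim=-1).item())   # predicted class id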
0 votes · 0 answers
Increase speed of Huggingface tokenizer output
I need to get the last layer of embeddings from a BERT model using HuggingFace. The following code works, but it is extremely slow; how can I increase the speed?
This is a toy example, my real data is made of thousands of examples with long…

Ushuaia81 · 495
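Two common speedups are using the fast (Rust) tokenizer on whole batches and running the encoder once per batch on GPU with gradients disabled, instead of looping text by text. A minimal sketch (model name and batch contents are illustrative):

import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased", use_fast=True)
model = AutoModel.from_pretrained("bert-base-uncased")
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device).eval()

texts = ["first example", "second example"]  # thousands of texts in practice
with torch.no_grad():
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt").to(device)
    last_hidden = model(**batch)[0]          # (batch, seq_len, hidden_size)
print(last_hidden.shape)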
0 votes · 0 answers
PyTorch + BERT + batch_encode_plus(): code running fine in Colab but producing mismatched input shapes on Kaggle
I used a notebook initialised on Google Colab in Kaggle and found strange behaviour, as it gave me something like:
16 # text2tensor
---> 17 train_seq,train_mask,train_y = textToTensor(train_text,train_labels,pad_len)
18 …

Deshwal · 3,436
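Colab and Kaggle often pin different transformers versions, and the padding/truncation defaults of batch_encode_plus changed between releases, so the same call can yield different shapes. Passing both arguments explicitly makes the output shape deterministic on either platform. A sketch reusing the question's pad_len (other values are illustrative):

from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
pad_len = 32
enc = tokenizer.batch_encode_plus(
    ["a short text", "another text"],
    max_length=pad_len,
    padding="max_length",  # pad every sequence to exactly pad_len
    truncation=True,       # and cut longer ones down to it
    return_tensors="pt",
)
print(enc["input_ids"].shape, enc["attention_mask"].shape)  # both (2, pad_len)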
0 votes · 0 answers
BERT zero-layer fixed word embeddings
I want to run an experiment with BERT zero-layer vectors (input vectors), which I understand are of dimension 128.
I cannot find where to get a file with the tokens and their vectors.
Is there such a thing?
Is there a file in the GloVe/word2vec…

CSBS · 1
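There is no separate GloVe-style file, but the zero-layer (input) vectors can be read straight out of any checkpoint: the word-embedding matrix maps every vocabulary token to its fixed vector. (A 128-dimensional embedding suggests an ALBERT-style model; bert-base uses 768.) A minimal sketch with an illustrative model name:

from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

emb = model.get_input_embeddings().weight  # (vocab_size, hidden_size)
print(emb.shape)

token_id = tokenizer.convert_tokens_to_ids("dog")
vector = emb[token_id]                     # that token's static input vector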
0 votes · 0 answers
How to download the pretrained dataset of huggingface RagRetriever to a custom directory
I'm playing with a RAG example from Facebook (huggingface): https://huggingface.co/facebook/rag-token-nq#usage.
Here is a very nice explanation of it:…

JoseM LM · 373
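The retrieval dataset behind RagRetriever is fetched through the datasets library, so that library's cache location, not transformers' cache, decides where the large download lands; pointing HF_DATASETS_CACHE at a custom directory is one way to relocate it. A sketch (the path is illustrative, this is an assumption about the download mechanism, and the variable must be set before the datasets library loads):

import os
os.environ["HF_DATASETS_CACHE"] = "/data/hf_datasets"  # assumed custom download directory

from transformers import RagRetriever

retriever = RagRetriever.from_pretrained(
    "facebook/rag-token-nq",
    index_name="exact",
    use_dummy_dataset=True,  # small stand-in; drop this for the full wiki index
)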
0 votes · 1 answer
How to make a byte-level tokenizer not split the token?
I have text with custom tokens, like <adjective>, and I am trying to prepare a byte-level tokenizer that won't split them:
tokenizer.pre_tokenizer = ByteLevel()
tokenizer.pre_tokenizer.pre_tokenize("<adjective>")
[('Ġ<', (0, 2)), ('adjective',…

artona · 1,086
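Tokens registered as special tokens are matched before the byte-level pre-tokenizer runs, so they are never split. A sketch using the fast GPT-2 tokenizer from transformers, which is byte-level (the placeholder token mirrors the question; the model id is illustrative):

from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.add_special_tokens({"additional_special_tokens": ["<adjective>"]})
print(tokenizer.tokenize("a <adjective> day"))  # '<adjective>' stays one token

If the tokenizer feeds a model, remember to call model.resize_token_embeddings(len(tokenizer)) after adding tokens.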
0 votes · 0 answers
"ValueError: You have to specify either input_ids or inputs_embeds" when using Trainer
I am getting "ValueError: You have to specify either input_ids or inputs_embeds" from a seemingly straightforward training example:
Iteration: 0%| …

Yevgeniy · 1,313
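Trainer raises this when the batches it builds contain no input_ids column, which usually means the raw text dataset was passed in untokenized; mapping the tokenizer over the dataset first gives Trainer the tensors it expects. A minimal sketch with the datasets library (column names and data are illustrative):

from datasets import Dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
raw = Dataset.from_dict({"text": ["good", "bad"], "label": [1, 0]})

def tokenize(batch):
    return tokenizer(batch["text"], padding="max_length", truncation=True)

train_dataset = raw.map(tokenize, batched=True)  # adds input_ids and attention_mask
# Trainer(model=..., train_dataset=train_dataset, ...) now finds input_ids.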
0 votes · 1 answer
HuggingFace Transformers: BertTokenizer changing characters
I have downloaded the Norwegian BERT model from https://github.com/botxo/nordic_bert and loaded it using:
import transformers as t
model_class = t.BertModel
tokenizer_class = t.BertTokenizer
tokenizer =…

Christian Vennerød · 21
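BertTokenizer lowercases and can strip accents during basic tokenization, which alters Norwegian characters if the vocabulary was built cased; loading with those switches off usually preserves them. A sketch (the local path is illustrative, and the strip_accents argument assumes a reasonably recent transformers release):

import transformers as t

tokenizer = t.BertTokenizer.from_pretrained(
    "path/to/nordic_bert",
    do_lower_case=False,  # keep casing as the vocabulary expects
    strip_accents=False,  # do not normalise characters such as å away
)
print(tokenizer.tokenize("blåbærsyltetøy"))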
0 votes · 1 answer
Getting started: Huggingface Model Cards
I just recently started looking into the huggingface transformers library.
When I tried to get started using the model-card code of, e.g., a community model:
from transformers import AutoTokenizer, AutoModel
tokenizer =…

Lukas · 61
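Model-card snippets generally continue like this: both Auto classes take the model id shown on the card and download the weights on first use. A sketch with an illustrative model id:

from transformers import AutoTokenizer, AutoModel

model_id = "bert-base-uncased"  # replace with the id from the model card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

inputs = tokenizer("Hello world", return_tensors="pt")
outputs = model(**inputs)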
0 votes · 1 answer
attention_mask is missing in the returned dict from tokenizer.encode_plus
I have a codebase which was working fine, but when I tried to run it today, I observed that tokenizer.encode_plus stopped returning attention_mask. Was it removed in the latest release? Or do I need to do something else?
The following piece of code…

Wasi Ahmad · 35,739
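Whether encode_plus returns attention_mask depends on the return_attention_mask flag and on per-model defaults that have shifted between releases, so requesting the mask explicitly makes the code robust across versions. A sketch:

from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
enc = tokenizer.encode_plus(
    "some text",
    max_length=16,
    padding="max_length",
    truncation=True,
    return_attention_mask=True,  # ask for the mask explicitly
)
print(enc.keys())                # now includes 'attention_mask'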
-1 votes · 1 answer
How to fine-tune a model from Hugging Face?
I want to download a pretrained model and fine-tune it with my own data. I have downloaded the bert-large-NER model artifacts from Hugging Face and listed the contents below. Being new to this, I want to know what files or artifacts I…

kyagu · 155
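The downloaded folder (config.json, the weights file, and the vocab/tokenizer files) is everything from_pretrained needs; point the token-classification classes at it and fine-tune with Trainer. A compressed sketch (paths and training arguments are illustrative, and the tokenized NER dataset is left out):

from transformers import (AutoTokenizer, AutoModelForTokenClassification,
                          Trainer, TrainingArguments)

model_dir = "./bert-large-NER"  # folder holding the downloaded artifacts
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForTokenClassification.from_pretrained(model_dir)

args = TrainingArguments(output_dir="./finetuned", num_train_epochs=3,
                         per_device_train_batch_size=8)
trainer = Trainer(model=model, args=args,
                  train_dataset=None)  # supply a tokenized NER dataset here
# trainer.train()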
-1 votes · 3 answers
Hugging Face: NameError: name 'sentences' is not defined
I am following this tutorial: https://huggingface.co/transformers/training.html. However, I am coming across an error, and I think the tutorial is missing an import, but I do not know which.
These are my current imports:
# Transformers…
user16098918
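The tutorial snippet assumes a variable holding the training texts, and the NameError just means it was never defined; defining it before the tokenizer call fixes the example (the contents are illustrative):

from transformers import BertTokenizer

sentences = ["This restaurant was great.",  # any list of training texts
             "The service was slow."]

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")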
-1 votes · 2 answers
How do I prevent a lack of VRAM halfway through training a Huggingface Transformers (Pegasus) model?
I'm taking a pre-trained Pegasus model through Huggingface Transformers (specifically google/pegasus-cnn_dailymail, using Huggingface Transformers through PyTorch), and I want to fine-tune it on my own data. This is however quite a large…

Lara · 2,594
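Running out of VRAM partway through training usually means the effective batch, or one long sequence, no longer fits; shrinking the per-device batch and compensating with gradient accumulation, plus fp16, are the usual levers. A sketch of the relevant TrainingArguments (values are illustrative):

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="./pegasus-finetuned",
    per_device_train_batch_size=1,   # smallest step that fits in memory
    gradient_accumulation_steps=16,  # still an effective batch of 16
    fp16=True,                       # roughly halves activation memory on most GPUs
)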