Questions tagged [huggingface-datasets]

Use this tag for questions related to the datasets project from Hugging Face. Project on GitHub: https://github.com/huggingface/datasets

221 questions
0
votes
1 answer

ValueError: Please pass `features` or at least one example when writing data

I'm new to huggingface and am working on a movie generation script. So far my code looks like this: from transformers import GPT2Tokenizer, GPTNeoModel from datasets import load_dataset dataset =…
Ulto 4
  • 368
  • 4
  • 16
0
votes
1 answer

Key error when feeding the training corpus to the train_new_from_iterator method

I am following this tutorial here: https://github.com/huggingface/notebooks/blob/master/examples/tokenizer_training.ipynb So, using this code, I add my custom dataset: from datasets import load_dataset dataset = load_dataset('csv',…
0
votes
1 answer

Setting `remove_unused_columns=False` causes error in HuggingFace Trainer class

I am training a model using HuggingFace Trainer class. The following code does a decent job: !pip install datasets !pip install transformers from datasets import load_dataset from transformers import AutoModelForSequenceClassification,…
0
votes
2 answers

how to use deberta model from hugging face and use .compile() and .summary() with it

I used this code to load weights: from transformers import DebertaTokenizer, DebertaModel import torch tokenizer = DebertaTokenizer.from_pretrained('microsoft/deberta-base') model = DebertaModel.from_pretrained('microsoft/deberta-base') after that…
0
votes
2 answers

Problem with batch_encode_plus method of tokenizer

I am encountering a strange issue in the batch_encode_plus method of the tokenizers. I have recently switched from transformers version 3.3.0 to 4.5.1. (I am creating my databunch for NER.) I have 2 sentences which I need to encode, and I have a case…
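One breaking change between transformers 3.x and 4.x that often bites NER pipelines is that the `is_pretokenized` argument of `batch_encode_plus` was renamed to `is_split_into_words`. A minimal offline sketch, using a tokenizer built from a tiny throw-away vocab so nothing is downloaded (the vocab itself is a toy assumption):

```python
import os
import tempfile
from transformers import BertTokenizerFast

# write a tiny vocab file so no model download is needed
vocab = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]", "hello", "world"]
with tempfile.TemporaryDirectory() as d:
    vocab_path = os.path.join(d, "vocab.txt")
    with open(vocab_path, "w") as f:
        f.write("\n".join(vocab))
    tok = BertTokenizerFast(vocab_file=vocab_path)

    # pre-tokenized input: in 4.x pass is_split_into_words=True
    # (this replaced is_pretokenized from the 3.x API)
    enc = tok.batch_encode_plus([["hello", "world"]], is_split_into_words=True)

tokens = tok.convert_ids_to_tokens(enc["input_ids"][0])
assert tokens == ["[CLS]", "hello", "world", "[SEP]"]
```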
0
votes
0 answers

ValueError: Input is not valid. Should be a string, a list/tuple of strings or a list/tuple of integers

from os import listdir from os.path import isfile, join from datasets import load_dataset from transformers import BertTokenizer test_files = [join('./test/', f) for f in listdir('./test') if isfile(join('./test', f))] dataset =…
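That ValueError usually means the tokenizer received a non-string value (e.g. None or a number) from one of the files. A minimal stdlib-only sketch of the usual cleanup (the rows and the `text` field name are assumptions):

```python
# hypothetical rows, standing in for records read from the test files
rows = [{"text": "a valid sentence"}, {"text": None}, {"text": 42}]

# coerce every value to a string (None becomes "") before tokenizing
cleaned = ["" if r["text"] is None else str(r["text"]) for r in rows]

assert cleaned == ["a valid sentence", "", "42"]
# cleaned can now be passed to the tokenizer safely
```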
0
votes
1 answer

KeyError: "None of ['index'] are in the columns"

Here is a json file : { "id": "68af48116a252820a1e103727003d1087cb21a32", "article": [ "by mark duell .", "published : .", "05:58 est , 10 september 2012 .", "| .", "updated : .", "07:38 est ,…
Michael
  • 19
  • 6
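That KeyError is pandas' complaint from `set_index("index")` (or a conversion that calls it) when no column named `index` exists. A minimal sketch of a workaround, assuming the JSON has been loaded into a pandas DataFrame (the toy record is from the excerpt above):

```python
import pandas as pd

# stand-in for the loaded JSON record
df = pd.DataFrame({
    "id": ["68af48116a252820a1e103727003d1087cb21a32"],
    "article": [["by mark duell .", "published : ."]],
})

# df.set_index("index") would raise KeyError because no such column exists;
# reset_index() materializes the default RangeIndex as an "index" column first
df = df.reset_index()

assert "index" in df.columns
```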
0
votes
2 answers

File name too long

In a local repository, I have several json files. When I run the command from datasets import load_dataset dataset = load_dataset('json', data_files=['./100009.json']) I got the following error: OSError: [Errno 36] File name too long:…
Michael
  • 19
  • 6
-1
votes
1 answer

Create DataFrame from Object HuggingFace

I recently downloaded a dataset from HuggingFace. I've used datasets.Dataset.load_dataset() and it gives me a Dataset backed by an Apache Arrow table, so I am having trouble exporting the data into a DataFrame to work with pandas. The…
-1
votes
1 answer

Creating a function on Digital Ocean for hugging face

Hugging Face provides transformers and models that allow AI/ML processing offline - https://huggingface.co/ We currently use Digital Ocean and I would like to offload our ML onto DO functions. I know AWS does this already with a few AWS…
-1
votes
3 answers

Hugging Face: NameError: name 'sentences' is not defined

I am following this tutorial here: https://huggingface.co/transformers/training.html - though, I am coming across an error, and I think the tutorial is missing an import, but I do not know which. These are my current imports: # Transformers…
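That tutorial snippet tokenizes a `sentences` list it never defines, so you need to supply one yourself before the tokenizer call. A minimal stdlib sketch (the example sentences are assumptions):

```python
# define the list the tutorial code expects before it is used
sentences = [
    "We are very happy to show you the Transformers library.",
    "We hope you don't hate it.",
]

# then, with a tokenizer loaded, e.g.:
# batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

assert all(isinstance(s, str) for s in sentences)
```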