Hugging Face: NameError: name 'sentences' is not defined

Question

I am following this tutorial: https://huggingface.co/transformers/training.html. However, I am running into an error. I think the tutorial is missing an import, but I do not know which one.

These are my current imports:

# Transformers installation
! pip install transformers
# To install from source instead of the last release, comment the command above and uncomment the following one.
# ! pip install git+https://github.com/huggingface/transformers.git

! pip install datasets transformers

from transformers import pipeline

Current code:

from datasets import load_dataset

raw_datasets = load_dataset("imdb")

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

inputs = tokenizer(sentences, padding="max_length", truncation=True)

The error:

NameError                                 Traceback (most recent call last)

<ipython-input-9-5a234f114e2e> in <module>()
----> 1 inputs = tokenizer(sentences, padding="max_length", truncation=True)

NameError: name 'sentences' is not defined

score 2 · Answer 1 · answered Nov 12 '21 at 18:19


This error occurs because you have not defined sentences. To tokenize the IMDB reviews, pull the text column out of the raw dataset:

k = raw_datasets['train']
sentences = k['text']
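
Note that k['text'] covers the whole training split, and padding="max_length" pads every review to the model's maximum length, so tokenizing everything at once can be slow. A minimal sketch for trying it on a few reviews first, assuming the "text" column can be sliced like a list; tokenize_reviews and limit are hypothetical names, not part of the tutorial:

```python
def tokenize_reviews(tokenizer, split, limit=8):
    """Tokenize the first `limit` texts of a dataset split.

    `tokenizer` is any callable with the call shape used in the question;
    `split` is anything whose "text" entry can be sliced (assumption).
    """
    sentences = list(split["text"][:limit])
    return tokenizer(sentences, padding="max_length", truncation=True)


# Hypothetical usage with the objects from the question (needs the download):
# inputs = tokenize_reviews(tokenizer, raw_datasets["train"])
```

Once the small sample works, you can raise limit or tokenize the full column.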


Shivani Gowda


score 1 · Answer 2 · answered Jun 14 '21 at 15:16


Create the variable yourself, for example:

sentences = ["Hello I'm a single sentence",
             "And another sentence",
             "And the very very last one"]

The tutorial itself notes that the line is only illustrative: "As we saw in Preprocessing data, we can prepare the text inputs for the model with the following command (this is an example, not a command you can execute)".


darylvickerman


Hello - thanks for the reply. What exactly are these sentences used for? – Jun 14 '21 at 15:30
Hiya, these sentences come from a different tutorial on the same website. The quote above is from the tutorial you are currently following. I am just showing an example of what you could add to get rid of the error – darylvickerman Jun 14 '21 at 16:02

Ivaylo Strandjev · Accepted Answer · 2021-06-14T15:08:55.950

0

The error states that you do not have a variable called sentences in the scope. I believe the tutorial presumes you already have a list of sentences and are tokenizing it.

Have a look at the documentation: the first argument can be a string, a list of strings, or a list of lists of strings.

__call__(text: Union[str, List[str], List[List[str]]],...)
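
To check which of those three shapes your own variable has before handing it to the tokenizer, something like the following could help. It is only a sketch in plain Python; describe_tokenizer_text is a hypothetical helper, not part of transformers:

```python
from typing import Any


def describe_tokenizer_text(text: Any) -> str:
    """Return which shape from the signature above `text` matches."""
    if isinstance(text, str):
        return "str"
    if isinstance(text, list) and all(isinstance(item, str) for item in text):
        return "List[str]"
    if isinstance(text, list) and all(
        isinstance(item, list) and all(isinstance(word, str) for word in item)
        for item in text
    ):
        return "List[List[str]]"
    # Anything else is not one of the three documented shapes.
    raise TypeError(f"unsupported input type: {type(text).__name__}")
```

A single review string, a list of reviews, and a list of pre-split word lists each map to one branch; anything else raises TypeError.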

edited Jun 14 '21 at 15:08

answered Jun 14 '21 at 15:02


Oh, that makes sense. Thanks for that - I will take a look at it. Also, is there a way to do something like this using my own dataset? Here: https://huggingface.co/nlptown/bert-base-multilingual-uncased-sentiment?text=I+like+you.+I+love+you – Jun 14 '21 at 15:09
You can pass anything that you have parsed into one of the types I described, so yes, of course. You just need to write a little code to convert your data to one of these types. – Ivaylo Strandjev Jun 14 '21 at 15:12
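
As a rough illustration of that conversion step, here is a sketch that reads one column of a CSV file into a list of strings. The CSV format, the column name "text", and the helper name texts_from_csv are all assumptions for the example; your own dataset may need different parsing:

```python
import csv
import io


def texts_from_csv(handle, column="text"):
    """Collect the non-empty values of one CSV column as a list of strings."""
    reader = csv.DictReader(handle)
    texts = []
    for row in reader:
        value = (row.get(column) or "").strip()
        if value:  # skip rows where the column is missing or blank
            texts.append(value)
    return texts


# In-memory stand-in for a real file opened with open(path, newline="").
sample = io.StringIO("text\nI like you.\nI love you.\n")
sentences = texts_from_csv(sample)
```

The resulting sentences list is a plain list of strings, which is one of the accepted input types.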
Okay, thanks. So is it possible to use my dataset to classify the sentiment of a sentence as "happy", "sad", or "angry"? – Jun 14 '21 at 15:28
