
I am following the tutorial https://www.depends-on-the-definition.com/named-entity-recognition-with-bert/ to do Named Entity Recognition with BERT.

While fine-tuning, before feeding the tokens to the model, the author does:

from keras.preprocessing.sequence import pad_sequences

# convert each token sequence to ids, then pad/truncate to MAX_LEN
input_ids = pad_sequences([tokenizer.convert_tokens_to_ids(txt) for txt in tokenized_texts],
                          maxlen=MAX_LEN, dtype="long", value=0.0,
                          truncating="post", padding="post")

According to my tests, this doesn't add the special tokens to the ids. So am I missing something, or is it not always necessary to include [CLS] (101) and [SEP] (102)?
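For reference, here is a minimal version of the test I ran (a sketch, assuming the Hugging Face BertTokenizer with bert-base-cased, as in the tutorial):

from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")

tokens = tokenizer.tokenize("John lives in Berlin")
# convert_tokens_to_ids is a plain vocabulary lookup: no special tokens
print(tokenizer.convert_tokens_to_ids(tokens))

# encode() adds them by default: [101, ..., 102]
print(tokenizer.encode("John lives in Berlin"))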

– N G

1 Answer


I'm also following this tutorial. It worked for me without adding these tokens; however, I found in another tutorial (https://vamvas.ch/bert-for-ner) that it is better to add them, because the model was pre-trained on inputs in that format.

[Update] I just checked: the accuracy improved by 20% after adding the tokens. But note that I am using a different dataset.
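One way to add them before the padding step (a sketch, reusing tokenizer, tokenized_texts, and pad_sequences from the question; build_inputs_with_special_tokens is the transformers helper that wraps a list of ids in [CLS] ... [SEP] for BERT-style tokenizers):

# wrap each id sequence in [CLS] ... [SEP] before padding
input_ids = pad_sequences(
    [tokenizer.build_inputs_with_special_tokens(tokenizer.convert_tokens_to_ids(txt))
     for txt in tokenized_texts],
    maxlen=MAX_LEN, dtype="long", value=0.0,
    truncating="post", padding="post")

If you do this, remember to extend each label sequence accordingly (e.g. with the "O" or pad tag at the [CLS] and [SEP] positions), otherwise the tokens and tags go out of alignment.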

  • It is safe to add them. Although they might not be needed for NER, the pooler_output of BertModel is based on the first token, which should be [CLS]. Thanks for a great article. – N G Nov 11 '20 at 21:09
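For anyone wondering about the pooler_output point in the comment above, a minimal sketch (assuming the Hugging Face transformers library and PyTorch) showing that it is computed from the first token's hidden state:

from transformers import BertModel, BertTokenizer
import torch

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
model = BertModel.from_pretrained("bert-base-cased")

enc = tokenizer("John lives in Berlin", return_tensors="pt")
with torch.no_grad():
    out = model(**enc)

# BertPooler applies a dense layer + tanh to the hidden state of the first
# token only, so pooler_output is only meaningful if that token is [CLS]
print(out.last_hidden_state.shape)  # (1, seq_len, 768)
print(out.pooler_output.shape)      # (1, 768)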