Apparently, the BERT tokenizer returns input_ids, attention_mask, and token_type_ids, where token_type_ids are used for NSP- or QA-style tasks: they differentiate the context from the aspect (the question, the next sentence, and so on).
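For illustration, here is a minimal sketch of what I mean, using the standard transformers AutoTokenizer with bert-base-uncased (the example strings are just placeholders):

```python
from transformers import AutoTokenizer

# Encode a sentence pair with a standard BERT tokenizer; for pairs,
# token_type_ids mark which tokens belong to segment A vs. segment B.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoded = tokenizer("the context sentence", "the aspect / question")

print(encoded.keys())
# dict_keys(['input_ids', 'token_type_ids', 'attention_mask'])
print(encoded["token_type_ids"])
# 0s over the first segment, 1s over the second
```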
In my case, I want to use Bertweet (https://huggingface.co/vinai/bertweet-base), and it DOES NOT return token_type_ids. In your opinion, is it possible to do this without the token_type_ids, or is there no chance the model will learn?
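For reference, the same check with Bertweet (which follows the RoBERTa pre-training setup, so its tokenizer is not expected to produce segment ids) looks like this; again a minimal sketch with placeholder strings:

```python
from transformers import AutoTokenizer

# Same pair encoded with Bertweet; as noted above, no token_type_ids
# are expected in the output.
tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-base")
encoded = tokenizer("the context sentence", "the aspect / question")

print(encoded.keys())
# expected: dict_keys(['input_ids', 'attention_mask'])
```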