
I have a list of tokens, and with the Hugging Face BERT tokenizer I can map them to their numerical IDs.

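from transformers import BertTokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')  # assumed checkpoint; not named in the question
X = ['[CLS]', '[MASK]', 'love', 'this', '[SEP]']
tokens = tokenizer.convert_tokens_to_ids(X)
# tokens: [101, 103, 2293, 2023, 102]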

Is there a function that converts tokens = [101, 103, 2293, 2023, 102] back to the words ['[CLS]', '[MASK]', 'love', 'this', '[SEP]']?

One possible way is to build the reverse mapping myself, but is there a built-in function that does this easily?
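For reference, the manual mapping I have in mind looks roughly like the sketch below. The `vocab` dict here is a toy stand-in holding only the five entries from my example; with a real tokenizer I assume it would come from `tokenizer.get_vocab()`. The helper name `ids_to_tokens` is my own. As far as I know, the tokenizer also has a `convert_ids_to_tokens` method that does the same thing directly, shown in the trailing comment.

```python
# Toy vocabulary with only the five entries from the question (assumption:
# a real tokenizer would supply this dict via tokenizer.get_vocab()).
vocab = {'[CLS]': 101, '[SEP]': 102, '[MASK]': 103, 'this': 2023, 'love': 2293}

# Invert the token -> id mapping into id -> token.
id_to_token = {idx: tok for tok, idx in vocab.items()}


def ids_to_tokens(ids):
    """Map each id back to its token string (hypothetical helper)."""
    return [id_to_token[i] for i in ids]


tokens = [101, 103, 2293, 2023, 102]
words = ids_to_tokens(tokens)
print(words)

# With the Hugging Face tokenizer itself, the built-in equivalent should be:
# words = tokenizer.convert_ids_to_tokens(tokens)
```

The manual version raises a `KeyError` for an ID that is missing from the dict, so the built-in call is probably the safer choice when it is available.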

kowser66

0 Answers