I'm new to the Hugging Face Transformers library and am trying to run a model for masked language modeling (the "fill-mask" task):
from transformers import BertTokenizer, BertForMaskedLM
import torch
from transformers import pipeline, AutoTokenizer, AutoModel
# Load the tokenizer and model that will be passed to the fill-mask pipeline
tokenizer = AutoTokenizer.from_pretrained("emilyalsentzer/Bio_ClinicalBERT")
model = AutoModel.from_pretrained("emilyalsentzer/Bio_ClinicalBERT")
print(len(tokenizer.vocab))
>>> 28996
But when I try to get the probabilities over the tokens, I get an error:
classifier = pipeline("fill-mask", model=model, tokenizer=tokenizer)
results = classifier("Paris is the [MASK] of France.")
>>> KeyError                                  Traceback (most recent call last)
<ipython-input-15-30c429f29424> in <module>()
      1 classifier = pipeline("fill-mask", model=model, tokenizer=tokenizer)
----> 2 results = classifier("Paris is the [MASK] of France.")
4 frames
/usr/local/lib/python3.7/dist-packages/transformers/file_utils.py in __getitem__(self, k)
   2041         if isinstance(k, str):
   2042             inner_dict = {k: v for (k, v) in self.items()}
-> 2043             return inner_dict[k]
   2044         else:
   2045             return self.to_tuple()[k]
KeyError: 'logits'
I also tried the following from a different tutorial and got the same error:
mlm = pipeline('fill-mask', model=model, tokenizer=tokenizer)
# Get mask token
mask = mlm.tokenizer.mask_token
# Get result for particular masked phrase
phrase = f'Paris is the {mask} of France.'
result = mlm(phrase, top_k=10000)
# Print result
print(result)
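For context on the `KeyError: 'logits'`, here is a minimal sketch of one likely cause, not a confirmed diagnosis of my setup: `AutoModel` loads the bare BERT encoder, whose output has no `logits` field, while the masked-LM class adds the prediction head that the fill-mask pipeline reads from. The sketch assumes a tiny, randomly initialised `BertConfig` (the sizes are made up for illustration) so that it runs without downloading any weights.

```python
import torch
from transformers import BertConfig, BertModel, BertForMaskedLM

# Hypothetical tiny config, chosen only so the sketch runs offline.
# It is NOT the Bio_ClinicalBERT configuration.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
)

base_model = BertModel(config)       # bare encoder, no language-modeling head
mlm_model = BertForMaskedLM(config)  # encoder plus masked-LM head

input_ids = torch.tensor([[1, 2, 3, 4]])  # arbitrary token ids inside the vocab
with torch.no_grad():
    base_out = base_model(input_ids)
    mlm_out = mlm_model(input_ids)

# Model outputs behave like dicts; compare which fields each one exposes.
base_keys = list(base_out.keys())
mlm_keys = list(mlm_out.keys())
logits_shape = tuple(mlm_out.logits.shape)  # (batch, sequence length, vocab size)

print("base model fields:", base_keys)
print("masked-LM fields: ", mlm_keys)
print("logits shape:     ", logits_shape)
```

If this is what is happening here, the analogous change for the real checkpoint would be to load it with `AutoModelForMaskedLM.from_pretrained("emilyalsentzer/Bio_ClinicalBERT")` instead of `AutoModel.from_pretrained(...)` before building the pipeline.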