from transformers import TextDatasetForNextSentencePrediction

dataset = TextDatasetForNextSentencePrediction(
    tokenizer=bert_cased_tokenizer,
    file_path="/path/to/your/dataset",
    block_size=256,
)

When I run this code and check all of the next-sentence labels, every example has the same label: either all of them are 0 or all of them are 1. According to the paper, about 50% of the labels should be 0 and about 50% should be 1, but I get 100% of one value. I tried changing `nsp_probability`, but the issue persists.
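For reference, the check described above can be sketched roughly as follows. This is a minimal sketch, not the library's own tooling: the helper `count_nsp_labels` is a hypothetical name, and it assumes each dataset item is a dict that stores its label under the key `"next_sentence_label"`. The toy list stands in for the real dataset so the snippet runs without `transformers` installed.

```python
from collections import Counter


def count_nsp_labels(examples, key="next_sentence_label"):
    """Count how often each next-sentence label occurs.

    `key` is an assumption about how the dataset names its label field.
    int() accepts plain ints as well as single-element tensors.
    """
    return Counter(int(example[key]) for example in examples)


# Stand-in examples (hypothetical data) so the sketch is self-contained.
toy_examples = [
    {"next_sentence_label": 0},
    {"next_sentence_label": 1},
    {"next_sentence_label": 1},
    {"next_sentence_label": 0},
]

counts = count_nsp_labels(toy_examples)
total = sum(counts.values())

# Print the share of each label; a healthy NSP dataset should be near 50/50.
for label in sorted(counts):
    print(f"label {label}: {counts[label] / total:.0%}")
```

With the real data, the same helper could be called as `count_nsp_labels(dataset[i] for i in range(len(dataset)))`, assuming the dataset supports `len()` and integer indexing.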
