When I use the Hugging Face transformers library to train BERT, I see that the shape of input_ids differs for every batch. For example, for the first batch it is torch.Size([16, 171]) and for the second batch it is torch.Size([16, 450]). What is the reason?

– kowser66
It is most likely because the sentences in the batches have different lengths in terms of the number of subwords. Do you have a particular example (i.e., a code snippet) that does not behave as you expect? – Jindřich Oct 03 '22 at 08:21
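As a minimal sketch of the dynamic-padding behavior the comment describes (the model name bert-base-uncased and the toy sentences here are assumptions, not from the original post): DataCollatorWithPadding pads each batch only up to the longest sequence in that batch, so the second dimension of input_ids varies from batch to batch.

```python
# Minimal sketch: per-batch dynamic padding with DataCollatorWithPadding.
# Assumes bert-base-uncased and toy sentences (hypothetical, for illustration).
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, DataCollatorWithPadding

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

sentences = [
    "Short sentence.",
    "A slightly longer example sentence for tokenization.",
    "An even longer sentence that will produce many more subword tokens than the others.",
    "Tiny.",
]

# Tokenize without padding, so each example keeps its own length.
encodings = [tokenizer(s) for s in sentences]

# The collator pads each batch to the longest sequence *in that batch* only.
collator = DataCollatorWithPadding(tokenizer=tokenizer)
loader = DataLoader(encodings, batch_size=2, collate_fn=collator)

for i, batch in enumerate(loader):
    # The second dimension equals the longest sequence in this batch,
    # so it differs between batches.
    print(f"batch {i}: input_ids shape = {tuple(batch['input_ids'].shape)}")
```

Under this setup, the printed shapes would differ per batch (e.g. (2, 12) for one batch and (2, 22) for another), which matches the varying shapes described in the question.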