Can I use multiple softmax heads in the last output layer of a transformer? If so, how do I calculate the loss from that? I am working in PyTorch.
I am asking because my data is a sequence of tuples whose elements come from different vocabularies. For example:
[(2,1), (3,1), (3,1), (2,1), (2,1), (3,1), (3,0), (4,1)]
The first element of each tuple has a vocabulary of 5, and the second element has a vocabulary of 2.
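To make the question concrete, here is a minimal sketch of the kind of setup I have in mind (the names `head_first`, `head_second`, and the hidden size are just placeholders, and I am using `nn.CrossEntropyLoss`, which applies log-softmax internally, in place of explicit softmax layers):

```python
import torch
import torch.nn as nn

# Hypothetical setup: a shared transformer output feeds two separate
# linear heads, one per tuple element.
hidden_dim = 16
vocab_first, vocab_second = 5, 2  # vocab sizes of the two tuple elements

head_first = nn.Linear(hidden_dim, vocab_first)
head_second = nn.Linear(hidden_dim, vocab_second)

# nn.CrossEntropyLoss applies log-softmax internally, so no explicit
# softmax layer is placed before it.
criterion = nn.CrossEntropyLoss()

# Fake batch: 8 sequence positions of transformer hidden states,
# with targets taken from the example sequence above.
hidden = torch.randn(8, hidden_dim)
targets_first = torch.tensor([2, 3, 3, 2, 2, 3, 3, 4])   # first tuple elements
targets_second = torch.tensor([1, 1, 1, 1, 1, 1, 0, 1])  # second tuple elements

# One loss per head, summed into a single scalar for backprop.
loss = (criterion(head_first(hidden), targets_first)
        + criterion(head_second(hidden), targets_second))
loss.backward()
```

Is summing the two cross-entropy losses like this the right way to train such a model, or is there a better approach?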