I'm a beginner at NLP, and I'm trying to reproduce the most basic Transformer from the "Attention Is All You Need" paper.
But I ran into a question while doing it.
In the MultiHeadAttention layer, I printed the shapes of query, key, and value, but query came out with a different shape from key and value. Since self-attention ultimately computes a sequence's correlation with itself, I expected all three to have the same shape. I don't understand why the shapes of query, key, and value differ.
[screenshot of the printed shapes] The values of query, key, and value all come from src, so why are their shapes different? [screenshot of the model code]
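To illustrate what I mean, here is a minimal sketch of scaled dot-product attention (my own NumPy toy, not the code I'm reproducing). It shows that query can legally have a different sequence length from key/value, e.g. in the decoder's cross-attention, where query comes from the target side and key/value come from the encoder output:

```python
import numpy as np

def attention(q, k, v):
    # q: (batch, heads, q_len, d_k); k, v: (batch, heads, kv_len, d_k)
    d_k = q.shape[-1]
    # Attention scores: (batch, heads, q_len, kv_len)
    scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(d_k)
    # Softmax over the key/value length
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)
    # Output follows the query's length: (batch, heads, q_len, d_k)
    return weights @ v

q = np.random.randn(2, 8, 5, 64)   # query length 5
k = np.random.randn(2, 8, 7, 64)   # key/value length 7
v = np.random.randn(2, 8, 7, 64)
out = attention(q, k, v)
print(out.shape)  # (2, 8, 5, 64)
```

So the math itself only requires key and value to share a length; the query length is independent, and the output takes the query's length.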
I took the code from here.