I want to implement simple softmax-based self-attention over a sequence of vectors. PyTorch's multi-head self-attention API seems like overkill for my task, since it comes with a large number of parameters to train. Is there an API or a simple codebase in PyTorch that applies plain self-attention and then mean-pools or max-pools the result into a single vector?
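To make the question concrete, here is a rough sketch of the kind of thing I have in mind. It assumes parameter-free scaled dot-product attention in which the input itself serves as queries, keys, and values, with input shape `(batch, seq_len, dim)`. The function name `simple_self_attention_pool` and its `pool` argument are made up for illustration and are not part of PyTorch.

```python
import math

import torch
import torch.nn.functional as F


def simple_self_attention_pool(x: torch.Tensor, pool: str = "mean") -> torch.Tensor:
    """Hypothetical helper: parameter-free self-attention followed by pooling.

    x is assumed to have shape (batch, seq_len, dim).
    Returns a tensor of shape (batch, dim).
    """
    dim = x.size(-1)
    # Pairwise similarity between all positions, scaled by sqrt(dim).
    scores = x @ x.transpose(-2, -1) / math.sqrt(dim)  # (batch, seq_len, seq_len)
    # Softmax over the last axis so each row of weights sums to 1.
    weights = F.softmax(scores, dim=-1)
    # Each output position is a weighted average of the input vectors.
    attended = weights @ x  # (batch, seq_len, dim)

    # Collapse the sequence axis into a single vector per batch element.
    if pool == "mean":
        return attended.mean(dim=1)
    if pool == "max":
        return attended.max(dim=1).values
    raise ValueError(f"unknown pool mode: {pool!r}")


if __name__ == "__main__":
    x = torch.randn(4, 10, 16)  # 4 sequences of 10 vectors, each of size 16
    print(simple_self_attention_pool(x, pool="mean").shape)
    print(simple_self_attention_pool(x, pool="max").shape)
```

This version has no trainable parameters at all, so it may be too simple; adding learned query/key/value projections (for example with `torch.nn.Linear`) would be the obvious next step if some capacity is needed.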

thedemons

0 Answers