Questions tagged [self-attention]

57 questions
1
vote
0 answers

PyTorch's Transformer decoder accuracy fluctuation

I have a sequence-to-sequence POS tagging model which uses a Transformer decoder to generate target tokens. My implementation of PyTorch's Transformer decoder is as follows: in the initialization: self.decoder_layer =…
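(A minimal sketch of the kind of setup the question describes, assuming nn.TransformerDecoderLayer and hypothetical dimensions; it is not the asker's actual code:)

    import torch
    import torch.nn as nn

    # hypothetical sizes, for illustration only
    d_model, n_heads, n_layers, vocab_size = 256, 8, 2, 50

    decoder_layer = nn.TransformerDecoderLayer(d_model=d_model, nhead=n_heads, batch_first=True)
    decoder = nn.TransformerDecoder(decoder_layer, num_layers=n_layers)
    out_proj = nn.Linear(d_model, vocab_size)          # decoder states -> tag logits

    tgt = torch.randn(4, 20, d_model)                  # (batch, target length, d_model)
    memory = torch.randn(4, 20, d_model)               # encoder outputs
    tgt_mask = nn.Transformer.generate_square_subsequent_mask(20)

    logits = out_proj(decoder(tgt, memory, tgt_mask=tgt_mask))   # (4, 20, vocab_size)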
1
vote
2 answers

Visualizing ViT Attention maps after fine tuning on medical dataset

I have imported the Vit-b32 model and fine-tuned it to perform a classification task on echo images. Now I want to visualize the attention maps so that I can see which part of the image the model focuses on when classifying. But…
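(A rough sketch of one common approach, assuming a timm-style PyTorch ViT whose attention block applies an explicit softmax followed by an attn_drop submodule; module names are assumptions and newer versions with fused attention may skip this path entirely:)

    import timm
    import torch

    model = timm.create_model('vit_base_patch32_224', pretrained=False, num_classes=2)  # stand-in for the fine-tuned model
    image_batch = torch.randn(1, 3, 224, 224)                                           # stand-in for a preprocessed echo image

    attn_maps = []

    def grab(module, inputs, output):
        # for non-fused timm attention, the input to attn_drop is the softmaxed
        # attention matrix of shape (batch, heads, tokens, tokens)
        attn_maps.append(inputs[0].detach())

    handle = model.blocks[-1].attn.attn_drop.register_forward_hook(grab)
    with torch.no_grad():
        model(image_batch)
    handle.remove()

    attn = attn_maps[0].mean(dim=1)          # average over heads -> (B, tokens, tokens)
    cls_to_patches = attn[:, 0, 1:]          # CLS token's attention to the patch tokens
    side = int(cls_to_patches.shape[-1] ** 0.5)
    heatmap = cls_to_patches.reshape(-1, side, side)   # upsample and overlay on the image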
1
vote
0 answers

PyTorch Forecasting - Temporal Fusion Transformer calculate_prediction_actual_by_variable() plots empty

Referring to the tutorial (https://pytorch-forecasting.readthedocs.io/en/stable/tutorials/stallion.html) on the PyTorch Forecasting implementation of the Temporal Fusion Transformer, I'm trying to use their…
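(The usual flow from that tutorial is roughly the following; exact signatures and return types vary between pytorch-forecasting versions, so treat this as a sketch rather than a fix:)

    import matplotlib.pyplot as plt

    # tft: a trained TemporalFusionTransformer, val_dataloader: the validation loader
    predictions, x = tft.predict(val_dataloader, return_x=True)
    predictions_vs_actuals = tft.calculate_prediction_actual_by_variable(x, predictions)
    tft.plot_prediction_actual_by_variable(predictions_vs_actuals)
    plt.show()   # outside a notebook the figures can appear empty without an explicit show()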
1
vote
0 answers

One-head attention mechanism in PyTorch

I am trying to implement the attention mechanism using the CIFAR10 dataset. The idea is to implement an attention layer with only one head. As a reference, I used the multi-head implementation given…
Dew
  • 21
  • 3
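(A minimal single-head scaled dot-product attention sketch with hypothetical dimensions; it is essentially nn.MultiheadAttention with num_heads=1 written out by hand, not the asker's reference implementation:)

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SingleHeadAttention(nn.Module):
        def __init__(self, dim):
            super().__init__()
            self.q = nn.Linear(dim, dim)
            self.k = nn.Linear(dim, dim)
            self.v = nn.Linear(dim, dim)
            self.out = nn.Linear(dim, dim)

        def forward(self, x):                               # x: (batch, tokens, dim)
            q, k, v = self.q(x), self.k(x), self.v(x)
            scores = q @ k.transpose(-2, -1) / (x.size(-1) ** 0.5)
            attn = F.softmax(scores, dim=-1)                 # (batch, tokens, tokens)
            return self.out(attn @ v)

    # e.g. CIFAR-10 images split into 64 patch tokens of dimension 48 (hypothetical)
    x = torch.randn(8, 64, 48)
    print(SingleHeadAttention(48)(x).shape)                  # torch.Size([8, 64, 48])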
1
vote
0 answers

Masked self-attention in the transformer's decoder

I'm writing my thesis about attention mechanisms. In the paragraph in which I explain the transformer's decoder I wrote this: The first sub-layer is called masked self-attention, in which the masking operation consists of preventing the decoder…
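(A small worked example of what that masking does in practice, using a causal mask built with torch.triu; the sizes are illustrative:)

    import torch
    import torch.nn.functional as F

    T = 5                                              # target sequence length
    scores = torch.randn(T, T)                         # raw attention scores
    causal_mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(causal_mask, float('-inf'))
    attn = F.softmax(scores, dim=-1)
    print(attn)   # row i has zero weight on positions j > i, so position i
                  # cannot attend to tokens that come after it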
1
vote
0 answers

Different node counts per mini-batch

I am fairly new to graph neural networks, and I am training a GNN model that uses self-attention. My problem is that the node count (node_num) differs in each batch; for example, in the first batch I have: Batch(batch=[1181],…
林深时
  • 11
  • 2
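(This is expected behaviour if the batches come from PyTorch Geometric: a Batch concatenates the graphs in the mini-batch, so the total node count changes whenever the individual graphs have different sizes. A small sketch, assuming torch_geometric:)

    import torch
    from torch_geometric.data import Data, Batch

    g1 = Data(x=torch.randn(3, 16), edge_index=torch.tensor([[0, 1], [1, 2]]))
    g2 = Data(x=torch.randn(5, 16), edge_index=torch.tensor([[0, 4], [2, 3]]))

    batch = Batch.from_data_list([g1, g2])
    print(batch.num_nodes)   # 8 -- the sum of the per-graph node counts
    print(batch.batch)       # tensor([0, 0, 0, 1, 1, 1, 1, 1]) maps each node to its graph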
1
vote
1 answer

Swin Transformer attention maps visualization

I am using a Swin Transformer for a hierarchical multi-class, multi-label classification problem. I would like to visualize the self-attention maps on my input image by extracting them from the model, but unfortunately I am not succeeding in this…
1
vote
1 answer

How to handle tensor multiplication with a None dimension

For example, I have two tensors A and B, both with shape (None, HWC). When I use tf.matmul(tf.transpose(A), B), the result has shape (HWC, HWC). This is correct, but I want to keep the None dimension so the result is (None, HWC, HWC). Is there…
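(One way to keep the batch dimension, as a sketch: take the per-sample outer product with expanded dims, or equivalently with tf.einsum:)

    import tensorflow as tf

    A = tf.random.normal([4, 10])      # stand-in for a (None, HWC) tensor
    B = tf.random.normal([4, 10])

    out = tf.matmul(tf.expand_dims(A, -1), tf.expand_dims(B, 1))   # (batch, HWC, HWC)
    same = tf.einsum('bi,bj->bij', A, B)                            # identical result
    print(out.shape, same.shape)        # (4, 10, 10) (4, 10, 10)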
1
vote
0 answers

How to use the multiple-heads option in the SelfAttention class?

I am playing around with the self-attention model from the trax library. When I set n_heads=1, everything works fine, but when I set n_heads=2, my code breaks. I use only input activations and one SelfAttention layer. Here is a minimal example: import…
Kenenbek Arzymatov
  • 8,439
  • 19
  • 58
  • 109
0
votes
0 answers

Error in PyTorch: mat1 and mat2 shapes cannot be multiplied

I'm working on a PyTorch project and I want to generate MNIST images using a U-Net architecture combined with a DDPM (denoising diffusion probabilistic model) approach. I'm encountering the following error: File…
Zahra Hosseini
  • 478
  • 2
  • 4
  • 14
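(That error usually means a Linear layer's in_features does not match the flattened size of the tensor reaching it. A quick way to see the offending shapes, as a sketch with hypothetical sizes rather than the asker's network:)

    import torch
    import torch.nn as nn

    x = torch.randn(2, 64, 7, 7)          # hypothetical feature map from the U-Net
    fc = nn.Linear(128, 10)               # expects 128 input features

    flat = x.flatten(1)                   # (2, 3136)
    print(flat.shape, fc.in_features)     # compare these two numbers first
    # fc(flat) would raise: mat1 and mat2 shapes cannot be multiplied (2x3136 and 128x10)
    # fix: nn.Linear(flat.shape[1], 10), or reshape/pool x to the expected size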
0
votes
0 answers

How to implement global self-attention with sparse tensors?

Using the following code, I am implementing global self-attention for sparse input with MinkowskiEngine. I am getting slightly worse results than the model without attention and wonder why. In particular, since in the last line of the code…
mrghafari
  • 35
  • 6
0
votes
1 answer

How do I make keras run a Dense layer for each row of an input matrix?

I'm trying to build a basic transformer using the Keras Attention layer. For this I need three different Dense layers, which generate the query, key and value matrices respectively by running every word embedding through them. But there seems to…
user2741831
  • 2,120
  • 2
  • 22
  • 43
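(For what it's worth, a Keras Dense layer already applies the same weights independently along the last axis of a 3-D input, so one Dense per projection is enough; a sketch with hypothetical sizes:)

    import tensorflow as tf
    from tensorflow.keras import layers

    seq_len, embed_dim, proj_dim = 12, 32, 32
    emb = tf.random.normal([2, seq_len, embed_dim])        # (batch, words, embedding)

    q_dense = layers.Dense(proj_dim)
    k_dense = layers.Dense(proj_dim)
    v_dense = layers.Dense(proj_dim)

    q, k, v = q_dense(emb), k_dense(emb), v_dense(emb)     # each (2, 12, 32), applied row by row
    out = layers.Attention()([q, v, k])                    # Keras Attention expects [query, value, key]
    print(out.shape)                                       # (2, 12, 32)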
0
votes
0 answers

How to visualize cross-attention maps to check text-image alignment?

I was wondering how to visualize the cross-attention map over image features that a model attends to given a text query (e.g. a sentence). There are some amazing explainability tools like Class Activation Maps, but they mostly require a 'class' or a CNN model…
0
votes
0 answers

Are the WQ, WK, WV matrices used to generate the query, key and value vectors for attention in Transformers fixed, or do they depend on the input word?

To calculate self-attention, for each word we create a query vector, a key vector, and a value vector. These vectors are created by multiplying the embedding by three matrices, WQ, WK and WV, that are learned during training…
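(In other words, WQ, WK and WV are fixed learned parameters shared across all positions; only the resulting q, k, v vectors depend on the input word. A sketch:)

    import torch
    import torch.nn as nn

    d_model = 8
    W_q = nn.Linear(d_model, d_model)
    W_k = nn.Linear(d_model, d_model)
    W_v = nn.Linear(d_model, d_model)

    embeddings = torch.randn(5, d_model)          # 5 input words
    q, k, v = W_q(embeddings), W_k(embeddings), W_v(embeddings)
    # the same three weight matrices are applied to every word; after training they
    # stay fixed at inference time, while q, k, v change with each new input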
0
votes
1 answer

Store intermediate values of a PyTorch module

I am trying to plot attention maps for a ViT. I know that I can do something like h_attn = model.blocks[-1].attn.register_forward_hook(get_activations('attention')) to register a hook that captures the output of some nn.Module in my model. The ViT's attention…
Mitch
  • 27
  • 5
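(The usual pattern for this kind of hook, as a sketch; get_activations here is a hypothetical helper that stores outputs in a dict, and model/images stand for the question's ViT and a preprocessed input batch:)

    import torch

    activations = {}

    def get_activations(name):
        def hook(module, inputs, output):
            activations[name] = output.detach() if torch.is_tensor(output) else output
            # note: many ViT attention modules return only the projected tokens, so the
            # attention matrix itself may need a hook on a submodule deeper in the block
        return hook

    h_attn = model.blocks[-1].attn.register_forward_hook(get_activations('attention'))
    with torch.no_grad():
        model(images)
    h_attn.remove()
    print(activations['attention'].shape)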