Questions tagged [attention-model]

Questions about the attention mechanism in deep learning models

389 questions
4 votes, 0 answers

Self-Attention Explainability of the Output Score Matrix

I am learning about attention models and following along with Jay Alammar's amazing blog tutorial, The Illustrated Transformer. He gives a great walkthrough of how the attention scores are calculated, but I get a bit lost at a certain point, and…
Yu Chen • 6,540 • 6 • 51 • 86
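For readers following the same walkthrough, a minimal numerical sketch of scaled dot-product attention, softmax(QKᵀ / √d_k)·V, may help make the score matrix concrete. The shapes and random values below are illustrative assumptions, not values from the blog post.

```python
import torch
import torch.nn.functional as F

# Toy example: 3 tokens, d_k = 4 (shapes chosen purely for illustration).
torch.manual_seed(0)
Q = torch.randn(3, 4)   # queries, one row per token
K = torch.randn(3, 4)   # keys
V = torch.randn(3, 4)   # values

d_k = Q.size(-1)
scores = Q @ K.T / d_k ** 0.5          # raw attention scores, shape (3, 3)
weights = F.softmax(scores, dim=-1)    # row i = how much token i attends to each token j
output = weights @ V                   # attended representation, shape (3, 4)

print(weights)  # the "output score matrix" the question asks about
print(output)
```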
4 votes, 0 answers

HAN ValueError: Unknown layer: AttentionWithContext when making a deep copy of the model

I am fitting a HAN model to my data and want to save the model from each iteration. For this purpose I am building a list of models, one per iteration, and I get the following error while deep-copying the model: ValueError: Unknown layer:…
Sadaf • 79 • 2
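A common cause of this error is that Keras cannot rebuild a custom layer during (de)serialization unless it is registered. Below is only a hedged sketch of the usual workaround, assuming the custom layer is called AttentionWithContext as in the question and implements get_config; the import path is a placeholder.

```python
from keras.models import clone_model, load_model

# `AttentionWithContext` is assumed to be the custom layer class from the
# question's project; the import below is a hypothetical path.
# from attention_with_context import AttentionWithContext

def snapshot(model):
    """Return an independent copy of a Keras model that contains custom layers,
    avoiding copy.deepcopy(), which can fail with 'Unknown layer' when the
    custom class is not registered for deserialization."""
    clone = clone_model(model)              # rebuilds each layer from its get_config()
    clone.set_weights(model.get_weights())  # copy the current weights into the clone
    return clone

# When going through disk instead, register the custom layer explicitly:
# restored = load_model("han.h5",
#                       custom_objects={"AttentionWithContext": AttentionWithContext})
```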
4 votes, 1 answer

Hierarchical Attention Network - model.fit generates error 'ValueError: Input dimension mis-match'

For background, I am referring to the Hierarchical Attention Network used for sentiment classification. For code: my full code is posted below, but it is just a simple revision of the original code posted by the author at the link above. And I…
Ziqi • 2,445 • 5 • 38 • 65
4 votes, 1 answer

How can I pre-compute a mask for each input and adjust the weights according to this mask?

I want to provide a mask the same size as the input image, and adjust the weights learned from the image according to this mask (similar to attention, but pre-computed for each image input). How can I do this with Keras (or TensorFlow)?
dusa • 840 • 3 • 14 • 31
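One simple reading of the question's goal (a fixed, per-image attention map) is to feed the mask as a second input and multiply it into the feature maps. The sketch below is a hedged illustration in the Keras functional API; the layer sizes, shapes, and names are assumptions, not taken from the question.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Illustrative shapes: 64x64 RGB image plus a 64x64 single-channel mask.
image_in = layers.Input(shape=(64, 64, 3), name="image")
mask_in = layers.Input(shape=(64, 64, 1), name="mask")

x = layers.Conv2D(16, 3, padding="same", activation="relu")(image_in)
# Broadcast-multiply the precomputed mask into the feature maps, so
# masked-out regions contribute (almost) nothing downstream.
x = layers.Lambda(lambda t: t[0] * t[1])([x, mask_in])
x = layers.GlobalAveragePooling2D()(x)
out = layers.Dense(10, activation="softmax")(x)

model = Model(inputs=[image_in, mask_in], outputs=out)
model.compile(optimizer="adam", loss="categorical_crossentropy")
# model.fit([images, masks], labels, ...)  # the mask is supplied per example
```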
4 votes, 1 answer

Why does softmax get small gradients when the values are large, as in the paper 'Attention Is All You Need'?

This is a screenshot from the original paper. I understand the paper to mean that when the dot-product values are large, the gradient of the softmax becomes very small. However, I tried to calculate the gradient of…
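A small numerical check of the claim in the excerpt: when the logits fed to softmax have large magnitude (as happens when q·k grows with d_k and is not scaled by 1/√d_k), the softmax saturates and the gradient flowing back through it becomes tiny. The logits and scale factors below are arbitrary illustrations.

```python
import torch

def softmax_grad_norm(scale):
    # Logits with the same direction but different magnitude.
    logits = torch.tensor([1.0, 2.0, 3.0]) * scale
    logits.requires_grad_(True)
    probs = torch.softmax(logits, dim=0)
    # Backpropagate an arbitrary upstream gradient through the softmax.
    probs.backward(torch.tensor([1.0, 0.0, 0.0]))
    return logits.grad.abs().max().item()

for scale in (1, 5, 20, 100):
    print(scale, softmax_grad_norm(scale))
# The maximum gradient magnitude shrinks rapidly as the logits grow,
# which is why the paper divides q.k by sqrt(d_k) before the softmax.
```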
4 votes, 1 answer

Transformer - Attention Is All You Need - encoder-decoder cross-attention

It is my understanding that each encoder block takes the output from the previous encoder, and that the output is the attended representation (Z) of the sequence (aka sentence). My question is, how does the last encoder block produce K, V from Z…
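A hedged, single-head sketch of what the question calls producing K and V from Z: in cross-attention, the decoder supplies the queries, while the keys and values are linear projections of the final encoder output Z. The dimensions and module names below are illustrative, not the paper's exact multi-head configuration.

```python
import torch
import torch.nn as nn

d_model = 512

# Projection matrices owned by a decoder block's cross-attention sub-layer.
W_q = nn.Linear(d_model, d_model)
W_k = nn.Linear(d_model, d_model)
W_v = nn.Linear(d_model, d_model)

Z = torch.randn(1, 10, d_model)              # final encoder output (batch, src_len, d_model)
decoder_hidden = torch.randn(1, 7, d_model)  # decoder self-attention output (batch, tgt_len, d_model)

Q = W_q(decoder_hidden)   # queries come from the decoder
K = W_k(Z)                # keys come from the encoder output Z
V = W_v(Z)                # values come from the encoder output Z

attn = torch.softmax(Q @ K.transpose(-2, -1) / d_model ** 0.5, dim=-1)
context = attn @ V        # (batch, tgt_len, d_model): each target position attends over the source
print(context.shape)
```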
4 votes, 1 answer

AttentionDecoderRNN without MAX_LENGTH

From the PyTorch Seq2Seq tutorial (http://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html#attention-decoder), we see that the attention mechanism is heavily reliant on the MAX_LENGTH parameter to determine the output dimensions of…
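One common way around the fixed MAX_LENGTH in that tutorial is to score the decoder hidden state against each encoder output directly (dot-product, Luong-style attention), so the attention size follows the actual source length. This is a hedged alternative sketch, not the tutorial's own AttnDecoderRNN.

```python
import torch
import torch.nn.functional as F

def dot_product_attention(decoder_hidden, encoder_outputs):
    """decoder_hidden: (batch, hidden); encoder_outputs: (batch, src_len, hidden).
    src_len can vary from batch to batch, so no MAX_LENGTH is needed."""
    scores = torch.bmm(encoder_outputs, decoder_hidden.unsqueeze(2)).squeeze(2)  # (batch, src_len)
    weights = F.softmax(scores, dim=1)
    context = torch.bmm(weights.unsqueeze(1), encoder_outputs).squeeze(1)        # (batch, hidden)
    return context, weights

# Example with an arbitrary source length of 13.
context, weights = dot_product_attention(torch.randn(2, 256), torch.randn(2, 13, 256))
print(context.shape, weights.shape)  # torch.Size([2, 256]) torch.Size([2, 13])
```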
3 votes, 0 answers

Segmentation fault (core dumped) with libpatches.so

Edit 3: loaded the core into gdb. Edit 2: included the .cc code. Edit 1: loaded debug symbols. I'm trying to run the example MNIST program of the attention-sampling GitHub library. The error output is as…
3 votes, 0 answers

Implementing 1D self attention in PyTorch

I'm trying to implement, using PyTorch, the 1D self-attention block proposed in the following paper. Below you can find my (provisional) attempt: import torch.nn as nn import torch #INPUT shape ((B), CH, H, W) class…
James Arten • 523 • 5 • 16
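Since the question's code is truncated, here is only a generic, hedged sketch of a 1D self-attention block operating on (batch, channels, length); it follows the SAGAN-style self-attention pattern adapted to 1D and is not a reconstruction of the block in the linked paper.

```python
import torch
import torch.nn as nn

class SelfAttention1d(nn.Module):
    """Minimal 1D self-attention over a (batch, channels, length) tensor."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.query = nn.Conv1d(channels, channels // reduction, kernel_size=1)
        self.key = nn.Conv1d(channels, channels // reduction, kernel_size=1)
        self.value = nn.Conv1d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable residual gate

    def forward(self, x):                            # x: (B, C, L)
        q = self.query(x).permute(0, 2, 1)            # (B, L, C//r)
        k = self.key(x)                               # (B, C//r, L)
        attn = torch.softmax(torch.bmm(q, k), dim=-1) # (B, L, L): position-to-position weights
        v = self.value(x)                             # (B, C, L)
        out = torch.bmm(v, attn.transpose(1, 2))      # (B, C, L)
        return self.gamma * out + x                   # residual connection

x = torch.randn(2, 64, 128)
print(SelfAttention1d(64)(x).shape)  # torch.Size([2, 64, 128])
```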
3 votes, 0 answers

How to use nn.MultiheadAttention together with nn.LSTM?

I'm trying to build a PyTorch network for image captioning. Currently I have a working network with an Encoder and a Decoder, and I want to add an nn.MultiheadAttention layer to it (to be used as self-attention). Currently my decoder looks like this: class…
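A hedged sketch of one way to combine the two modules in a captioning decoder: run the LSTM first, then apply nn.MultiheadAttention over its outputs as self-attention. The sizes and class are illustrative assumptions; note that nn.MultiheadAttention expects (seq_len, batch, embed_dim) unless batch_first=True (available in recent PyTorch).

```python
import torch
import torch.nn as nn

class DecoderWithSelfAttention(nn.Module):
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=256, num_heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.fc = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):                     # tokens: (batch, seq_len)
        x = self.embed(tokens)
        h, _ = self.lstm(x)                        # (batch, seq_len, hidden_dim)
        # Self-attention: queries, keys, and values are all the LSTM outputs.
        attended, weights = self.attn(h, h, h, need_weights=True)
        return self.fc(attended), weights

decoder = DecoderWithSelfAttention(vocab_size=1000)
logits, weights = decoder(torch.randint(0, 1000, (2, 12)))
print(logits.shape, weights.shape)  # (2, 12, 1000) (2, 12, 12)
```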
3 votes, 1 answer

Implementing a custom learning rate scheduler in PyTorch?

I would like to implement the learning rate method from the paper Attention Is All You Need. I have this code in TensorFlow, but I would like to implement it in PyTorch too. I know that PyTorch has modules for this…
Dametime • 581 • 1 • 6 • 23
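For reference, the schedule in the paper is lrate = d_model^(-0.5) · min(step^(-0.5), step · warmup_steps^(-1.5)). Below is a hedged sketch using torch.optim.lr_scheduler.LambdaLR, one of several reasonable ways to implement it; the stand-in model and hyperparameters are illustrative.

```python
import torch

d_model, warmup_steps = 512, 4000

def noam_lambda(step):
    step = max(step, 1)  # avoid division by zero on the very first call
    return (d_model ** -0.5) * min(step ** -0.5, step * warmup_steps ** -1.5)

model = torch.nn.Linear(10, 10)  # stand-in model for illustration
# lr=1.0 so the lambda fully determines the effective learning rate.
optimizer = torch.optim.Adam(model.parameters(), lr=1.0, betas=(0.9, 0.98), eps=1e-9)
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=noam_lambda)

for step in range(5):
    optimizer.step()     # ...after backprop, in a real training loop
    scheduler.step()     # updated per step, not per epoch
    print(step, optimizer.param_groups[0]["lr"])
```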
3 votes, 1 answer

attn_output_weights in MultiheadAttention

I want to know whether the attn_output_weights matrix can show the relationship between every word pair in the input sequence. In my project, I drew a heat map based on this output, and it looks like this: However, I can hardly see any…
Yuki Wang • 85 • 8
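A small, hedged sketch of how the returned weights relate to word pairs, assuming a recent PyTorch: with need_weights=True, nn.MultiheadAttention returns a (batch, tgt_len, src_len) matrix (averaged over heads by default), where row i is the distribution of token i's attention over all tokens. Whether that renders as a readable heat map of word-pair relationships depends on the trained model.

```python
import torch
import torch.nn as nn

embed_dim, num_heads, seq_len = 16, 4, 5
mha = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)

x = torch.randn(1, seq_len, embed_dim)   # stand-in for a sequence of word embeddings
out, attn_weights = mha(x, x, x, need_weights=True)  # weights averaged over heads by default

print(attn_weights.shape)          # (1, 5, 5): one row per query token
print(attn_weights.sum(dim=-1))    # each row sums to 1 (a distribution over key tokens)
# attn_weights[0, i, j] is how much token i attends to token j;
# this is the matrix typically rendered as a heat map.
```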
3 votes, 1 answer

Retrieve the "relevant tokens" with a BERT model (already fine-tuned)

I already fine-tuned a BERT model (with the Hugging Face library) for a classification task to predict a post's category as one of two types (1 and 0, for example). But I would need to retrieve the "relevant tokens" for the documents that are predicted as…
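One hedged approach among several (gradient-based saliency or Captum are alternatives): ask the fine-tuned model for its attention matrices with output_attentions=True and inspect how much attention the [CLS] token pays to each input token in the last layer. The checkpoint name and the choice of layer below are illustrative assumptions.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "my-finetuned-bert"  # placeholder for the question's fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, output_attentions=True)
model.eval()

text = "example post to classify"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions: tuple of (batch, heads, seq_len, seq_len), one entry per layer.
last_layer = outputs.attentions[-1]
# Attention paid by [CLS] (position 0) to every token, averaged over heads.
cls_attention = last_layer[0].mean(dim=0)[0]

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, score in sorted(zip(tokens, cls_attention.tolist()), key=lambda t: -t[1])[:10]:
    print(f"{token}\t{score:.3f}")
```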
3 votes, 1 answer

Number of learnable parameters of MultiheadAttention

While testing (using PyTorch's MultiheadAttention), I noticed that increasing or decreasing the number of heads of the multi-head attention does not change the total number of learnable parameters of my model. Is this behavior correct? And if so,…
Elidor00 • 1,271 • 13 • 27
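The behavior described in the excerpt matches how nn.MultiheadAttention is defined: embed_dim is split across heads (head_dim = embed_dim // num_heads), so the projection matrices keep the same total size regardless of num_heads. A quick, hedged check with an arbitrary embed_dim:

```python
import torch.nn as nn

def count_params(module):
    return sum(p.numel() for p in module.parameters() if p.requires_grad)

embed_dim = 512
for num_heads in (1, 2, 8, 16):
    mha = nn.MultiheadAttention(embed_dim, num_heads)
    print(num_heads, count_params(mha))
# Every line prints the same count: the input/output projections are always
# (embed_dim x embed_dim); only how they are sliced into heads changes.
```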
3 votes, 1 answer

AttributeError: can't set attribute. Hierarchical Attention Network

When I define the Hierarchical Attention Network, an error pops up which says "AttributeError: can't set attribute". Please help. This is the Attention.py file: import keras import Attention from keras.engine.topology import Layer,…