Questions about attention mechanisms in deep learning
Questions tagged [attention-model]
389 questions
9
votes
2 answers
Outputting attention for bert-base-uncased with huggingface/transformers (torch)
I was following a paper on BERT-based lexical substitution (specifically trying to implement equation (2) - if someone has already implemented the whole paper that would also be great). Thus, I wanted to obtain both the last hidden layers (only…

Björn
- 644
- 10
- 23
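A minimal sketch of getting both the per-layer attention weights and the hidden states out of bert-base-uncased, assuming transformers 4.x (the example sentence is made up):

import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained(
    "bert-base-uncased",
    output_attentions=True,      # return attention weights for every layer
    output_hidden_states=True,   # return the embeddings plus all 12 layer outputs
)
model.eval()

inputs = tokenizer("The quick brown fox jumps over the lazy dog", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions: tuple of 12 tensors, each (batch, heads, seq_len, seq_len)
# outputs.hidden_states: tuple of 13 tensors, each (batch, seq_len, 768)
last_hidden = outputs.hidden_states[-1]
print(len(outputs.attentions), outputs.attentions[0].shape, last_hidden.shape)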
8
votes
1 answer
MultiHeadAttention attention_mask [Keras, Tensorflow] example
I am struggling to mask my input for the MultiHeadAttention layer. I am using the Transformer block from the Keras documentation with self-attention. I could not find any example code online so far and would appreciate it if someone could give me a code…

R. Giskard
- 91
- 1
- 5
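A hedged sketch of what the attention_mask argument expects, with made-up shapes: a boolean tensor broadcastable to (batch, query_positions, key_positions), True where attention is allowed.

import tensorflow as tf

batch, seq_len, dim = 2, 6, 16
x = tf.random.normal((batch, seq_len, dim))
# hypothetical padding pattern: trailing positions of each sequence are padding
valid = tf.constant([[1, 1, 1, 1, 0, 0],
                     [1, 1, 1, 0, 0, 0]], dtype=tf.bool)

# combine the per-position mask into a (batch, query, key) self-attention mask
mask = valid[:, tf.newaxis, :] & valid[:, :, tf.newaxis]

mha = tf.keras.layers.MultiHeadAttention(num_heads=4, key_dim=dim)
out, scores = mha(query=x, value=x, attention_mask=mask,
                  return_attention_scores=True)
print(out.shape, scores.shape)  # (2, 6, 16) and (2, 4, 6, 6)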
8
votes
1 answer
Different `grad_fn` for similar looking operations in Pytorch (1.0)
I am working on an attention model, and before running the final model, I was going through the tensor shapes that flow through the code. I have an operation where I need to reshape the tensor. The tensor is of the shape torch.Size([30, 8, 9,…

abkds
- 1,764
- 7
- 27
- 43
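For context, a small sketch of why two similar-looking reshapes can show different grad_fn nodes: autograd records how the output memory was produced, not just its final shape (the shape below is loosely borrowed from the question; exact grad_fn names vary by PyTorch version).

import torch

x = torch.randn(30, 8, 9, 64, requires_grad=True)

a = x.view(30, 8, 9 * 64)                          # metadata-only change on a contiguous tensor
b = x.permute(0, 2, 1, 3).reshape(30, 9, 8 * 64)   # permute breaks contiguity, so reshape copies

print(a.grad_fn)   # e.g. <ViewBackward0>
print(b.grad_fn)   # e.g. <UnsafeViewBackward0> / <CloneBackward0>, depending on version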
7
votes
1 answer
Inputs to the nn.MultiheadAttention?
I have n vectors which need to be influenced by each other and output n vectors with the same dimensionality d. I believe this is what torch.nn.MultiheadAttention does. But the forward function expects query, key and value as inputs. According to this…

angryweasel
- 316
- 2
- 10
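A minimal self-attention sketch with nn.MultiheadAttention, using made-up sizes: for n vectors attending to each other, query, key and value are all the same tensor, and the default input layout is (seq_len, batch, embed_dim).

import torch
import torch.nn as nn

n, d, num_heads = 10, 64, 8       # hypothetical: 10 vectors of dimension 64
x = torch.randn(n, 1, d)          # (seq_len, batch, embed_dim)

mha = nn.MultiheadAttention(embed_dim=d, num_heads=num_heads)
out, weights = mha(query=x, key=x, value=x)

print(out.shape)      # torch.Size([10, 1, 64]) - n vectors, same dimensionality d
print(weights.shape)  # torch.Size([1, 10, 10]) - attention weights averaged over heads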
7
votes
2 answers
Sequence to Sequence - for time series prediction
I've tried to build a sequence-to-sequence model to predict a sensor signal over time based on its first few inputs (see figure below).
The model works OK, but I want to 'spice things up' and try to add an attention layer between the two LSTM…

Roni Gadot
- 437
- 2
- 19
- 30
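Not the asker's model, just a hedged functional-API sketch of one way to wire tf.keras.layers.Attention (dot-product attention) between an encoder and a decoder LSTM, with made-up window sizes:

import tensorflow as tf
from tensorflow.keras import layers

in_steps, out_steps, n_features, units = 30, 10, 1, 64   # hypothetical sizes

enc_in = layers.Input(shape=(in_steps, n_features))
enc_seq, state_h, state_c = layers.LSTM(units, return_sequences=True,
                                        return_state=True)(enc_in)

dec_in = layers.Input(shape=(out_steps, n_features))
dec_seq = layers.LSTM(units, return_sequences=True)(dec_in,
                                                    initial_state=[state_h, state_c])

# decoder states query the encoder states
context = layers.Attention()([dec_seq, enc_seq])
merged = layers.Concatenate()([dec_seq, context])
pred = layers.TimeDistributed(layers.Dense(n_features))(merged)

model = tf.keras.Model([enc_in, dec_in], pred)
model.compile(optimizer="adam", loss="mse")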
7
votes
0 answers
Implementing attention in Keras classification
I would like to add attention to a trained image-classification CNN model. For example, there are 30 classes and, with the Keras CNN, I obtain the predicted class for each image. However, to visualize the important features/locations of the…

TheJokerAEZ
- 361
- 1
- 3
- 9
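Not attention in the layer sense, but a common way to visualize the important locations of an already-trained CNN classifier is Grad-CAM; a rough sketch using a stock Keras model and a random stand-in image (the conv layer name "Conv_1" is specific to MobileNetV2):

import numpy as np
import tensorflow as tf

model = tf.keras.applications.MobileNetV2(weights="imagenet")
grad_model = tf.keras.Model(model.input,
                            [model.get_layer("Conv_1").output, model.output])

img = np.random.rand(1, 224, 224, 3).astype("float32")   # stand-in for a real image
with tf.GradientTape() as tape:
    conv_maps, preds = grad_model(img)
    class_score = tf.gather(preds, tf.argmax(preds[0]), axis=1)

grads = tape.gradient(class_score, conv_maps)      # d(score) / d(feature maps)
weights = tf.reduce_mean(grads, axis=(1, 2))       # global-average-pool the gradients
cam = tf.reduce_sum(conv_maps * weights[:, None, None, :], axis=-1)
cam = tf.nn.relu(cam) / (tf.reduce_max(cam) + 1e-8)   # normalised heatmap per image
print(cam.shape)   # (1, 7, 7) for MobileNetV2 - upsample and overlay on the input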
7
votes
2 answers
How to visualize attention weights?
Using this implementation
I have included attention in my RNN (which classifies the input sequences into two classes) as follows.
visible = Input(shape=(250,))
embed = Embedding(vocab_size, 100)(visible)
activations = keras.layers.GRU(250,…

Stupid420
- 1,347
- 3
- 19
- 44
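One way to expose the weights for plotting is to give the softmax layer a name and build a second Model that outputs it; a hedged, self-contained sketch of that pattern (layer sizes copied from the excerpt, everything else is made up):

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

vocab_size, seq_len = 5000, 250   # hypothetical vocab; sequence length from Input(shape=(250,))

visible = layers.Input(shape=(seq_len,))
embed = layers.Embedding(vocab_size, 100)(visible)
activations = layers.GRU(250, return_sequences=True)(embed)

scores = layers.Dense(1, activation="tanh")(activations)
scores = layers.Softmax(axis=1, name="attention_weights")(scores)    # one weight per timestep
context = layers.Flatten()(layers.Dot(axes=1)([scores, activations]))
output = layers.Dense(2, activation="softmax")(context)
model = tf.keras.Model(visible, output)

# second model that exposes the attention weights for visualization
attn_model = tf.keras.Model(model.input, model.get_layer("attention_weights").output)
weights = attn_model.predict(np.random.randint(0, vocab_size, size=(1, seq_len)))
print(weights.shape)   # (1, 250, 1) - ready to plot against the input tokens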
6
votes
1 answer
Keras, model trains successfully but generating predictions gives ValueError: Graph disconnected: cannot obtain value for tensor KerasTensor
I created a Seq2Seq model for text summarization. I have two models, one with attention and one without. The one without attention was able to generate predictions, but I can't for the one with attention even though it fits successfully.
This…

BlueMango
- 463
- 7
- 21
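For reference, a miniature reproduction (not the asker's model) of what "Graph disconnected" usually means here: the output tensor also depends on encoder tensors that are not reachable from the inputs handed to Model, which is why the attention variant fails at inference time.

import tensorflow as tf
from tensorflow.keras import layers

enc_in = layers.Input(shape=(10, 8))
enc_seq, h, c = layers.LSTM(32, return_sequences=True, return_state=True)(enc_in)

dec_in = layers.Input(shape=(None, 8))
dec_seq = layers.LSTM(32, return_sequences=True)(dec_in, initial_state=[h, c])
attn = layers.Attention()([dec_seq, enc_seq])

train_model = tf.keras.Model([enc_in, dec_in], attn)   # fine: all inputs listed

try:
    bad_model = tf.keras.Model(dec_in, attn)            # attn also needs enc_in
except ValueError as e:
    print(str(e)[:60])                                   # "Graph disconnected: ..."

The usual fix is to build the inference decoder as its own Model whose Inputs include the encoder outputs and states, and feed it the encoder's predictions step by step.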
6
votes
0 answers
Getting an error while converting code from TF1 to TF2
Where the values are
rnn_size: 512
batch_size: 128
rnn_inputs: Tensor("embedding_lookup/Identity_1:0", shape=(?, ?, 128), dtype=float32)
sequence_length: Tensor("inputs_length:0", shape=(?,), dtype=int32)
cell_fw:…

Args
- 73
- 5
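The excerpt's cell_fw / sequence_length / rnn_inputs suggest a TF1 tf.nn.bidirectional_dynamic_rnn call; a hedged sketch of the rough TF2 equivalent, with a made-up time dimension:

import tensorflow as tf

rnn_size, batch_size = 512, 128
rnn_inputs = tf.random.normal((batch_size, 20, 128))   # (batch, time, features); time is made up
sequence_length = tf.fill((batch_size,), 20)
mask = tf.sequence_mask(sequence_length, maxlen=20)

bi_rnn = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(rnn_size, return_sequences=True, return_state=True))
enc_output, fw_h, fw_c, bw_h, bw_c = bi_rnn(rnn_inputs, mask=mask)
print(enc_output.shape)   # (128, 20, 1024): forward and backward outputs concatenated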
6
votes
0 answers
Where should we put attention in an autoencoder?
In this tutorial on the TensorFlow site we can see code implementing an autoencoder whose Decoder is as follows:
class Decoder(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, dec_units, batch_sz):
        super(Decoder,…

Marzi Heidari
- 2,660
- 4
- 25
- 57
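For reference, the linked tutorial's pattern puts a Bahdanau-style attention layer inside the Decoder, between the encoder outputs and the decoder GRU; a condensed sketch of that attention layer (not a claim about where it must go in an autoencoder):

import tensorflow as tf

class BahdanauAttention(tf.keras.layers.Layer):
    def __init__(self, units):
        super().__init__()
        self.W1 = tf.keras.layers.Dense(units)
        self.W2 = tf.keras.layers.Dense(units)
        self.V = tf.keras.layers.Dense(1)

    def call(self, query, values):
        # query: decoder hidden state (batch, units); values: encoder outputs (batch, time, units)
        score = self.V(tf.nn.tanh(self.W1(values) + self.W2(tf.expand_dims(query, 1))))
        weights = tf.nn.softmax(score, axis=1)
        context = tf.reduce_sum(weights * values, axis=1)   # weighted sum of encoder outputs
        return context, weights

In the tutorial's Decoder.call, the context vector is concatenated with the embedded decoder input before the GRU, so the attention sits between the encoder outputs and the decoder RNN.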
6
votes
2 answers
How can I add tf.keras.layers.AdditiveAttention in my model?
I am working on a machine-translation problem. The model I am using is:
Model = Sequential([
    Embedding(english_vocab_size, 256, input_length=english_max_len, mask_zero=True),
    LSTM(256, activation='relu'),
    …
user14349917
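AdditiveAttention takes two inputs ([query, value]), so it does not drop into a single-input Sequential stack; a hedged functional-API sketch with made-up vocabulary sizes:

import tensorflow as tf
from tensorflow.keras import layers

english_vocab_size, target_vocab_size = 8000, 9000    # hypothetical
english_max_len, target_max_len = 20, 20              # hypothetical

enc_in = layers.Input(shape=(english_max_len,))
enc_emb = layers.Embedding(english_vocab_size, 256, mask_zero=True)(enc_in)
enc_seq, enc_h, enc_c = layers.LSTM(256, return_sequences=True, return_state=True)(enc_emb)

dec_in = layers.Input(shape=(target_max_len,))
dec_emb = layers.Embedding(target_vocab_size, 256, mask_zero=True)(dec_in)
dec_seq = layers.LSTM(256, return_sequences=True)(dec_emb, initial_state=[enc_h, enc_c])

context = layers.AdditiveAttention()([dec_seq, enc_seq])   # Bahdanau-style scores
out = layers.Dense(target_vocab_size, activation="softmax")(
    layers.Concatenate()([dec_seq, context]))

model = tf.keras.Model([enc_in, dec_in], out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")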
6
votes
1 answer
Implementing Luong Attention in PyTorch
I am trying to implement the attention described in Luong et al. 2015 in PyTorch myself, but I couldn't get it to work. Below is my code; I am only interested in the "general" attention case for now. I wonder if I am missing any obvious error. It runs,…

zyxue
- 7,904
- 5
- 48
- 74
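A compact sketch of the "general" score from Luong et al. 2015, score(h_t, h_s) = h_t^T W_a h_s, with made-up sizes:

import torch
import torch.nn as nn
import torch.nn.functional as F

class LuongGeneralAttention(nn.Module):
    def __init__(self, hidden_size):
        super().__init__()
        self.W_a = nn.Linear(hidden_size, hidden_size, bias=False)

    def forward(self, decoder_hidden, encoder_outputs):
        # decoder_hidden: (batch, 1, hidden); encoder_outputs: (batch, src_len, hidden)
        scores = torch.bmm(self.W_a(decoder_hidden), encoder_outputs.transpose(1, 2))
        weights = F.softmax(scores, dim=-1)              # (batch, 1, src_len)
        context = torch.bmm(weights, encoder_outputs)    # (batch, 1, hidden)
        return context, weights

attn = LuongGeneralAttention(128)
ctx, w = attn(torch.randn(4, 1, 128), torch.randn(4, 10, 128))
print(ctx.shape, w.shape)   # torch.Size([4, 1, 128]) torch.Size([4, 1, 10])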
6
votes
0 answers
Attention in Tensorflow (tf.contrib.rnn.AttentionCellWrapper)
How exactly is tf.contrib.rnn.AttentionCellWrapper used? Can someone give a piece of example code?
Specifically, I only managed to make the following
fwd_cell =…

user3373273
- 61
- 1
- 3
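A short TF 1.x sketch (tf.contrib was removed in TF 2.x), with made-up sizes, of wrapping a cell and running it through dynamic_rnn:

import tensorflow as tf   # TF 1.x only

num_units, attn_length = 128, 16   # hypothetical sizes

fwd_cell = tf.contrib.rnn.BasicLSTMCell(num_units, state_is_tuple=True)
fwd_cell = tf.contrib.rnn.AttentionCellWrapper(fwd_cell, attn_length,
                                               state_is_tuple=True)

inputs = tf.placeholder(tf.float32, [None, None, 64])   # (batch, time, features)
seq_len = tf.placeholder(tf.int32, [None])
outputs, state = tf.nn.dynamic_rnn(fwd_cell, inputs,
                                   sequence_length=seq_len, dtype=tf.float32)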
6
votes
0 answers
How to load a matrix to change the attention layer in seqToseq demo? - Paddle
While attempting to replicate Section 3.1 of Incorporating Discrete Translation Lexicons into Neural MT in paddle-paddle,
I tried to create a static matrix that I need to load into the seqToseq training pipeline, e.g.:
>>> import numpy as np
>>>…

alvas
- 115,346
- 109
- 446
- 738