Questions regarding the attention mechanism in deep learning
Questions tagged [attention-model]
389 questions
3
votes
1 answer
Why does Keras not return the full sequence of cell states in the LSTM layer?
I am trying to implement an attention mechanism for which I need the full sequence of cell states (just like the full sequence of hidden states). The Keras LSTM, however, only returns the last cell state:
output, state_h, state_c = layers.LSTM(units=45,…

bcsta
- 1,963
- 3
- 22
- 61
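A minimal sketch of one workaround (not the asker's code; the 45 units, 10 time steps and batch size are assumed): drive tf.keras.layers.LSTMCell manually over time and stack the per-step cell states yourself.

import tensorflow as tf

units, timesteps, features, batch = 45, 10, 8, 2
x = tf.random.normal((batch, timesteps, features))             # (batch, time, features)

cell = tf.keras.layers.LSTMCell(units)
state = [tf.zeros((batch, units)), tf.zeros((batch, units))]   # initial [h0, c0]

h_seq, c_seq = [], []
for t in range(timesteps):
    _, state = cell(x[:, t, :], state)   # state is [h_t, c_t] after each step
    h_seq.append(state[0])
    c_seq.append(state[1])

h_seq = tf.stack(h_seq, axis=1)          # (batch, time, units), like return_sequences=True
c_seq = tf.stack(c_seq, axis=1)          # (batch, time, units), the full cell-state sequence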
3
votes
2 answers
What should be the Query Q, Key K and Value V vectors/matrices in torch.nn.MultiheadAttention?
Following an amazing blog, I implemented my own self-attention module. However, I found that PyTorch has already implemented a multi-head attention module. The input to the forward pass of the MultiheadAttention module includes Q (which is the query vector)…

PinkBanter
- 1,686
- 5
- 17
- 38
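For reference, a minimal self-attention call with nn.MultiheadAttention (the embedding size and head count below are made up, and batch_first assumes a reasonably recent PyTorch): query, key and value are all the same sequence, whereas in cross-attention the key and value would come from the other sequence.

import torch
import torch.nn as nn

embed_dim, num_heads = 64, 8
mha = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)

x = torch.randn(2, 10, embed_dim)                        # (batch, seq_len, embed_dim)
attn_out, attn_weights = mha(query=x, key=x, value=x)    # self-attention: Q = K = V = x
print(attn_out.shape, attn_weights.shape)                # (2, 10, 64) and (2, 10, 10)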
3
votes
1 answer
Should the queries, keys and values of the transformer be split before or after being passed through the linear layers?
I have seen two different implementations of Multi-Head Attention.
In one of the approaches the queries, keys and values are split into heads before being passed through the linear layers as shown below:
def split_heads(self, x, batch_size):
…

Kinyugo
- 429
- 1
- 4
- 11
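The more common ordering, sketched below with illustrative names and sizes (not taken from either implementation in the question), is to project with a full d_model-by-d_model linear layer first and only then reshape into heads; because the projection is linear, this is equivalent to giving each head its own smaller projection of the whole input, whereas splitting before the linear layers restricts each head to a slice of the embedding.

import torch
import torch.nn as nn

class HeadSplitter(nn.Module):
    """Project first, then split into heads (the usual Transformer ordering)."""
    def __init__(self, d_model=64, num_heads=8):
        super().__init__()
        assert d_model % num_heads == 0
        self.num_heads, self.d_head = num_heads, d_model // num_heads
        self.wq = nn.Linear(d_model, d_model)   # one projection over the full model dim

    def forward(self, x):                        # x: (batch, seq, d_model)
        b, t, _ = x.shape
        q = self.wq(x)                           # the linear layer sees the whole embedding
        # reshape afterwards into (batch, num_heads, seq, d_head)
        return q.view(b, t, self.num_heads, self.d_head).transpose(1, 2)

q_heads = HeadSplitter()(torch.randn(2, 5, 64))  # -> (2, 8, 5, 8)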
3
votes
1 answer
Loading a pre-trained Attention model in Keras with custom_objects
I am loading a pre-trained attention model in Keras using load_model().
My Attention class is defined below.
# attention class
from keras.engine.topology import Layer
from keras import initializers, regularizers, constraints
from keras import…

der_radler
- 549
- 1
- 6
- 17
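The usual fix is to pass the class to load_model via custom_objects under the same name it was saved with. The sketch below is self-contained but uses a stand-in Scale layer and a hypothetical file name rather than the asker's Attention class.

import tensorflow as tf
from tensorflow.keras import layers, models

class Scale(layers.Layer):                       # stand-in for the custom Attention layer
    def __init__(self, factor=2.0, **kwargs):
        super().__init__(**kwargs)
        self.factor = factor

    def call(self, x):
        return x * self.factor

    def get_config(self):                        # needed so save/load can rebuild the layer
        return {**super().get_config(), 'factor': self.factor}

model = models.Sequential([tf.keras.Input(shape=(4,)), Scale(3.0)])
model.save('with_custom_layer.h5')               # hypothetical path

# The key point: register the class under the saved name via custom_objects.
restored = models.load_model('with_custom_layer.h5', custom_objects={'Scale': Scale})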
3
votes
2 answers
What do the input layers represent in a Hierarchical Attention Network?
I'm trying to grasp the idea of a Hierarchical Attention Network (HAN); most of the code I find online is more or less similar to the one here: https://medium.com/jatana/report-on-text-classification-using-cnn-rnn-han-f0e887214d5f…

amrnablus
- 237
- 1
- 3
- 12
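As a rough sketch of what the two Input layers usually mean in a HAN (all dimensions below are made up): the inner Input is a single sentence, a vector of word indices, and the outer Input is a document, a matrix of sentences, with TimeDistributed applying the sentence encoder to every row.

import tensorflow as tf
from tensorflow.keras import layers, models

MAX_SENTS, MAX_WORDS, VOCAB, EMB = 15, 40, 20000, 100

# Sentence-level encoder: its Input is one sentence of MAX_WORDS word ids.
sent_in = layers.Input(shape=(MAX_WORDS,), dtype='int32')
emb = layers.Embedding(VOCAB, EMB)(sent_in)
sent_vec = layers.Bidirectional(layers.GRU(64))(emb)      # word-level attention omitted for brevity
sent_encoder = models.Model(sent_in, sent_vec)

# Document-level encoder: its Input is one document of MAX_SENTS such sentences.
doc_in = layers.Input(shape=(MAX_SENTS, MAX_WORDS), dtype='int32')
sent_vecs = layers.TimeDistributed(sent_encoder)(doc_in)  # encode every sentence
doc_vec = layers.Bidirectional(layers.GRU(64))(sent_vecs)
out = layers.Dense(5, activation='softmax')(doc_vec)
han = models.Model(doc_in, out)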
3
votes
0 answers
Implementing a simple attention mechanism in Keras
I want to implement a simple attention mechanism to ensemble the results of a CNN model.
Concretely, each example of my input is a sequence of images, so each example has shape [None, img_width, img_height, n_channels].
Using a TimeDistributed…

Jsevillamol
- 2,425
- 2
- 23
- 46
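One lightweight way to do this, sketched with placeholder shapes (a fixed number of frames and a toy CNN): score each TimeDistributed feature vector with a Dense(1), softmax the scores over the time axis, and take the weighted sum of the per-frame features.

import tensorflow as tf
from tensorflow.keras import layers, models

T, H, W, C = 8, 32, 32, 3
frames = layers.Input(shape=(T, H, W, C))                  # sequence of images

cnn = models.Sequential([                                  # toy per-frame CNN
    layers.Conv2D(16, 3, activation='relu'),
    layers.GlobalAveragePooling2D(),
])
feats = layers.TimeDistributed(cnn)(frames)                # (batch, T, 16)

scores = layers.Dense(1)(feats)                            # (batch, T, 1) unnormalized scores
weights = layers.Softmax(axis=1)(scores)                   # attention over the T frames
context = layers.Lambda(lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([feats, weights])

out = layers.Dense(10, activation='softmax')(context)      # ensembled prediction
model = models.Model(frames, out)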
3
votes
1 answer
Pytorch softmax along different masks without for loop
Say I have a vector a with an index vector b of the same length. The indices are in the range 0 to N-1, corresponding to N groups. How can I compute a softmax for every group without a for loop?
I'm doing some sort of attention operation here. The numbers for…

Zhang Yu
- 559
- 6
- 15
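A loop-free sketch under the assumed names a (scores) and b (group ids in [0, N)): index_add_ accumulates the exponentials per group, and indexing the sums with b normalizes each element within its own group. Subtracting the global max is enough for stability because softmax is shift-invariant inside each group.

import torch

N = 3
a = torch.tensor([0.5, 2.0, 1.0, -1.0, 0.3])          # scores
b = torch.tensor([0, 1, 0, 2, 1])                     # group id of each element

exp_a = (a - a.max()).exp()
group_sums = torch.zeros(N).index_add_(0, b, exp_a)   # sum of exponentials per group
softmax_per_group = exp_a / group_sums[b]             # softmax within each group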
3
votes
0 answers
Exhaustive concatenation between tensors
I am trying to do an exhaustive concatenation between tensors. So, for example,
I have tensor:
a = torch.randn(3, 512)
I want to concatenate like
concat(t1,t1),concat(t1,t2), concat(t1,t3), concat(t2,t1), concat(t2,t2)....
As a naive…

amy
- 342
- 1
- 5
- 18
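One loop-free way to get all ordered pairs (a sketch; the (3, 3, 1024) layout and the flattened variant are just two common choices) is to broadcast the tensor against itself and concatenate along the feature dimension.

import torch

a = torch.randn(3, 512)
n, d = a.shape

left = a.unsqueeze(1).expand(n, n, d)       # row i repeated across dim 1
right = a.unsqueeze(0).expand(n, n, d)      # row j repeated across dim 0
pairs = torch.cat([left, right], dim=-1)    # (3, 3, 1024); pairs[i, j] = concat(t_i, t_j)
pairs_flat = pairs.reshape(n * n, 2 * d)    # (9, 1024) if a flat list of pairs is preferred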
3
votes
1 answer
Self-Attention GAN in Keras
I'm currently considering implementing the Self-Attention GAN in Keras.
The way I'm thinking of implementing it is as follows:
def Attention(X, channels):
    def hw_flatten(x):
        return np.reshape(x, (x.shape[0], -1, x.shape[-1]))
    f =…

Hao Chen
- 174
- 1
- 4
- 13
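One pitfall visible in that snippet is np.reshape, which cannot operate on symbolic Keras tensors; the flattening and the attention matmuls need tensor ops. A rough TF2-style sketch (layer sizes are illustrative and the learnable gamma of SAGAN is omitted):

import tensorflow as tf
from tensorflow.keras import layers, models

def hw_flatten(x):
    # (batch, H, W, C) -> (batch, H*W, C), keeping the batch dimension symbolic
    return tf.reshape(x, (tf.shape(x)[0], -1, x.shape[-1]))

def self_attention_block(x, channels):
    f = layers.Conv2D(channels // 8, 1)(x)                           # key
    g = layers.Conv2D(channels // 8, 1)(x)                           # query
    h = layers.Conv2D(channels, 1)(x)                                # value
    s = tf.matmul(hw_flatten(g), hw_flatten(f), transpose_b=True)    # (batch, N, N)
    beta = tf.nn.softmax(s, axis=-1)                                 # attention map
    o = tf.reshape(tf.matmul(beta, hw_flatten(h)), tf.shape(x))      # back to (batch, H, W, C)
    return layers.Add()([x, o])                                      # residual connection (gamma omitted)

inp = layers.Input(shape=(32, 32, 64))
sagan_block = models.Model(inp, self_attention_block(inp, 64))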
3
votes
2 answers
LSTM with Attention
I am trying to add an attention mechanism to the stacked LSTM implementation at https://github.com/salesforce/awd-lstm-lm
All examples online use an encoder-decoder architecture, which I do not want to use (do I have to use one for the attention mechanism?).
Basically,…

Boris Mocialov
- 3,439
- 2
- 28
- 55
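Attention does not require an encoder-decoder pair: over a plain (stacked) LSTM the final hidden state can act as the query attending over all time steps. A rough PyTorch sketch with made-up sizes, not tied to the awd-lstm-lm code:

import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentiveLSTM(nn.Module):
    def __init__(self, vocab=1000, emb=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb, hidden, num_layers=2, batch_first=True)
        self.score = nn.Linear(hidden, hidden)    # bilinear-style scoring
        self.out = nn.Linear(2 * hidden, vocab)

    def forward(self, tokens):                    # tokens: (batch, seq)
        outputs, (h_n, _) = self.lstm(self.embed(tokens))
        query = h_n[-1]                           # final hidden state of the top layer
        scores = torch.bmm(self.score(outputs), query.unsqueeze(2))   # (batch, seq, 1)
        weights = F.softmax(scores, dim=1)
        context = (weights * outputs).sum(dim=1)  # attention-weighted summary of all steps
        return self.out(torch.cat([context, query], dim=-1))

logits = AttentiveLSTM()(torch.randint(0, 1000, (4, 12)))   # (4, 1000)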
3
votes
1 answer
What does the "source hidden state" refer to in the Attention Mechanism?
The attention weights are computed as:
I want to know what h_s refers to.
In the tensorflow code, the encoder RNN returns a tuple:
encoder_outputs, encoder_state = tf.nn.dynamic_rnn(...)
I think h_s should be the encoder_state, but the…

imhuay
- 271
- 1
- 2
- 11
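In this kind of (Luong-style) setup, h_s usually denotes the encoder's hidden state at each source position, i.e. the full encoder_outputs tensor from dynamic_rnn, while encoder_state is only the final step. A small NumPy sketch of dot-product attention with made-up sizes:

import numpy as np

T_src, units = 7, 32
encoder_outputs = np.random.randn(T_src, units)   # the h_s: one hidden state per source step
h_t = np.random.randn(units)                      # current decoder (target) hidden state

scores = encoder_outputs @ h_t                    # score(h_t, h_s) for every source position
weights = np.exp(scores - scores.max())
weights /= weights.sum()                          # attention weights over the source
context = weights @ encoder_outputs               # weighted sum of the h_s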
3
votes
1 answer
How to use the output of an attention wrapper applied over an LSTM as input to a TimeDistributed layer in Keras?
I have been trying to implement an attention wrapper over the output of the LSTM model shown in this machinelearningmastery tutorial:
from numpy import array
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import…

Saurav--
- 1,530
- 2
- 15
- 33
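A rough Keras sketch of one way to wire this up (sizes are placeholders, not the tutorial's): keep return_sequences=True on both LSTMs, compute dot-product attention with the built-in Attention layer, and concatenate the per-step contexts with the decoder outputs before the TimeDistributed classifier.

import tensorflow as tf
from tensorflow.keras import layers, models

T, n_features, units = 5, 10, 50

enc_in = layers.Input(shape=(T, n_features))
enc_seq = layers.LSTM(units, return_sequences=True)(enc_in)     # keys / values
dec_seq = layers.LSTM(units, return_sequences=True)(enc_seq)    # queries

context = layers.Attention()([dec_seq, enc_seq])                # (batch, T, units)
merged = layers.Concatenate()([dec_seq, context])               # per-step context + decoder state
out = layers.TimeDistributed(layers.Dense(n_features, activation='softmax'))(merged)

model = models.Model(enc_in, out)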
3
votes
1 answer
How to modify the Tensorflow Sequence2Sequence model to implement Bidirectional LSTM rather than Unidirectional one?
Refer to this post to know the background of the problem:
Does the TensorFlow embedding_attention_seq2seq method implement a bidirectional RNN Encoder by default?
I am working on the same model, and want to replace the unidirectional LSTM layer with…

Leena Shekhar
- 31
- 3
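A minimal TF 1.x-style sketch of the encoder change (cell and embedding sizes are illustrative, and this is not the embedding_attention_seq2seq code itself): replace the unidirectional encoder with tf.nn.bidirectional_dynamic_rnn and concatenate the forward/backward tensors so the attention mechanism sees both directions.

import tensorflow as tf  # TF 1.x API assumed

num_units = 128
encoder_inputs = tf.placeholder(tf.float32, [None, None, 64])   # (batch, time, emb)

cell_fw = tf.nn.rnn_cell.LSTMCell(num_units)
cell_bw = tf.nn.rnn_cell.LSTMCell(num_units)

(out_fw, out_bw), (state_fw, state_bw) = tf.nn.bidirectional_dynamic_rnn(
    cell_fw, cell_bw, encoder_inputs, dtype=tf.float32)

# Attention memory now carries both directions: (batch, time, 2 * num_units).
encoder_outputs = tf.concat([out_fw, out_bw], axis=-1)

# One common choice for the decoder's initial state: concatenate c and h as well
# (the decoder cell then needs 2 * num_units).
encoder_state = tf.nn.rnn_cell.LSTMStateTuple(
    c=tf.concat([state_fw.c, state_bw.c], axis=-1),
    h=tf.concat([state_fw.h, state_bw.h], axis=-1))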
3
votes
1 answer
Attention mechanism for sequence classification (seq2seq tensorflow r1.1)
I am trying to build a bidirectional RNN with an attention mechanism for sequence classification. I am having some issues understanding the helper function. I have seen that the one used for training needs the decoder inputs, but as I want a single…

JJChickpeaboy
- 55
- 1
- 5
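For a single label per sequence, no decoder (and hence no Helper) is needed at all: a learned query vector can attend over the bidirectional encoder outputs and the pooled vector feeds a classifier. A minimal TF 1.x-style sketch with made-up sizes:

import tensorflow as tf  # TF 1.x API assumed

num_units, num_classes = 128, 5
inputs = tf.placeholder(tf.float32, [None, None, 64])            # (batch, time, emb)

cell_fw = tf.nn.rnn_cell.GRUCell(num_units)
cell_bw = tf.nn.rnn_cell.GRUCell(num_units)
(out_fw, out_bw), _ = tf.nn.bidirectional_dynamic_rnn(
    cell_fw, cell_bw, inputs, dtype=tf.float32)
states = tf.concat([out_fw, out_bw], axis=-1)                    # (batch, time, 2*units)

query = tf.get_variable('attention_query', [2 * num_units])      # learned query vector
scores = tf.tensordot(states, query, axes=[[2], [0]])            # (batch, time)
weights = tf.nn.softmax(scores)                                  # attention over time steps
context = tf.reduce_sum(states * tf.expand_dims(weights, -1), axis=1)

logits = tf.layers.dense(context, num_classes)                   # one prediction per sequence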
2
votes
0 answers
In the sequential recommendation model TiSASRec, why are the results of the baseline model SASRec inconsistent with the actual results?
I am a novice in recommender systems. Recently, I was reading a paper related to sequential recommendation. While running the official sample code of TiSASRec, I used the dataset given in the GitHub repo, removing the ratings and…

jie Zhou
- 21
- 2