Questions tagged [attention-model]

Questions about the attention mechanism in deep learning models

389 questions
0 votes · 1 answer

Implementing self-attention

I am trying to implement self-attention in PyTorch. I need to calculate the following expressions: a similarity function S (2-dimensional), P (2-dimensional), and C', where S[i][j] = W1 * inp[i] + W2 * inp[j] + W3 * x1[i] * inp[j] and P[i][j] = e^(S[i][j]) / Sum for…
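A minimal PyTorch sketch of the expressions in the excerpt above, under the assumptions that x1 is the same sequence as inp, that W1, W2 and W3 are learned vectors producing one scalar score per (i, j) pair, and that the softmax in P normalises over j:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PairwiseSelfAttention(nn.Module):
    """Sketch: S[i][j] = w1·inp[i] + w2·inp[j] + w3·(inp[i] ⊙ inp[j]),
    P = row-wise softmax of S, context C' = P @ inp.
    The feature size d and parameter shapes are assumptions."""
    def __init__(self, d):
        super().__init__()
        self.w1 = nn.Linear(d, 1, bias=False)   # scores the i-th vector
        self.w2 = nn.Linear(d, 1, bias=False)   # scores the j-th vector
        self.w3 = nn.Parameter(torch.randn(d))  # weights the elementwise product term

    def forward(self, inp):                     # inp: (seq_len, d)
        s = (self.w1(inp)                       # (seq_len, 1), broadcasts over j
             + self.w2(inp).transpose(0, 1)     # (1, seq_len), broadcasts over i
             + (inp * self.w3) @ inp.t())       # (seq_len, seq_len) product term
        p = F.softmax(s, dim=1)                 # P[i][j], normalised over j
        return p @ inp                          # context C'

# usage
attn = PairwiseSelfAttention(d=16)
out = attn(torch.randn(10, 16))                 # -> (10, 16)
```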
0 votes · 2 answers

Getting alignment/attention during translation in OpenNMT-py

Does anyone know how to get the alignment weights when translating in OpenNMT-py? Usually the only output is the resulting sentences, and I have tried to find a debugging flag or similar for the attention weights. So far, I have been unsuccessful.
0 votes · 2 answers

How to make up an FSNS dataset with my own images for the attention OCR TensorFlow model

I want to apply attention-ocr to detect all digits on car number plates. I've read the README.md of attention_ocr on GitHub (https://github.com/tensorflow/models/tree/master/research/attention_ocr), and also the way I should use my own…
0 votes · 1 answer

Luong attention and Bahdanau attention: when should we use Luong or Bahdanau?

I'm kind of new to machine learning concepts, especially machine translation. I've read about Luong's attention and Bahdanau's attention. Luong's is said to be “multiplicative” while Bahdanau's is “additive”, but I still don't know which one is better…
MaybeNextTime · 561 · 5 · 11
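For context, the two papers differ mainly in how the score between a decoder state h_t and the encoder states h_s is computed; neither is universally better, and the usual advice is to try both. A minimal PyTorch sketch of the two scoring functions, with the hidden size d and the (batch, src_len, d) shapes as assumptions:

```python
import torch
import torch.nn as nn

class LuongScore(nn.Module):
    """Multiplicative ("general") score: score(h_t, h_s) = h_t^T W h_s."""
    def __init__(self, d):
        super().__init__()
        self.W = nn.Linear(d, d, bias=False)

    def forward(self, h_t, h_s):      # h_t: (batch, d), h_s: (batch, src_len, d)
        return torch.bmm(h_s, self.W(h_t).unsqueeze(2)).squeeze(2)   # (batch, src_len)

class BahdanauScore(nn.Module):
    """Additive score: score(h_t, h_s) = v^T tanh(W1 h_t + W2 h_s)."""
    def __init__(self, d):
        super().__init__()
        self.W1 = nn.Linear(d, d, bias=False)
        self.W2 = nn.Linear(d, d, bias=False)
        self.v = nn.Linear(d, 1, bias=False)

    def forward(self, h_t, h_s):
        return self.v(torch.tanh(self.W1(h_t).unsqueeze(1) + self.W2(h_s))).squeeze(2)

# usage: scores over source positions, then softmax -> attention weights
h_t, h_s = torch.randn(2, 64), torch.randn(2, 7, 64)
w_luong = torch.softmax(LuongScore(64)(h_t, h_s), dim=-1)       # (2, 7)
w_bahdanau = torch.softmax(BahdanauScore(64)(h_t, h_s), dim=-1)
```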
0 votes · 1 answer

How are parameters set for the config in attention-based models?

There are a few parameters in the config that I am unsure how to set, particularly when I change max_len, hidden_size or embedding_size: config = { "max_len": 64, "hidden_size": 64, "vocab_size": vocab_size, "embedding_size": 128, "n_class": 15, …
HumanTorch · 349 · 2 · 5 · 16
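A hedged illustration of how such config fields typically map onto layer shapes in an attention-based text classifier; the layers below are hypothetical and are not taken from the question's model. Note that max_len only controls padding/truncation of the input and adds no parameters, whereas vocab_size, embedding_size, hidden_size and n_class all change weight shapes:

```python
import torch.nn as nn

config = {
    "max_len": 64,        # sequences are padded/truncated to this length; no extra parameters
    "hidden_size": 64,    # width of the encoder / attention hidden states
    "vocab_size": 10000,  # placeholder: taken from the tokenizer in practice
    "embedding_size": 128,
    "n_class": 15,
}

embedding = nn.Embedding(config["vocab_size"], config["embedding_size"])
encoder = nn.LSTM(config["embedding_size"], config["hidden_size"], batch_first=True)
attention_score = nn.Linear(config["hidden_size"], 1)              # one score per time step
classifier = nn.Linear(config["hidden_size"], config["n_class"])   # attended vector -> class logits
```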
0 votes · 1 answer

Attention in Keras: how to add different attention mechanisms to a Keras Dense layer?

I am new to Keras and I am trying to build a simple autoencoder in Keras with attention layers. Here is what I tried: data = Input(shape=(w,), dtype=np.float32, name='input_da') noisy_data = Dropout(rate=0.2, name='drop1')(data) encoded =…
Aaditya Ura · 12,007 · 7 · 50 · 88
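A minimal sketch, not the asker's network, of one common way to bolt soft attention (a learned softmax weighting over the features) onto a Keras denoising autoencoder; the input width w = 128 and the bottleneck size 32 are illustrative assumptions:

```python
from tensorflow.keras.layers import Input, Dense, Dropout, Multiply, Softmax
from tensorflow.keras.models import Model

w = 128
data = Input(shape=(w,), dtype='float32', name='input_da')
noisy_data = Dropout(rate=0.2, name='drop1')(data)

attn_scores = Dense(w, name='attn_scores')(noisy_data)    # one score per input feature
attn_weights = Softmax(name='attn_weights')(attn_scores)  # normalise so the weights sum to 1
attended = Multiply(name='attended')([noisy_data, attn_weights])

encoded = Dense(32, activation='relu', name='encoded')(attended)
decoded = Dense(w, activation='linear', name='decoded')(encoded)

autoencoder = Model(data, decoded)
autoencoder.compile(optimizer='adam', loss='mse')
```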
0 votes · 1 answer

Model size too big with my attention model implementation?

I am implementing Minh-Thang Luong's attention model to build an English-to-Chinese translator, and the model I trained has an abnormally big size (980 MB). These are the model parameters from Minh-Thang Luong's original paper: state size: 120, source language…
abracadabra · 371 · 2 · 16
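A back-of-the-envelope sketch of why NMT checkpoints get large: the embedding and output-projection matrices scale with vocabulary size, and optimizers such as Adam store extra slots per parameter in the checkpoint. The vocabulary sizes below are illustrative assumptions, not the asker's numbers:

```python
state_size = 120
src_vocab, tgt_vocab = 50_000, 50_000               # assumed vocabulary sizes

embeddings = (src_vocab + tgt_vocab) * state_size   # encoder + decoder embedding tables
output_proj = tgt_vocab * state_size                # projection onto the target vocabulary
params = embeddings + output_proj                   # recurrent/attention weights add comparatively little here

bytes_per_param = 4                                 # float32
adam_slots = 2                                      # Adam keeps two extra tensors per parameter
print(params * bytes_per_param * (1 + adam_slots) / 1e6, "MB")
```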
0 votes · 1 answer

Backpropagation in Attention Model

I am trying to figure out how to do backpropagation through the scaled dot-product attention model. Scaled dot-product attention takes Q (queries), K (keys) and V (values) as inputs and performs the following operation: Attention(Q, K, V) =…
cherry13 · 11 · 3
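For context, scaled dot-product attention is Attention(Q, K, V) = softmax(QKᵀ / √d_k) V. When it is composed from differentiable primitives, an autograd framework derives the backward pass automatically; a minimal PyTorch sketch with assumed (batch, seq_len, d_k) shapes:

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / math.sqrt(d_k)   # (batch, len_q, len_k)
    weights = F.softmax(scores, dim=-1)
    return weights @ V                                   # (batch, len_q, d_v)

Q = torch.randn(2, 5, 8, requires_grad=True)
K = torch.randn(2, 5, 8, requires_grad=True)
V = torch.randn(2, 5, 8, requires_grad=True)
out = scaled_dot_product_attention(Q, K, V)
out.sum().backward()          # gradients w.r.t. Q, K, V via autograd
```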
0 votes · 0 answers

ValueError: Dimensions must be equal, but are 49152 and 64 for 'Attention_0/add' (op: 'Add')

I want to try to replace the contents of the encoder and decoder in this GitHub code (i.e., in dcrnn_model.py line 83) with an encoder and attention decoder. This is the code before the encoder-decoder: max_diffusion_step =…
0 votes · 1 answer

Keras: how to add an attention layer for a weighted sum

I have the following network architecture (only the relevant part of the network is shown below): vocab_dimension = 1500 embed_dimension = 10 x = [Input(shape=(None, ), name='input', dtype='int32'), Input(shape=(None, ), name='weights'), …
Brian · 13,996 · 19 · 70 · 94
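A minimal sketch, illustrative rather than the asker's full network, of an attention layer that scores each time step with a Dense(1), softmaxes the scores over time, and returns the weighted sum of the embeddings; the final sigmoid head is an assumption:

```python
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Input, Embedding, Dense, Lambda
from tensorflow.keras.models import Model

vocab_dimension = 1500
embed_dimension = 10

tokens = Input(shape=(None,), name='input', dtype='int32')
emb = Embedding(vocab_dimension, embed_dimension)(tokens)            # (batch, time, embed)

scores = Dense(1, name='attn_score')(emb)                            # (batch, time, 1)
weights = Lambda(lambda s: K.softmax(s, axis=1), name='attn_weights')(scores)
context = Lambda(lambda t: K.sum(t[0] * t[1], axis=1),               # weighted sum over time
                 name='context')([emb, weights])                     # (batch, embed)

model = Model(tokens, Dense(1, activation='sigmoid')(context))
```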
0 votes · 0 answers

How can I efficiently extract sub-images from a batch of images with each sub-image at a different location?

I need to extract (m x m) sub-images from a batch of (n x n) images, where: images.shape = (batch, n, n, n_channels), sub-images.shape = (batch, m, m, n_channels), and where each sub-image is in a different location for each image in the batch.…
John J · 47 · 2 · 6
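One framework-agnostic way to do this is advanced integer indexing with per-image row/column index grids. A minimal NumPy sketch (the variable names and the choice of NumPy are assumptions; the same indexing pattern carries over to tf.gather_nd or PyTorch indexing):

```python
import numpy as np

batch, n, m, n_channels = 4, 32, 8, 3
images = np.random.rand(batch, n, n, n_channels)
tops = np.random.randint(0, n - m, size=batch)     # per-image top row of the window
lefts = np.random.randint(0, n - m, size=batch)    # per-image left column of the window

rows = tops[:, None, None] + np.arange(m)[None, :, None]    # (batch, m, 1)
cols = lefts[:, None, None] + np.arange(m)[None, None, :]   # (batch, 1, m)
sub_images = images[np.arange(batch)[:, None, None], rows, cols]   # (batch, m, m, n_channels)
```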
0 votes · 0 answers

How to implement attention for a sequence-to-sequence model in Keras, step by step?

How do I implement attention for a sequence-to-sequence model in Keras? I understand this seq2seq model, but I want to add attention as in Fig. B (shown in the attached seq2seq link). Please explain step by step.
Mr.Beans · 1 · 2
0 votes · 1 answer

PyTorch: how to implement attention for a graph attention layer

I have implemented the attention (Eq. 1) of https://arxiv.org/pdf/1710.10903.pdf, but it's clearly not memory-efficient and I can run only a single model on my GPU (it takes 7-10 GB). Currently, I have: class MyModule(nn.Module): def __init__(self,…
XogoX · 139 · 3 · 14
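The memory blow-up usually comes from materialising the full N×N score matrix; computing e_ij only for the edges that exist keeps it O(E). A minimal PyTorch sketch of Eq. 1 in edge-list form, with edge_index and the layer sizes as assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EdgeGATAttention(nn.Module):
    """Eq. 1 of the GAT paper per edge: e_ij = LeakyReLU(a^T [W h_i || W h_j]),
    followed by a softmax over the edges pointing at each destination node."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a_src = nn.Linear(out_dim, 1, bias=False)  # first half of the attention vector a
        self.a_dst = nn.Linear(out_dim, 1, bias=False)  # second half of a

    def forward(self, h, edge_index):                   # h: (N, in_dim), edge_index: (2, E)
        Wh = self.W(h)
        src, dst = edge_index
        e = F.leaky_relu(self.a_src(Wh[src]) + self.a_dst(Wh[dst]), 0.2).squeeze(1)  # (E,)
        # edge softmax per destination node (subtract a per-node max first for
        # numerical stability in real code)
        e = e.exp()
        denom = torch.zeros(h.size(0), device=h.device).index_add_(0, dst, e)
        return e / denom[dst]                           # per-edge coefficients alpha_ij
```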
0 votes · 1 answer

Word2Vec Doesn't Contain Embedding for Number 23

Hi, I am in the course of developing an encoder-decoder model with attention which predicts the WTO Panel Report for a given factual relation supplied as Text_Inputs. A sample sentence for the factual relation is as follows: sample_sentence = "On 23 January 1995,…
snapper · 997 · 1 · 12 · 15
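Word2Vec only has vectors for tokens seen during training, so bare numbers like "23" are often missing from the vocabulary. A minimal sketch, assuming a gensim 4.x KeyedVectors object, that falls back to a zero vector (a dedicated <unk> or number token would be the alternative):

```python
import numpy as np

def embed_tokens(tokens, kv):
    """Look up each token in the KeyedVectors kv, substituting a zero
    vector for out-of-vocabulary tokens instead of raising a KeyError."""
    unk = np.zeros(kv.vector_size, dtype=np.float32)
    return np.stack([kv[t] if t in kv else unk for t in tokens])

# usage (model is an assumed trained gensim Word2Vec model):
# vectors = embed_tokens("On 23 January 1995 ,".split(), model.wv)
```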
0 votes · 1 answer

How to perform row-wise or column-wise max pooling in Keras

I am trying to perform row-wise and column-wise max pooling over an attention layer as described in the link below: http://www.dfki.de/~neumann/ML4QAseminar2016/presentations/Attentive-Pooling-Network.pdf (slide 15). I am using a text dataset where a…
Purbasha · 43 · 9
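The slide pools the attention matrix G along its rows and along its columns. A minimal Keras sketch using Lambda layers, with the (n, m) = (40, 50) matrix shape as an assumption:

```python
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Input, Lambda
from tensorflow.keras.models import Model

G = Input(shape=(40, 50), name='attention_matrix')                  # (batch, n, m)
row_max = Lambda(lambda g: K.max(g, axis=2), name='row_max')(G)     # (batch, n)
col_max = Lambda(lambda g: K.max(g, axis=1), name='col_max')(G)     # (batch, m)

model = Model(G, [row_max, col_max])
```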