Questions regarding the attention-model mechanism in deep learning
Questions tagged [attention-model]
389 questions
2
votes
0 answers
Visualizing self-attention weights for a sequence addition problem with LSTM?
I am using the Self Attention layer from here for a simple problem: adding all the numbers in a sequence that come before a delimiter. With training, I expect the neural network to learn which numbers to add, and using the Self Attention layer, I expect to…

sara_iftikhar
- 43
- 3
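One common way to inspect attention weights in Keras is to build a model that also outputs the attention scores. A minimal sketch, using the built-in tf.keras.layers.Attention (TF 2.4+) as an illustrative stand-in for the third-party layer in the question; layer sizes and shapes here are assumptions:

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# toy model: sequence in -> LSTM -> self-attention -> scalar out
inp = layers.Input(shape=(10, 1))
h = layers.LSTM(32, return_sequences=True)(inp)
ctx, scores = layers.Attention()([h, h], return_attention_scores=True)
out = layers.Dense(1)(layers.GlobalAveragePooling1D()(ctx))
model = models.Model(inp, [out, scores])

x = np.random.rand(1, 10, 1)
_, attn = model.predict(x)
print(attn.shape)  # (1, 10, 10): each timestep's weight over the others

The attn array can then be drawn as a heatmap, for example with matplotlib's imshow.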
2
votes
2 answers
assertion failed: [Condition x == y did not hold element-wise:]
I have built a BiLSTM model with an attention layer for a sentence classification task, but I am getting an error that my assertion has failed due to a mismatch in the number of parameters. The attention layer code is here and the error is below the…

PeakyBlinder
- 1,059
- 1
- 14
- 35
2
votes
1 answer
Why is the W_q matrix in torch.nn.MultiheadAttention quadratic?
I am trying to implement nn.MultiheadAttention in my network. According to the docs,
embed_dim – total dimension of the model.
However, according to the source file,
embed_dim must be divisible by num_heads
and
self.q_proj_weight =…

Akim Tsvigun
- 91
- 1
- 8
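The short answer, as far as the PyTorch source shows: with the default kdim == vdim == embed_dim, the q, k, and v projections are each embed_dim x embed_dim (hence quadratic) and are stacked into a single in_proj_weight. A quick check:

import torch.nn as nn

mha = nn.MultiheadAttention(embed_dim=512, num_heads=8)
# q, k, v projection matrices stacked along dim 0: 3 * 512 rows
print(mha.in_proj_weight.shape)  # torch.Size([1536, 512])

embed_dim must be divisible by num_heads because each head operates on an embed_dim // num_heads slice of the projected vectors.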
2
votes
1 answer
PyTorch: getting rid of a for loop when adding a permutation of one vector to the entries of a matrix?
I'm trying to implement this paper and am stuck on this simple step. Although this has to do with attention, what I'm stuck on is just how to add a permutation of a vector to the entries of a matrix without using for loops.
The attention scores…

Tinatim
- 143
- 6
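Assuming the goal is something like out[i][j] = M[i][j] + v[i] + v[j] (an assumption, since the excerpt is truncated), broadcasting removes the loop entirely:

import torch

v = torch.randn(5)
M = torch.randn(5, 5)

# v.unsqueeze(1) is a (5, 1) column, v.unsqueeze(0) a (1, 5) row;
# broadcasting adds v[i] + v[j] to every entry M[i, j] with no Python loop
out = M + v.unsqueeze(1) + v.unsqueeze(0)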
2
votes
1 answer
Luong Style Attention Mechanism with Dot and General scoring functions in keras and tensorflow
I am trying to implement, in Keras, the dot-product and general scoring functions for computing similarity scores from the encoder outputs and the decoder hidden states, respectively.
I have got the idea to do the product of…

Ayush Srivastava
- 444
- 1
- 4
- 13
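For reference, a minimal sketch of both Luong scoring functions in TensorFlow; the tensor names and sizes are assumptions, not the asker's code:

import tensorflow as tf

units = 128
enc_outputs = tf.random.normal((4, 20, units))  # (batch, src_len, units)
dec_hidden = tf.random.normal((4, units))       # (batch, units)
q = dec_hidden[:, tf.newaxis, :]                # (batch, 1, units)

# dot score: score(h_t, h_s) = h_t . h_s
score_dot = tf.matmul(q, enc_outputs, transpose_b=True)           # (batch, 1, src_len)

# general score: score(h_t, h_s) = h_t . (W_a h_s), W_a learned
W_a = tf.keras.layers.Dense(units, use_bias=False)
score_general = tf.matmul(q, W_a(enc_outputs), transpose_b=True)  # (batch, 1, src_len)

weights = tf.nn.softmax(score_dot, axis=-1)     # attention over source timesteps
context = tf.matmul(weights, enc_outputs)       # (batch, 1, units)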
2
votes
1 answer
Unable to save model architecture (BiLSTM + attention)
I am working on a multi-label text classification problem. I am trying to add an attention mechanism to a BiLSTM model. The attention mechanism code is taken from here. I am not able to save the model architecture and am getting the error mentioned below.…

joel
- 1,156
- 3
- 15
- 42
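A frequent cause of this kind of save error is a custom layer without get_config, which Keras needs to serialize the architecture. A hedged sketch of the pattern (the scoring logic here is illustrative, not the code from the linked answer):

import tensorflow as tf
from tensorflow.keras import layers

class Attention(layers.Layer):
    def __init__(self, units=50, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.w = layers.Dense(units)

    def call(self, inputs):
        # illustrative additive-style pooling over timesteps
        score = tf.nn.softmax(tf.reduce_sum(self.w(inputs), axis=-1), axis=-1)
        return tf.reduce_sum(inputs * score[..., tf.newaxis], axis=1)

    def get_config(self):
        # required so model.save() / to_json() can serialize the layer
        config = super().get_config()
        config.update({"units": self.units})
        return config

# when loading, register the custom class:
# tf.keras.models.load_model("model.h5", custom_objects={"Attention": Attention})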
2
votes
2 answers
Unable to import AttentionLayer in Keras (TF1.13)
I'm trying to import an attention layer for my encoder-decoder model, but it gives an error.
from keras.layers import AttentionLayer
or
from keras.layers import Attention
following is the error
cannot import name 'AttentionLayer' from…

Crossfit_Jesus
- 53
- 4
- 18
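For what it's worth, keras.layers has never exported a class named AttentionLayer. The built-in layers are Attention and AdditiveAttention in tf.keras, which, if memory serves, are not present as far back as TF 1.13; on TF 2.x the import is:

# no AttentionLayer exists in keras.layers; in TF 2.x use:
from tensorflow.keras.layers import Attention, AdditiveAttention

Under TF 1.13 the usual options are upgrading TensorFlow or writing a small custom attention layer.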
2
votes
1 answer
LSTM + Attention Implementation with undefined timestep shape
I'm trying to implement a stacked LSTM with attention with varying timesteps. I mainly based it on this, this, and this. These implementations, however, assume fixed timesteps. The model runs, but I'm not sure if this is doing what I think…

LogCapy
- 447
- 7
- 20
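Keras handles an undefined timestep dimension if the input shape uses None for time. A minimal sketch (layer sizes assumed) of stacked LSTMs feeding the built-in attention layer:

import tensorflow as tf
from tensorflow.keras import layers, models

inp = layers.Input(shape=(None, 16))            # None: timesteps left undefined
x = layers.LSTM(32, return_sequences=True)(inp)
x = layers.LSTM(32, return_sequences=True)(x)   # stacked LSTMs
ctx = layers.Attention()([x, x])                # self-attention over dynamic length
out = layers.Dense(1)(layers.GlobalAveragePooling1D()(ctx))
model = models.Model(inp, out)
model.summary()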
2
votes
0 answers
Max Sequence length in Seq2Seq - Attention is all you need
I have gone through the paper Attention Is All You Need, and though I think I understand the overall idea of what is happening, I am pretty confused by the way the input is processed. Here are my doubts; for simplicity, let's assume…

Kakarot
- 175
- 1
- 3
- 10
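On the input-processing point: in practice each batch is padded to its longest sequence, and a padding mask stops attention from attending to pad positions. A small sketch of building such a mask in PyTorch (the token values are made up):

import torch

PAD = 0
batch = torch.tensor([[5, 7, 2, PAD, PAD],
                      [3, 9, 4, 6, 1]])
pad_mask = batch.eq(PAD)  # True where padded; shape (batch, seq_len)
# torch.nn.Transformer accepts this as src_key_padding_mask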
2
votes
1 answer
Why is the input size of the MultiheadAttention in Pytorch Transformer module 1536?
When using the torch.nn.modules.transformer.Transformer module/object, the first layer is the encoder.layers.0.self_attn layer, which is a MultiheadAttention layer, i.e.
from torch.nn.modules.transformer import Transformer
bumblebee =…

alvas
- 115,346
- 109
- 446
- 738
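The 1536 comes from the Transformer default d_model = 512: the q, k, and v projections are stacked into a single in_proj_weight, so its first dimension is 3 * 512. A quick check:

from torch.nn.modules.transformer import Transformer

bumblebee = Transformer()  # d_model defaults to 512
w = bumblebee.encoder.layers[0].self_attn.in_proj_weight
print(w.shape)  # torch.Size([1536, 512]): q, k, v stacked, 3 * 512 = 1536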
2
votes
1 answer
How to use loaded LSTM attention model to make predictions on input?
I am a complete beginner in Deep Learning & Keras. I want to build a hierarchical attention network that helps to classify comments into several categories viz. toxic, severely toxic, etc. I took the code from an open repository and saved the model.…

Code231
- 21
- 1
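A hedged outline of the usual prediction flow for a saved Keras model with a custom attention layer; AttLayer, the file name, and maxlen below are placeholders, not details from the repository:

import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# custom layers must be registered when loading ("AttLayer" is hypothetical)
model = tf.keras.models.load_model("han_model.h5",
                                   custom_objects={"AttLayer": AttLayer})

tokenizer = Tokenizer(num_words=20000)     # in practice: reuse the training tokenizer
tokenizer.fit_on_texts(["example corpus"])
seq = tokenizer.texts_to_sequences(["your comment text here"])
x = pad_sequences(seq, maxlen=200)         # same maxlen as at training time
print(model.predict(x))                    # one probability per toxicity label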
2
votes
0 answers
Tensorflow @tf.function: AttributeError: in converted code
I created a class and defined the train_step function inside it: TF tutorial: NMT_attention
Not using @tf.function significantly increases the training time. When I define it, I get a conversion error for the private variables declared inside…

Hackerds
- 1,195
- 2
- 16
- 34
2
votes
1 answer
How to add an attention layer (along with a Bi-LSTM layer) in a Keras sequential model?
I am trying to find an easy way to add an attention layer to a Keras sequential model. However, I have run into a lot of problems achieving that.
I am a novice at deep learning, so I chose Keras as my starting point. My task is to build a Bi-LSTM with attention…

denglizong
- 21
- 1
- 3
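Because the built-in attention layer takes a list of inputs, the functional API is the usual route rather than Sequential. A minimal sketch (vocabulary size, lengths, and units are assumptions):

import tensorflow as tf
from tensorflow.keras import layers, models

inp = layers.Input(shape=(100,))
x = layers.Embedding(20000, 128)(inp)
x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)
ctx = layers.Attention()([x, x])                # dot-product self-attention
x = layers.GlobalAveragePooling1D()(ctx)
out = layers.Dense(1, activation="sigmoid")(x)
model = models.Model(inp, out)
model.compile(optimizer="adam", loss="binary_crossentropy")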
2
votes
1 answer
Combining CNN with attention network
Here is my attention layer
class Attention(Layer):
    def __init__(self, **kwargs):
        self.init = initializers.get('normal')
        self.supports_masking = True
        self.attention_dim = 50
        super(Attention,…

Pratik.S
- 53
- 6
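One way to combine the two, sketched under assumptions (the attention here is the built-in Keras layer, not the asker's custom class): Conv1D outputs (batch, steps, filters), the same rank an attention layer expects from an RNN, so the convolutional feature steps can be attended over directly.

import tensorflow as tf
from tensorflow.keras import layers, models

inp = layers.Input(shape=(100,))
x = layers.Embedding(20000, 128)(inp)
x = layers.Conv1D(64, 5, padding="same", activation="relu")(x)  # (batch, 100, 64)
ctx = layers.Attention()([x, x])       # attend over convolutional feature steps
x = layers.GlobalMaxPooling1D()(ctx)
out = layers.Dense(1, activation="sigmoid")(x)
model = models.Model(inp, out)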
2
votes
0 answers
PyTorch runtime error: expected argument to have type long, but got CPUType instead
I'm new to PyTorch and going through this tutorial on the transformer model. I'm using PyCharm on Win10.
For now, I've basically just copy-pasted the example code, but I'm getting the following error:
RuntimeError: Expected tensor for argument #1…

SmthgScnng
- 21
- 1
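This error usually means integer indices (for nn.Embedding or similar index-taking ops) were passed with the wrong dtype; casting with .long() fixes it. A minimal reproduction and fix:

import torch
import torch.nn as nn

emb = nn.Embedding(num_embeddings=1000, embedding_dim=32)
idx = torch.tensor([[1, 2, 3]], dtype=torch.float32)  # emb(idx) would raise the Long error
out = emb(idx.long())   # embedding indices must be int64 (torch.long)
print(out.shape)        # torch.Size([1, 3, 32])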