Questions regarding the attention mechanism in deep learning
Questions tagged [attention-model]
389 questions
2
votes
1 answer
Some parameters are not getting saved when saving a model in pytorch
I have built an encoder-decoder model with attention for morph inflection generation. I am able to train the model and predict on test data, but I am getting wrong predictions after loading a saved model.
I am not getting any error during saving or…

Umang Jain
- 21
- 5
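A frequent cause of this symptom, offered here only as a hedged sketch rather than a diagnosis of the asker's model: tensors created as plain attributes in __init__ never appear in state_dict(), so after loading they silently keep a fresh random initialization. Only nn.Parameter objects, registered buffers, and submodules are saved.

import torch
import torch.nn as nn

class ToyAttention(nn.Module):
    """Toy module showing which tensors are included in state_dict()."""
    def __init__(self, hidden_size):
        super().__init__()
        # nn.Parameter: saved in state_dict() and updated by the optimizer.
        self.attn_weights = nn.Parameter(torch.randn(hidden_size, hidden_size))
        # A plain attribute such as `self.scale = torch.tensor(1.0)` would NOT
        # be saved; register non-trainable tensors as buffers instead.
        self.register_buffer("scale", torch.tensor(1.0))

    def forward(self, x):
        return (x @ self.attn_weights) * self.scale

model = ToyAttention(8)
torch.save(model.state_dict(), "model.pt")

restored = ToyAttention(8)
restored.load_state_dict(torch.load("model.pt"))  # strict=True flags missing/unexpected keys
restored.eval()  # also call eval() before predicting, to disable dropout etc.

Comparing model.state_dict().keys() against the parameters you expect to be persisted is a quick way to spot which ones are being lost.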
2
votes
1 answer
Attention layer output shape issue
I have been using BiLSTMs to classify each word in sentences, and my input has shape (n_sentences, max_sequence_length, classes). Recently, I have been trying to use this attention layer:…

D. Clem
- 85
- 1
- 6
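Shape problems with attention on top of a BiLSTM usually come down to whether the attention layer collapses the time axis into one context vector or keeps one vector per word. A minimal sketch that preserves the time axis for per-word classification, assuming TensorFlow's built-in layers.Attention and hypothetical dimensions:

import tensorflow as tf
from tensorflow.keras import layers

max_len, vocab_size, n_classes, emb_dim = 50, 10000, 5, 100  # hypothetical sizes

inputs = layers.Input(shape=(max_len,))
x = layers.Embedding(vocab_size, emb_dim, mask_zero=True)(inputs)
h = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)   # (batch, max_len, 128)

# Dot-product self-attention over the BiLSTM states: passing [h, h] keeps the
# query (time) axis, so the output stays (batch, max_len, 128).
context = layers.Attention()([h, h])

# One prediction per word, as required for per-word tagging.
outputs = layers.TimeDistributed(layers.Dense(n_classes, activation="softmax"))(context)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()

If the attention layer being used instead returns a single (batch, features) vector, it is doing sequence-level pooling and cannot feed a per-word classifier without reworking it.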
2
votes
0 answers
Graph disconnect in inference in Keras RNN + Encoder/Decoder + Attention
I've successfully trained a model in Keras using an encoder/decoder structure + attention + GloVe, following several examples, most notably this one and this one. It's based on a modified machine-translation setup. This is a chatbot, so the input is…

a1orona
- 21
- 2
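As a general note (a toy sketch of the principle, not the chatbot model itself): a "Graph disconnected" error when building the inference models usually means some tensor in the new Model traces back to the training graph's inputs instead of the inference graph's own Input layers. Shared layers should be re-called on fresh Inputs; only the weights are reused.

import tensorflow as tf
from tensorflow.keras import layers, Model

# A layer that is trained as part of the full training-time model.
shared_dense = layers.Dense(8)

train_in = layers.Input(shape=(4,))
train_out = shared_dense(train_in)
train_model = Model(train_in, train_out)

# WRONG: building an inference Model that mixes `train_out` (which depends on
# `train_in`) with a new Input raises "Graph disconnected".
# RIGHT: give the inference model its own Input and call the shared layer again.
infer_in = layers.Input(shape=(4,))
infer_out = shared_dense(infer_in)   # same trained weights, new graph
infer_model = Model(infer_in, infer_out)
infer_model.summary()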
2
votes
2 answers
Is the attention mechanism really attention, or just looking back at memory again?
When reading about the attention mechanism, I am confused by the term "attention". Is it the same as attention in its usual, human sense?

Giang Nguyen
- 450
- 8
- 17
2
votes
2 answers
What is used to train a self-attention mechanism?
I've been trying to understand self-attention, but nothing I have found explains the concept well at a high level.
Let's say we use self-attention in an NLP task, so our input is a sentence.
Then self-attention can be used to measure how…
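In short, nothing extra is needed to train self-attention: the query/key/value projection matrices are the trainable pieces, and they are learned by backpropagating whatever task loss sits on top (e.g. cross-entropy for classification or translation). A minimal single-head sketch in PyTorch:

import math
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    """Single-head self-attention; the only trainable parts are w_q, w_k, w_v."""
    def __init__(self, d_model):
        super().__init__()
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)

    def forward(self, x):                          # x: (batch, seq_len, d_model)
        q, k, v = self.w_q(x), self.w_k(x), self.w_v(x)
        scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
        weights = scores.softmax(dim=-1)           # how much each word attends to every other word
        return weights @ v                         # weighted sum of value vectors

x = torch.randn(2, 7, 32)                          # e.g. 2 sentences, 7 tokens, 32-dim embeddings
out = SelfAttention(32)(x)
print(out.shape)                                   # torch.Size([2, 7, 32])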
2
votes
4 answers
Keras: How to display attention weights in LSTM model
I made a text classification model using an LSTM with an attention layer. The model trains and works well, but I can't display the attention weights, i.e. the importance/attention of each word in a review (the input text).
The code used for this model…

Okorimi Manoury
- 114
- 1
- 11
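One common way to get at the weights (a sketch with a hypothetical additive-attention layer, not the asker's exact code): have the attention layer return its per-word weights alongside the context vector, and build a second Model that shares the same layers but outputs the weights.

import tensorflow as tf
from tensorflow.keras import layers, Model

class AttentionWithWeights(layers.Layer):
    """Additive attention that also exposes its per-word weights."""
    def build(self, input_shape):
        d = int(input_shape[-1])
        self.W = self.add_weight(name="W", shape=(d, d), initializer="glorot_uniform")
        self.u = self.add_weight(name="u", shape=(d, 1), initializer="glorot_uniform")

    def call(self, h):                               # h: (batch, time, d)
        scores = tf.matmul(tf.tanh(tf.matmul(h, self.W)), self.u)   # (batch, time, 1)
        alpha = tf.nn.softmax(scores, axis=1)        # one weight per word
        context = tf.reduce_sum(alpha * h, axis=1)   # (batch, d)
        return context, tf.squeeze(alpha, -1)

inp = layers.Input(shape=(100,))
emb = layers.Embedding(20000, 128)(inp)
h = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(emb)
context, alpha = AttentionWithWeights()(h)
out = layers.Dense(1, activation="sigmoid")(context)

model = Model(inp, out)         # train this one as usual
attn_model = Model(inp, alpha)  # shares the trained weights, outputs attention per word

After training, attn_model.predict(x) returns one weight per input token, which can then be aligned with the tokenized review for visualization (e.g. as a heatmap over the words).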
2
votes
1 answer
How to get attention weights in hierarchical model
Model :
sequence_input = Input(shape=(MAX_SENT_LENGTH,), dtype='int32')
words = embedding_layer(sequence_input)
h_words = Bidirectional(GRU(200, return_sequences=True, dropout=0.2, recurrent_dropout=0.2))(words)
sentence = Attention()(h_words)  # with…

Rohit Saxena
- 31
- 4
2
votes
1 answer
Attention on top of LSTM Keras
I was training an LSTM model using Keras and wanted to add attention on top of it. I am new to Keras and to attention. From the question "How to add an attention mechanism in keras?" I learnt how to add attention over my LSTM layer and built a model like…

hiteshn97
- 90
- 2
- 9
2
votes
1 answer
How to use previous output and hidden states from LSTM for the attention mechanism?
I am currently trying to code the attention mechanism from this paper: "Effective Approaches to Attention-based Neural Machine Translation", Luong, Pham, Manning (2015). (I use global attention with the dot score).
However, I am unsure how to…

Tom
- 275
- 2
- 16
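For reference, here is a hedged sketch of one decoding step of Luong-style global attention with the dot score, using the current decoder hidden state and the encoder outputs; the variable names and shapes are assumptions, not the asker's code:

import torch
import torch.nn as nn
import torch.nn.functional as F

class LuongDotAttention(nn.Module):
    """One step of Luong et al. (2015) global attention with the dot score."""
    def __init__(self, hidden_size):
        super().__init__()
        self.w_c = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, decoder_hidden, encoder_outputs):
        # decoder_hidden: (batch, hidden) = h_t; encoder_outputs: (batch, src_len, hidden)
        scores = torch.bmm(encoder_outputs, decoder_hidden.unsqueeze(2)).squeeze(2)  # dot score, (batch, src_len)
        align = F.softmax(scores, dim=1)                                             # alignment vector a_t
        context = torch.bmm(align.unsqueeze(1), encoder_outputs).squeeze(1)          # context c_t, (batch, hidden)
        # attentional hidden state: h~_t = tanh(W_c [c_t; h_t])
        attn_hidden = torch.tanh(self.w_c(torch.cat([context, decoder_hidden], dim=1)))
        return attn_hidden, align

enc_out = torch.randn(4, 12, 256)   # 4 sentences, 12 source positions
h_t = torch.randn(4, 256)           # decoder hidden state at the current step
attn_h, a = LuongDotAttention(256)(h_t, enc_out)

In the paper, h~_t then feeds the output softmax and, with input feeding, is concatenated to the next decoder input.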
2
votes
0 answers
Using Attention-OCR model (tensorflow/research) for extracting specific information from scanned documents
I have a few questions regarding the Attention-OCR model described in this paper: https://arxiv.org/pdf/1704.03549.pdf
Some context
My goal is to let Attention-OCR learn where to look for, and how to read, a specific piece of information in a scanned document. It…

Filip Dziuba
- 21
- 1
- 6
2
votes
1 answer
Multiple issues with axes while implementing a Seq2Seq with attention in CNTK
I'm trying to implement a Seq2Seq model with attention in CNTK, something very similar to CNTK Tutorial 204. However, several small differences lead to various issues and error messages, which I don't understand. There are many questions here, which…

Skiminok
- 2,801
- 1
- 24
- 29
2
votes
0 answers
Keras ValueError: expected ndim=3, found ndim=4
In my Keras model, I am using a TimeDistributed wrapper, but I keep getting a shape mismatch error. Here are the layers:
r_input = Input(shape=(100,), dtype='int32')
embedded_sequences = embedding_layer(r_input)
r_lstm = Bidirectional(GRU(100,…

bear
- 663
- 1
- 14
- 33
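The usual cause of "expected ndim=3, found ndim=4" in hierarchical setups is that a recurrent layer receives the extra sentence axis directly. A hedged sketch of the standard pattern, with hypothetical dimensions: encode words inside a sub-model and let TimeDistributed apply that whole sub-model per sentence, so no recurrent layer ever sees a 4-D tensor.

import tensorflow as tf
from tensorflow.keras import layers, Model

max_sents, max_words, vocab, emb_dim = 15, 100, 20000, 128  # hypothetical sizes

# Word-level encoder: operates on one sentence, so its tensors stay 3-D.
word_in = layers.Input(shape=(max_words,), dtype="int32")
w = layers.Embedding(vocab, emb_dim)(word_in)
w = layers.Bidirectional(layers.GRU(100))(w)          # (batch, 200)
word_encoder = Model(word_in, w)

# Document-level model: TimeDistributed applies the whole word encoder to each
# sentence slice, so the outer GRU receives (batch, max_sents, 200), i.e. ndim=3.
doc_in = layers.Input(shape=(max_sents, max_words), dtype="int32")
s = layers.TimeDistributed(word_encoder)(doc_in)
s = layers.Bidirectional(layers.GRU(100))(s)
out = layers.Dense(1, activation="sigmoid")(s)
doc_model = Model(doc_in, out)
doc_model.summary()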
1
vote
1 answer
How to replace this naive code with scaled_dot_product_attention() in Pytorch?
Consider a code fragment from Crossformer:
def forward(self, queries, keys, values):
    B, L, H, E = queries.shape
    _, S, _, D = values.shape
    scale = self.scale or 1. / sqrt(E)
    scores = torch.einsum("blhe,bshe->bhls", queries, keys)
    A…

Serge Rogatch
- 13,865
- 7
- 86
- 158
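Assuming the fragment's layout (queries (B, L, H, E), keys/values (B, S, H, E/D)), scaling by 1/sqrt(E), and a softmax over the key axis, the einsum pair can be replaced by one fused call; note that F.scaled_dot_product_attention does not return the attention matrix, so this only works if nothing downstream needs A explicitly. A sketch:

import torch
import torch.nn.functional as F

def forward_sdpa(queries, keys, values):
    # (B, L, H, E) -> (B, H, L, E): scaled_dot_product_attention expects the
    # head axis before the sequence axis.
    q = queries.permute(0, 2, 1, 3)
    k = keys.permute(0, 2, 1, 3)
    v = values.permute(0, 2, 1, 3)
    # Default scale is 1/sqrt(E), matching 1./sqrt(E) above; attention dropout
    # can be added via the dropout_p argument if the original module uses it.
    out = F.scaled_dot_product_attention(q, k, v)
    return out.permute(0, 2, 1, 3)      # back to (B, L, H, D)

q = torch.randn(2, 10, 4, 16)
k = torch.randn(2, 12, 4, 16)
v = torch.randn(2, 12, 4, 16)
print(forward_sdpa(q, k, v).shape)      # torch.Size([2, 10, 4, 16])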
1
vote
0 answers
Hugging Face translation model cross attention layers problem, inconsistent with research
When inspecting the cross-attention layers of the pretrained transformer translation model (MarianMT), it is very strange that the cross-attention from layers 0 and 1 provides the best alignment between input and output. I used bertviz to…

Ayw
- 11
- 3
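For a systematic layer-by-layer comparison (a sketch with an assumed checkpoint name, offered only as an inspection aid, not an explanation of the observation), the raw cross-attention tensors can also be pulled from generate() directly instead of going through bertviz:

import torch
from transformers import MarianMTModel, MarianTokenizer

name = "Helsinki-NLP/opus-mt-en-de"   # assumed checkpoint, for illustration only
tok = MarianTokenizer.from_pretrained(name)
model = MarianMTModel.from_pretrained(name)

src = tok("The cat sat on the mat.", return_tensors="pt")
with torch.no_grad():
    gen = model.generate(**src, max_new_tokens=20,
                         output_attentions=True,
                         return_dict_in_generate=True)

# gen.cross_attentions has one entry per generated token; each entry is a tuple
# over decoder layers of tensors shaped (batch, heads, 1, src_len).
first_step = gen.cross_attentions[0]
for layer_idx, attn in enumerate(first_step):
    print(f"decoder layer {layer_idx}: {tuple(attn.shape)}")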
1
vote
0 answers
Questions about masks of padding in GPT
The GPT series of models uses the Transformer decoder with unidirectional attention. In the Hugging Face source code for GPT, masked attention is implemented as follows:
self.register_buffer(
    "bias",
    …

LocustNymph
- 11
- 3
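For context, the registered bias buffer only encodes the causal (unidirectional) constraint; padding is handled separately by the attention_mask supplied at call time, and the two are combined before the softmax. A small sketch of that combination, with a hypothetical batch:

import torch

seq_len = 6
# Causal part, as registered in the buffer: position i may attend to j <= i.
causal = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))

# Padding part, from the tokenizer's attention_mask (1 = real token, 0 = pad).
attention_mask = torch.tensor([[1, 1, 1, 1, 0, 0]], dtype=torch.bool)  # hypothetical batch of 1

# Combined: a query may attend to a key only if the key is not in the future
# AND is not a padding token; disallowed positions get -inf before the softmax.
combined = causal.unsqueeze(0) & attention_mask[:, None, :]
print(combined.int())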