Questions tagged [attention-model]

Questions about the attention mechanism in deep learning models

389 questions
0 votes, 1 answer

Why does the output of an attention decoder need to be combined with the attention context?

legacy_seq2seq in TensorFlow: x = linear([inp] + attns, input_size, True) # Run the RNN. cell_output, state = cell(x, state) # Run the attention mechanism. if i == 0 and initial_state_attention: with…
Yutao ZHU · 88
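For context, here is a minimal NumPy sketch of the combination step this question is about: in attention decoders of this style, the per-step output is a projection of the cell output concatenated with the attention context, not the cell output alone. The shapes and the linear helper below are illustrative, not the legacy_seq2seq code itself.

```python
import numpy as np

# Hypothetical sizes for illustration only.
hidden_size, attn_size = 4, 4

def linear(inputs, W):
    # Concatenate inputs along the feature axis and apply one dense layer,
    # mirroring the role of linear([...]) in legacy_seq2seq-style decoders.
    return np.concatenate(inputs, axis=-1) @ W

rng = np.random.default_rng(0)
cell_output = rng.normal(size=(1, hidden_size))   # RNN cell output at step i
attn_context = rng.normal(size=(1, attn_size))    # attention context for step i
W_out = rng.normal(size=(hidden_size + attn_size, hidden_size))

# The emitted decoder output projects BOTH the cell output and the
# attention context, so the prediction can use what was attended to.
step_output = linear([cell_output, attn_context], W_out)
print(step_output.shape)  # (1, 4)
```
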
0 votes, 1 answer

Tensorflow attention OCR inference code

I am trying to run the attention OCR model in the TensorFlow models repository, https://github.com/tensorflow/models/tree/master/attention_ocr. I can find the scripts for training and evaluating on the FSNS dataset, but they do not provide code to run inference on a…
0 votes, 1 answer

Training Method Choice for seq2seq model

What training method would you recommend for an attention-based sequence-to-sequence neural machine translation model: SGD, Adadelta, Adam, or something better? Please give some advice, thanks.
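As a point of reference, Adam with gradient clipping is a common default for attention-based seq2seq NMT; SGD with learning-rate decay and Adadelta are also used. The sketch below only shows the optimizer configuration in Keras; the model object and data pipeline are assumed to exist elsewhere and are hypothetical.

```python
import tensorflow as tf

# Illustrative defaults: Adam plus gradient clipping to keep recurrent
# gradients from exploding on long sequences.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3, clipnorm=5.0)
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# model.compile(optimizer=optimizer, loss=loss)   # hypothetical model object
```
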
0 votes, 1 answer

Multiply a matrix with another matrix of a different shape in the Keras backend

I'm trying to implement an attention model based on this model, but I don't want it to look at just one frame to decide the attention for that frame; I want a model that looks at each frame with respect to the whole sequence. So what I'm doing…
m1sk · 308
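One common way to score each frame against the whole sequence in the Keras backend is batched matrix multiplication with K.batch_dot; the shapes below are illustrative, and this is a sketch of the pattern rather than the asker's model.

```python
import tensorflow as tf
from tensorflow.keras import backend as K

# Hypothetical shapes: a batch of 2 sequences, 10 frames, 64-dim features.
batch, timesteps, dim = 2, 10, 64
frames = tf.random.normal((batch, timesteps, dim))

# Score every frame against every other frame in the same sequence:
# (batch, T, dim) x (batch, dim, T) -> (batch, T, T)
scores = K.batch_dot(frames, K.permute_dimensions(frames, (0, 2, 1)))
weights = K.softmax(scores, axis=-1)

# Each frame's new representation is a weighted sum over the whole sequence.
attended = K.batch_dot(weights, frames)   # (batch, T, dim)
print(attended.shape)
```
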
0 votes, 1 answer

Extracting attention matrix with TensorFlow's seq2seq example code during decoding

It seems like the attention() method used to compute the attention mask in seq2seq_model.py, from TensorFlow's example sequence-to-sequence code, is not called during decoding. Does anyone know how to resolve this? A similar…
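The usual workaround, independent of that specific codebase, is to have the step function return its attention weights and collect them while decoding instead of recomputing them afterwards. The NumPy sketch below uses a toy attend() stand-in for the model's attention call, with illustrative sizes.

```python
import numpy as np

def attend(query, encoder_states):
    # Toy dot-product attention: score, normalize, take a weighted sum.
    scores = encoder_states @ query                    # (src_len,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    context = weights @ encoder_states                 # (hidden,)
    return context, weights

rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(7, 16))              # 7 source positions, hidden=16
state = rng.normal(size=16)

collected = []
for step in range(5):                                  # 5 decode steps (illustrative)
    context, weights = attend(state, encoder_states)
    collected.append(weights)                          # keep the weights per step
    state = np.tanh(state + context)                   # toy state update

attention_matrix = np.stack(collected)                 # (target_len, src_len)
print(attention_matrix.shape)
```
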
-1 votes, 0 answers

Simple RNN with Attention mechanism vs LSTM without attention mechanism

Simple RNN with an attention mechanism vs. LSTM without an attention mechanism: which will perform better? In general, it's difficult to definitively say whether a simple RNN with an attention mechanism or an LSTM without an attention mechanism will perform…
-1 votes, 0 answers

Calculating Sentence Level Attention

How do I quantify the attention between input and output sentences in a sequence-to-sequence language modelling scenario (translation or summarization)? For instance, consider these input and output statements, i.e., the document is the input, and…
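One straightforward approach, sketched below with made-up sizes, is to start from the token-level attention matrix and aggregate it over token-to-sentence assignments; averaging within each sentence pair is shown here, and summing is another common choice. The arrays and mappings are illustrative, not taken from the question.

```python
import numpy as np

# Hypothetical token-level attention: rows = output tokens, cols = input tokens.
attn = np.random.dirichlet(np.ones(6), size=4)            # (4 output tokens, 6 input tokens)
in_sent_of_token = np.array([0, 0, 0, 1, 1, 1])            # input sentence id per input token
out_sent_of_token = np.array([0, 0, 1, 1])                 # output sentence id per output token

n_out_sents, n_in_sents = 2, 2
sent_attn = np.zeros((n_out_sents, n_in_sents))
for o in range(n_out_sents):
    for i in range(n_in_sents):
        # Take the block of token-to-token weights for this sentence pair
        # and average it into one sentence-level attention score.
        block = attn[np.ix_(out_sent_of_token == o, in_sent_of_token == i)]
        sent_attn[o, i] = block.mean()

print(sent_attn)   # (output sentences, input sentences)
```
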
-1 votes, 0 answers

How to use an attention mechanism to learn four weights

I am a beginner in graph neural networks and I want to use an attention mechanism to learn weights for four results, so that they can be weighted and summed to obtain the final result. I expect to implement an attention class that learns four weights and computes a weighted…
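A minimal sketch of this idea in PyTorch: score each of the four result vectors, softmax the scores into four weights that sum to one, and return the weighted sum. Dimensions, names, and the scoring layer are assumptions for illustration.

```python
import torch
import torch.nn as nn

class FourWayAttention(nn.Module):
    """Score four candidate result vectors, softmax the scores into four
    weights, and return their weighted sum plus the weights themselves."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)   # one scalar score per result vector

    def forward(self, results):                       # results: (batch, 4, dim)
        scores = self.score(results)                  # (batch, 4, 1)
        weights = torch.softmax(scores, dim=1)        # sums to 1 over the 4 results
        fused = (weights * results).sum(dim=1)        # (batch, dim)
        return fused, weights.squeeze(-1)

# Usage with hypothetical shapes.
fuse = FourWayAttention(dim=16)
out, w = fuse(torch.randn(8, 4, 16))
print(out.shape, w.shape)   # torch.Size([8, 16]) torch.Size([8, 4])
```
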
-1 votes, 1 answer

Defining the dimensions of NMT and image captioning models with attention in the decoder

I have been checking out the models with attention in the tutorials below: https://www.tensorflow.org/tutorials/text/nmt_with_attention and https://www.tensorflow.org/tutorials/text/image_captioning. In both tutorials, I do not understand the defining…
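To make the dimensions concrete, here is a Bahdanau-style attention layer with explicit shape comments, roughly following the pattern those tutorials use; the layer names and sizes are illustrative, not the tutorials' exact code.

```python
import tensorflow as tf

class BahdanauAttention(tf.keras.layers.Layer):
    def __init__(self, units):
        super().__init__()
        self.W1 = tf.keras.layers.Dense(units)   # projects the decoder query
        self.W2 = tf.keras.layers.Dense(units)   # projects the encoder outputs
        self.V = tf.keras.layers.Dense(1)        # collapses to one score per position

    def call(self, query, values):
        # query:  (batch, hidden)          decoder state at one step
        # values: (batch, src_len, hidden) encoder outputs / image features
        query_with_time = tf.expand_dims(query, 1)                               # (batch, 1, hidden)
        score = self.V(tf.nn.tanh(self.W1(query_with_time) + self.W2(values)))   # (batch, src_len, 1)
        weights = tf.nn.softmax(score, axis=1)                                   # (batch, src_len, 1)
        context = tf.reduce_sum(weights * values, axis=1)                        # (batch, hidden)
        return context, weights
```
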
-1 votes, 1 answer

Adding softmax significantly changes weight updates

I have a neural network of the form N = W1 * Tanh(W2 * I), where I is the input vector/matrix. When I learn these weights, the output has a certain form. However, when I add a normalization layer, for example, N' = Softmax( W1 * Tanh(W2 * I) )…
Rumu · 403
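One way to see why the updates change, stated in general terms rather than as an analysis of this specific network: softmax couples every output to every logit, so the gradient reaching W1 is rescaled and mixed across components instead of flowing back independently. With s = softmax(z):

```latex
\frac{\partial s_i}{\partial z_j} = s_i\,\left(\delta_{ij} - s_j\right)
```

The diagonal terms s_i(1 - s_i) are at most 1/4 and shrink as the softmax saturates, so gradients through the normalization are both damped and redistributed relative to the unnormalized network.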
-2 votes, 1 answer

Feeding an image to stacked resnet blocks to create an embedding

Do you have any code example or paper that refers to something like the following diagram? I want to know why we stack multiple ResNet blocks, as opposed to multiple convolutional blocks as in more traditional architectures. Any code sample…
Mona Jalal · 34,860
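For reference, a minimal PyTorch sketch of the general pattern: identity shortcuts let many blocks be stacked without the optimization problems of an equally deep plain convolutional stack, and pooling the final feature map gives a fixed-size embedding. Channel counts and block depth below are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """One residual block: the input skips around two conv layers,
    so stacking many blocks stays easy to optimize."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return torch.relu(x + self.body(x))   # identity shortcut + residual

# Stack several blocks, then pool to a fixed-size embedding.
encoder = nn.Sequential(
    nn.Conv2d(3, 64, 7, stride=2, padding=3),
    *[ResBlock(64) for _ in range(4)],
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),                              # -> (batch, 64) embedding
)
print(encoder(torch.randn(2, 3, 224, 224)).shape)   # torch.Size([2, 64])
```
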
-2 votes, 1 answer

Getting CUDA out of memory while running the Longformer model in Google Colab; similar code using BERT works fine

I am working on text classification using the Longformer model. Even with just the first 100 rows of the dataframe I get a memory error. I am using Google Colab. This is my model: model =…
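Longformer's memory footprint grows with sequence length (its default window is 4096 tokens, versus 512 for BERT), so the usual levers on Colab-sized GPUs are shorter inputs, smaller batches, gradient accumulation, and mixed precision. The values below are illustrative, not tuned for the asker's dataset, and fp16 assumes a GPU runtime.

```python
from transformers import TrainingArguments

# Common memory-saving settings when fine-tuning Longformer on a small GPU.
args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,     # smallest possible per-step batch
    gradient_accumulation_steps=8,     # keep the effective batch size at 8
    fp16=True,                         # mixed precision roughly halves activation memory
)

# Also tokenize with a shorter window than the 4096-token default, e.g.:
# tokenizer(texts, truncation=True, max_length=1024, padding="max_length")
```
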
-3 votes, 0 answers

Methods for Programmatically Generating 'Attention Is All You Need' Diagrams

Is there a way to create nodes that overlap with each other to show that there is a "stack" of that type, like in the "Attention Is All You Need" paper (maybe using Mermaid), or any other code-based method? An example: If this is not possible…