Questions regarding the attention mechanism in deep learning models
Questions tagged [attention-model]
389 questions
5
votes
0 answers
Retrieving attention weights for sentences? Most attentive sentences are zero vectors
I have a document classification task that classifies documents as good (1) or bad (0), and I use sentence embeddings for each document to classify the documents accordingly.
What I'd like to do is retrieve the attention scores for each…

Felix
- 313
- 1
- 3
- 22
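A minimal sketch of one way to expose per-sentence attention scores for a question like the one above, assuming a functional tf.keras model in which a learned query attends over precomputed sentence embeddings; the layer name "sentence_attention" and all shapes are hypothetical:

import tensorflow as tf

num_sentences, emb_dim = 30, 384          # hypothetical document size / embedding size
sent_emb = tf.keras.Input(shape=(num_sentences, emb_dim), name="sentence_embeddings")

# Learned query vector that attends over the sentence embeddings.
query = tf.keras.layers.Dense(emb_dim)(tf.keras.layers.GlobalAveragePooling1D()(sent_emb))
query = tf.keras.layers.Reshape((1, emb_dim))(query)

attn = tf.keras.layers.Attention(name="sentence_attention")
context, scores = attn([query, sent_emb], return_attention_scores=True)

doc_vec = tf.keras.layers.Flatten()(context)
output = tf.keras.layers.Dense(1, activation="sigmoid")(doc_vec)

model = tf.keras.Model(sent_emb, output)

# A second model that shares the same layers but outputs the attention scores,
# so the per-sentence weights can be inspected after training.
score_model = tf.keras.Model(sent_emb, scores)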
5
votes
2 answers
Why use multi-headed attention in Transformers?
I am trying to understand why transformers use multiple attention heads. I found the following quote:
Instead of using a single attention function where the attention can be dominated by the actual word itself, transformers use multiple attention…

SomeDutchGuy
- 2,249
- 4
- 16
- 42
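A small sketch that makes the "multiple heads" idea from the question above concrete with tf.keras.layers.MultiHeadAttention: each head learns its own attention distribution, so no single pattern (such as a token attending mostly to itself) has to dominate. The shapes are arbitrary examples:

import tensorflow as tf

x = tf.random.normal((2, 5, 32))                 # (batch, seq_len, model_dim), arbitrary

mha = tf.keras.layers.MultiHeadAttention(num_heads=4, key_dim=8)
out, attn = mha(query=x, value=x, return_attention_scores=True)

print(out.shape)    # (2, 5, 32)  - same model dimension as the input
print(attn.shape)   # (2, 4, 5, 5) - one 5x5 attention matrix per head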
5
votes
2 answers
Is there any way to convert a PyTorch tensor to a TensorFlow tensor?
https://github.com/taoshen58/BiBloSA/blob/ec67cbdc411278dd29e8888e9fd6451695efc26c/context_fusion/self_attn.py#L29
I need to use multi_dimensional_attention from the above link, which is implemented in TensorFlow, but I am using PyTorch, so can I…

waleed hamid
- 51
- 1
- 2
- 5
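For the conversion question above, the common route is through NumPy; a minimal sketch (tensor contents are arbitrary, and gradients do not flow across the two frameworks):

import torch
import tensorflow as tf

t = torch.randn(3, 4, requires_grad=True)

# PyTorch -> TensorFlow: detach from the autograd graph, move to CPU, go via NumPy.
tf_tensor = tf.convert_to_tensor(t.detach().cpu().numpy())

# TensorFlow -> PyTorch: the reverse direction works the same way.
back = torch.from_numpy(tf_tensor.numpy())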
5
votes
1 answer
Is there a way to use the native tf Attention layer with keras Sequential API?
I'm looking to use this particular class. I have found custom implementations such as this one. What I'm truly looking for is the use of this particular class with the…

Wajd Meskini
- 94
- 1
- 6
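One reason tf.keras.layers.Attention is awkward in a Sequential model is that it expects a list of tensors ([query, value]) rather than a single input; a minimal sketch of the usual workaround with the functional API, where all layer sizes are arbitrary:

import tensorflow as tf

inputs = tf.keras.Input(shape=(None,), dtype="int32")
x = tf.keras.layers.Embedding(10000, 64)(inputs)
x = tf.keras.layers.LSTM(64, return_sequences=True)(x)

# Self-attention: the sequence attends over itself, so query and value are the same tensor.
attended = tf.keras.layers.Attention()([x, x])

x = tf.keras.layers.GlobalAveragePooling1D()(attended)
outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)

model = tf.keras.Model(inputs, outputs)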
5
votes
1 answer
Differences between different attention layers for Keras
I am trying to add an attention layer to my text classification model. The inputs are texts (e.g. movie reviews), and the output is a binary outcome (e.g. positive vs. negative).
model = Sequential()
model.add(Embedding(max_features, 32,…

Dr. Who
- 153
- 1
- 14
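As a rough illustration of the difference behind questions like the one above: tf.keras.layers.Attention uses a dot-product (Luong-style) score, while tf.keras.layers.AdditiveAttention uses an additive (Bahdanau-style) score; both take [query, value] and return a context tensor shaped like the query. A sketch with arbitrary shapes:

import tensorflow as tf

query = tf.random.normal((2, 7, 16))   # (batch, query_len, dim)
value = tf.random.normal((2, 9, 16))   # (batch, value_len, dim)

luong = tf.keras.layers.Attention()([query, value])             # dot-product scores
bahdanau = tf.keras.layers.AdditiveAttention()([query, value])  # additive scores

print(luong.shape, bahdanau.shape)   # both (2, 7, 16)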
5
votes
1 answer
Cannot parse GraphDef file in function 'ReadTFNetParamsFromTextFileOrDie' in OpenCV-DNN TensorFlow
I want to wrap the attention-OCR model with OpenCV-DNN to speed up inference. I am using the TF code from the official TF models repo.
For wrapping TF model with OpenCV-DNN, I am referring to this code. The cv2.dnn.readNetFromTensorflow()…

Chintan
- 454
- 6
- 15
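A hedged note for the OpenCV-DNN question above: cv2.dnn.readNetFromTensorflow expects a frozen binary GraphDef (.pb), and "Cannot parse GraphDef file" often means the file is a checkpoint, SavedModel, or text proto instead. A rough TF1-style sketch of freezing before loading; the output node name used here is a placeholder, not necessarily the real output node of the attention-OCR model:

import cv2
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()

with tf.Session() as sess:
    # ... build the model and restore its weights here ...
    frozen = tf.graph_util.convert_variables_to_constants(
        sess, sess.graph_def, ["AttentionOcr_v1/predicted_chars"])  # placeholder node name
    with open("frozen_graph.pb", "wb") as f:
        f.write(frozen.SerializeToString())

net = cv2.dnn.readNetFromTensorflow("frozen_graph.pb")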
5
votes
0 answers
how to access the attention weights from the attention class
class AttLayer(Layer):
    def __init__(self, **kwargs):
        self.init = initializations.get('normal')
        # self.input_spec = [InputSpec(ndim=3)]
        super(AttLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        …

prashant ranjan
- 51
- 2
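A minimal sketch of one way to expose the weights from a custom attention layer like the AttLayer above: have call() return both the context vector and the attention weights, then build a side model that outputs the weights. The internals below are a generic additive-attention stand-in, not the asker's original class:

import tensorflow as tf

class AttLayer(tf.keras.layers.Layer):
    def build(self, input_shape):
        self.W = self.add_weight(shape=(input_shape[-1], 1),
                                 initializer="glorot_normal", trainable=True)

    def call(self, x):                       # x: (batch, timesteps, features)
        scores = tf.matmul(x, self.W)        # (batch, timesteps, 1)
        weights = tf.nn.softmax(scores, axis=1)
        context = tf.reduce_sum(weights * x, axis=1)
        return context, weights              # return the weights as a second output

inputs = tf.keras.Input(shape=(20, 64))
context, att_weights = AttLayer()(inputs)
pred = tf.keras.layers.Dense(1, activation="sigmoid")(context)

model = tf.keras.Model(inputs, pred)
weight_model = tf.keras.Model(inputs, att_weights)   # inspect the weights after training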
4
votes
1 answer
tf.keras.layers.MultiHeadAttention's argument key_dim sometimes does not match the paper's example
For example, I have an input with shape (1, 1000, 10) (so src.shape will be (1, 1000, 10)), which means the sequence length is 1000 and the dimension is 10. Then:
This works (random num_head and key_dim):
class Model(tf.keras.Model):
def…

EthanJiang
- 43
- 4
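For the key_dim question above, a sketch showing that key_dim is the per-head projection size, and that the layer projects back to the query's last dimension regardless, which is why key_dim does not have to equal d_model / num_heads as in the paper. The shapes follow the question's (1, 1000, 10) example:

import tensorflow as tf

src = tf.random.normal((1, 1000, 10))     # (batch, seq_len=1000, dim=10)

# key_dim need not be 10 // num_heads; queries and keys are projected to key_dim per head,
# and the final output is projected back to the input dimension (10).
mha = tf.keras.layers.MultiHeadAttention(num_heads=4, key_dim=32)
out = mha(query=src, value=src)

print(out.shape)   # (1, 1000, 10)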
4
votes
1 answer
Multi-head attention layers - what is a wrapper multi-head layer in Keras?
I am new to attention mechanisms and I want to learn more about them by doing some practical examples. I came across a Keras implementation of multi-head attention on this website: PyPI keras-multi-head. I found two different ways to…

Amhs_11
- 233
- 3
- 10
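A rough way to see the distinction behind questions like the one above, without relying on that PyPI package's exact API: a "wrapper"-style multi-head layer simply runs several independent copies of a base layer in parallel and concatenates their outputs, whereas a true multi-head attention layer splits its query/key/value projections into heads. Both patterns below are illustrative only:

import tensorflow as tf

x = tf.random.normal((2, 6, 32))

# "Wrapper" pattern: several independent copies of a base layer, outputs concatenated.
heads = [tf.keras.layers.Dense(8)(x) for _ in range(4)]
wrapper_style = tf.keras.layers.Concatenate()(heads)          # (2, 6, 32)

# True multi-head attention: one layer that internally splits projections into heads.
mha_style = tf.keras.layers.MultiHeadAttention(num_heads=4, key_dim=8)(x, x)  # (2, 6, 32)

print(wrapper_style.shape, mha_style.shape)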
4
votes
1 answer
Why is my attention model worse than the non-attention model?
My task was to translate English sentences into German sentences. I first did this with a normal encoder-decoder network, on which I got fairly good results. Then, I tried to solve the same task with the exact same model as before, but with Bahdanau…
user14349917
4
votes
3 answers
Can't set the attribute "trainable_weights", likely because it conflicts with an existing read-only
My code was running perfectly in Colab, but today it is not running. It says:
Can't set the attribute "trainable_weights", likely because it conflicts with an existing read-only @property of the object. Please choose a different name.
I am using LSTM…

Rohan kumar Yadav
- 41
- 1
- 3
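A hedged sketch of the usual fix for the error above: in recent Keras versions, trainable_weights is a read-only property of Layer, so a custom attention layer must not assign to it directly; creating variables with add_weight (or storing them under a different attribute name) avoids the clash. The layer below is a generic illustration, not the asker's original code:

import tensorflow as tf

class AttLayer(tf.keras.layers.Layer):
    def build(self, input_shape):
        # Old pattern that now fails: self.trainable_weights = [self.W]
        # add_weight registers the variable as trainable automatically instead.
        self.W = self.add_weight(name="att_weight",
                                 shape=(input_shape[-1], 1),
                                 initializer="glorot_uniform",
                                 trainable=True)

    def call(self, x):
        weights = tf.nn.softmax(tf.matmul(x, self.W), axis=1)
        return tf.reduce_sum(weights * x, axis=1)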
4
votes
1 answer
TransformerEncoder with a padding mask
I'm trying to implement torch.nn.TransformerEncoder with a src_key_padding_mask not equal to None. Imagine the input has the shape src = [20, 95] and the binary padding mask has the shape src_mask = [20, 95], with 1 in the positions of padded tokens and…

Pourya Vakilipourtakalou
- 71
- 1
- 1
- 6
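A minimal PyTorch sketch for the padding-mask question above, assuming src = [20, 95] is (batch, seq_len) of token ids; nn.TransformerEncoder expects (seq_len, batch, d_model) inputs by default, and src_key_padding_mask should be (batch, seq_len) with True at the padded positions. All sizes besides 20 and 95 are arbitrary:

import torch
import torch.nn as nn

batch, seq_len, d_model, pad_id = 20, 95, 128, 0

src = torch.randint(0, 1000, (batch, seq_len))           # (batch, seq_len) token ids
src_key_padding_mask = src.eq(pad_id)                    # True where the token is padding

embed = nn.Embedding(1000, d_model, padding_idx=pad_id)
layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=8)
encoder = nn.TransformerEncoder(layer, num_layers=2)

x = embed(src).transpose(0, 1)                           # -> (seq_len, batch, d_model)
out = encoder(x, src_key_padding_mask=src_key_padding_mask)

print(out.shape)                                         # (95, 20, 128)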
4
votes
1 answer
Implementation details of positional encoding in transformer model?
How exactly is this positional encoding calculated?
Let's assume a machine translation scenario and these are input sentences,
english_text = [this is good, this is bad]
german_text = [das ist gut, das ist schlecht]
Now our input vocabulary…

Sai Kumar
- 665
- 2
- 9
- 21
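For the positional-encoding question above, a sketch of the sinusoidal formula from the Transformer paper, PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)); the sequence length and model dimension are arbitrary examples:

import numpy as np

def positional_encoding(max_len, d_model):
    pos = np.arange(max_len)[:, None]                 # (max_len, 1)
    i = np.arange(d_model)[None, :]                   # (1, d_model)
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angle[:, 0::2])              # even indices: sine
    pe[:, 1::2] = np.cos(angle[:, 1::2])              # odd indices: cosine
    return pe

pe = positional_encoding(max_len=6, d_model=8)        # e.g. "das ist gut" padded to length 6
print(pe.shape)                                        # (6, 8)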
4
votes
1 answer
How do attention networks work?
Recently I was going through the Attention Is All You Need paper, and while going through it I had trouble understanding the attention network if I ignore the maths behind it.
Can anyone help me understand the attention network with an example?

Kumar Mangalam
- 748
- 7
- 12
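For the "how does attention work" question above, a tiny worked example of scaled dot-product attention with NumPy: scores = Q K^T / sqrt(d_k), a softmax over the keys, then a weighted sum of the values. The numbers are arbitrary:

import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

d_k = 4
Q = np.random.randn(3, d_k)        # 3 query positions
K = np.random.randn(5, d_k)        # 5 key positions
V = np.random.randn(5, d_k)        # one value vector per key

scores = Q @ K.T / np.sqrt(d_k)    # how much each query "looks at" each key
weights = softmax(scores, axis=-1) # each row sums to 1
output = weights @ V               # weighted sum of values, shape (3, 4)

print(weights.round(2))
print(output.shape)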
4
votes
0 answers
How to use tfa.seq2seq.BahdanauAttention with tf.keras functional API?
I want to use tfa.seq2seq.BahdanauAttention with functional API of tf.keras. I have looked at the example given at tensorflow/nmt/attention_model.py. But I couldn't figure out how to use it with tf.keras's functional API.
So I would like to use…

Manideep
- 41
- 5
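A hedged sketch of the pattern often used instead of wiring tfa.seq2seq.BahdanauAttention into the functional API directly: implement the additive (Bahdanau-style) score as a small custom layer and call it inside a functional model. This illustrates the scoring formula only, not the tfa class itself, and all sizes are arbitrary:

import tensorflow as tf

class BahdanauAttention(tf.keras.layers.Layer):
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.W1 = tf.keras.layers.Dense(units)
        self.W2 = tf.keras.layers.Dense(units)
        self.V = tf.keras.layers.Dense(1)

    def call(self, query, values):
        # query: (batch, hidden), values: (batch, timesteps, hidden)
        query = tf.expand_dims(query, 1)
        score = self.V(tf.nn.tanh(self.W1(values) + self.W2(query)))  # (batch, timesteps, 1)
        weights = tf.nn.softmax(score, axis=1)
        context = tf.reduce_sum(weights * values, axis=1)             # (batch, hidden)
        return context, weights

enc_in = tf.keras.Input(shape=(None,), dtype="int32")
enc_emb = tf.keras.layers.Embedding(8000, 128)(enc_in)
enc_out, state_h, state_c = tf.keras.layers.LSTM(128, return_sequences=True,
                                                 return_state=True)(enc_emb)

context, att_weights = BahdanauAttention(64)(state_h, enc_out)
logits = tf.keras.layers.Dense(8000)(context)     # e.g. predict the first target token

model = tf.keras.Model(enc_in, logits)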