Questions tagged [self-attention]

57 questions
0
votes
0 answers

Tensorflow Attention ValueError: Dimension must be 5 but is 4

I am trying to follow the below code for a self-attention model. The self-attention networks have 16 heads, and the output of each head is 16-dimensional. The dimension of the additive attention query vectors is 200. def __init__(self, nb_head,…
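A minimal sketch of the setup described above (not the asker's custom layer): 16 heads of 16 dimensions each via the built-in MultiHeadAttention, followed by additive attention pooling with a 200-dimensional query space. The sequence length (30) and embedding size (300) are placeholder assumptions.

    import tensorflow as tf

    tokens = tf.keras.Input(shape=(30, 300))                  # 30 tokens, 300-dim embeddings (placeholders)
    ctx = tf.keras.layers.MultiHeadAttention(
        num_heads=16, key_dim=16, output_shape=256)(tokens, tokens)   # 16 heads x 16 dims = 256
    scores = tf.keras.layers.Dense(1)(
        tf.keras.layers.Dense(200, activation="tanh")(ctx))           # 200-dim additive attention query space
    weights = tf.keras.layers.Softmax(axis=1)(scores)                 # attention over the 30 tokens
    pooled = tf.keras.layers.Flatten()(
        tf.keras.layers.Dot(axes=(1, 1))([weights, ctx]))             # weighted sum -> (batch, 256)
    model = tf.keras.Model(tokens, pooled)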
0
votes
0 answers

Using Self-Attention Layer in Keras without Encoder-Decoder

To my knowledge, attention has been used with encoder-decoder models. I am trying to use it as a layer in a feedforward neural network. I have the following architecture: Input layer -> Dense Layer -> Self-Attention Layer -> Dense Layer -> SoftMax…
Avv • 429 • 4 • 17
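A hedged sketch of the architecture in the question above, assuming the input is a sequence of feature vectors (self-attention needs a sequence axis to attend over); the layer sizes and the pooling step are assumptions.

    import tensorflow as tf

    inp = tf.keras.Input(shape=(20, 64))                      # 20 timesteps, 64 features (placeholders)
    x = tf.keras.layers.Dense(128, activation="relu")(inp)
    x = tf.keras.layers.MultiHeadAttention(num_heads=4, key_dim=32)(x, x)   # self-attention: q = k = v = x
    x = tf.keras.layers.GlobalAveragePooling1D()(x)           # collapse the sequence axis
    x = tf.keras.layers.Dense(64, activation="relu")(x)
    out = tf.keras.layers.Dense(10, activation="softmax")(x)  # 10 classes as a placeholder
    model = tf.keras.Model(inp, out)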
0
votes
0 answers

Creating attention mask for transformer block with (batch size, sequence length, spatial samples, embed dim) as input

I am trying to use a transformer to analyze some spatio-temporal data. I have an array of training data with dimensions "batch size x sequence length x spatial samples x embedding dimension." In order to prevent the transformer from cheating while…
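One hedged way to build such a mask, assuming attention runs along the time axis and the spatial axis is folded into the batch dimension first; all sizes are placeholders.

    import tensorflow as tf

    B, T, S, D = 8, 16, 10, 32                      # placeholder sizes
    x = tf.random.normal((B, T, S, D))

    # Fold the spatial axis into the batch axis so each spatial sample is its own sequence.
    x_seq = tf.reshape(tf.transpose(x, [0, 2, 1, 3]), (B * S, T, D))

    # Lower-triangular causal mask: timestep t may only attend to timesteps <= t.
    causal = tf.cast(tf.linalg.band_part(tf.ones((T, T)), -1, 0), tf.bool)

    mha = tf.keras.layers.MultiHeadAttention(num_heads=4, key_dim=D)
    out = mha(x_seq, x_seq, attention_mask=causal)  # the 2D mask broadcasts over batch and heads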
0
votes
0 answers

Positional Embedding in Transformers - Time Series Data

I'm adding Multi-Headed attention at the input of my CNN to improve the interpretability and explainability of my model. The data is a 3D time-series input of shape (125, 5, 6), where the 5x6 part represents the data in a single sample and 125…
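A hedged sketch of a learned positional embedding for this shape, assuming each of the 125 timesteps is flattened to a 30-dimensional (5 x 6) token and projected to a model dimension of 64; both choices are assumptions, not the asker's setup.

    import tensorflow as tf

    class PositionalEmbedding(tf.keras.layers.Layer):
        """Adds a learned position embedding to a (batch, length, dim) sequence."""
        def __init__(self, length, dim):
            super().__init__()
            self.pos_emb = tf.keras.layers.Embedding(length, dim)
            self.length = length

        def call(self, x):
            positions = tf.range(self.length)       # 0 .. length-1
            return x + self.pos_emb(positions)      # broadcasts over the batch axis

    inp = tf.keras.Input(shape=(125, 5, 6))
    tokens = tf.keras.layers.Dense(64)(tf.keras.layers.Reshape((125, 30))(inp))
    tokens = PositionalEmbedding(125, 64)(tokens)
    attn = tf.keras.layers.MultiHeadAttention(num_heads=4, key_dim=16)(tokens, tokens)
    model = tf.keras.Model(inp, attn)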
0
votes
1 answer

TypeError: call() got an unexpected keyword argument 'use_causal_mask' when running on the flickr8k/flickr30k dataset

Error TypeError Traceback (most recent call last) /tmp/ipykernel_23/1382744270.py in 2 image_path = tf.keras.utils.get_file('surf.jpg', origin=image_url) 3 image = load_image(image_path) ----> 4…
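For context, use_causal_mask only exists in newer Keras releases (roughly TF 2.10 and later). A hedged workaround sketch for older versions is to pass an explicit causal attention_mask instead; the shapes below are placeholders.

    import tensorflow as tf

    def causal_mask(seq_len):
        # True on and below the diagonal: token t may attend to tokens <= t.
        return tf.cast(tf.linalg.band_part(tf.ones((seq_len, seq_len)), -1, 0), tf.bool)

    x = tf.random.normal((2, 12, 64))                # (batch, seq_len, features) placeholders
    mha = tf.keras.layers.MultiHeadAttention(num_heads=4, key_dim=16)
    out = mha(x, x, attention_mask=causal_mask(12))  # same effect as use_causal_mask=True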
0
votes
0 answers

What makes the multi-head self-attention matrices different?

Transformers (BERT) use one set of three matrices, Q, K, V for each attention head. BERT uses 12 attention heads in each layer, with each attention head having its own set of three such matrices. The actual values of these 36 matrices are obtained…
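A conceptual sketch of why the heads differ: each head has its own randomly initialised W_Q, W_K, W_V, and the different initialisations plus independent gradients during training push them apart. Sizes follow BERT-base; the loop-over-heads form is for clarity, not efficiency.

    import torch

    d_model, n_heads = 768, 12
    d_head = d_model // n_heads                     # 64 in BERT-base

    # 12 heads x 3 projections = 36 weight matrices, each initialised differently.
    heads = [
        {name: torch.nn.Linear(d_model, d_head, bias=False) for name in ("q", "k", "v")}
        for _ in range(n_heads)
    ]

    x = torch.randn(1, 10, d_model)                 # (batch, tokens, hidden)
    outputs = []
    for h in heads:
        q, k, v = h["q"](x), h["k"](x), h["v"](x)
        attn = torch.softmax(q @ k.transpose(-2, -1) / d_head ** 0.5, dim=-1)
        outputs.append(attn @ v)
    context = torch.cat(outputs, dim=-1)            # concatenated heads: back to (1, 10, 768)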
0
votes
0 answers

How to reveal relations between the number of words and the target with self-attention based models?

Transformers can handle variable-length input, but what if the number of words correlates with the target? Say we want to perform sentiment analysis on some reviews where longer reviews are more likely to be bad. How can the…
tusker • 57 • 5
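One hedged option for the question above: feed the (normalised) review length as an extra feature alongside the pooled transformer representation, so the classifier can exploit the correlation. All sizes and names here are placeholders, and this is only one possible design, not a recommendation.

    import tensorflow as tf

    text_repr = tf.keras.Input(shape=(256,))        # pooled output of some transformer (placeholder)
    length = tf.keras.Input(shape=(1,))             # e.g. token count divided by the maximum length
    x = tf.keras.layers.Concatenate()([text_repr, length])
    x = tf.keras.layers.Dense(64, activation="relu")(x)
    out = tf.keras.layers.Dense(1, activation="sigmoid")(x)
    model = tf.keras.Model([text_repr, length], out)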
0
votes
0 answers

Implementing Shaw's Relative Attention using Tensorflow

Is there a straightforward way to implement relative positional encoding as described in the Shaw paper using Tensorflow instead of absolute positional encoding? Thanks!
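A minimal single-head TensorFlow sketch of Shaw-style relative attention, under these assumptions: one head only, relative positions clipped to +/- max_rel, and only the a^K (key) term from the paper, with no a^V term.

    import tensorflow as tf

    class RelativeSelfAttention(tf.keras.layers.Layer):
        def __init__(self, d_model, max_rel=8):
            super().__init__()
            self.wq = tf.keras.layers.Dense(d_model)
            self.wk = tf.keras.layers.Dense(d_model)
            self.wv = tf.keras.layers.Dense(d_model)
            # One embedding per clipped relative distance in [-max_rel, max_rel].
            self.rel_emb = tf.keras.layers.Embedding(2 * max_rel + 1, d_model)
            self.max_rel = max_rel
            self.scale = float(d_model) ** 0.5

        def call(self, x):                          # x: (batch, T, d_model)
            T = tf.shape(x)[1]
            q, k, v = self.wq(x), self.wk(x), self.wv(x)
            rel = tf.range(T)[None, :] - tf.range(T)[:, None]
            rel = tf.clip_by_value(rel, -self.max_rel, self.max_rel) + self.max_rel
            a_k = self.rel_emb(rel)                           # (T, T, d_model)
            logits = tf.matmul(q, k, transpose_b=True)        # content term
            logits += tf.einsum("btd,tsd->bts", q, a_k)       # relative position term
            weights = tf.nn.softmax(logits / self.scale, axis=-1)
            return tf.matmul(weights, v)

    out = RelativeSelfAttention(d_model=64)(tf.random.normal((2, 20, 64)))   # (2, 20, 64)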
0
votes
0 answers

Why can the key dimension be different than the input sequence length for self attention on a time series?

In "Timeseries classification with a Transformer model", the author builds a transformer encoder like this: def transformer_encoder(inputs, head_size, num_heads, ff_dim, dropout=0): x = layers.LayerNormalization(epsilon=1e-6)(inputs) x =…
aez • 2,406 • 2 • 26 • 46
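A quick shape check illustrating the point behind the question: key_dim is the size tokens are projected to inside each head, not the sequence length, and the attention score matrix is always seq_len x seq_len regardless. Sizes are placeholders.

    import tensorflow as tf

    x = tf.random.normal((1, 100, 7))     # 100 timesteps, 7 features (placeholders)
    mha = tf.keras.layers.MultiHeadAttention(num_heads=4, key_dim=256)
    out, scores = mha(x, x, return_attention_scores=True)
    print(out.shape)      # (1, 100, 7): projected back to the input feature size
    print(scores.shape)   # (1, 4, 100, 100): heads x seq_len x seq_len, independent of key_dim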
0
votes
0 answers

Simple self-Attention API to learn from vector sequence

I wanted to implement simple softmax-based self-attention for a sequence of vectors. Using PyTorch's multi-head self-attention API seems overwhelming for my task with a large number of parameters to train. Is there any API/ simple codebase to do…
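A minimal single-head PyTorch sketch with far fewer parameters than nn.MultiheadAttention: one linear map each for query, key and value, followed by scaled dot-product softmax attention.

    import torch
    import torch.nn as nn

    class SimpleSelfAttention(nn.Module):
        def __init__(self, dim):
            super().__init__()
            self.q = nn.Linear(dim, dim, bias=False)
            self.k = nn.Linear(dim, dim, bias=False)
            self.v = nn.Linear(dim, dim, bias=False)
            self.scale = dim ** 0.5

        def forward(self, x):                       # x: (batch, seq_len, dim)
            q, k, v = self.q(x), self.k(x), self.v(x)
            weights = torch.softmax(q @ k.transpose(-2, -1) / self.scale, dim=-1)
            return weights @ v                      # (batch, seq_len, dim)

    out = SimpleSelfAttention(dim=32)(torch.randn(4, 10, 32))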
0
votes
0 answers

How to check a transformer model's attention?

I am doing a project on text summarization, and I am also visualizing the attention mask during training. The model is working well, but I want to check and show which words the model attends to when predicting the short summary from the long text.
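A hedged sketch assuming a Hugging Face seq2seq model (t5-small is used here only as an example checkpoint): passing output_attentions=True returns the per-layer, per-head attention weights, and the cross-attentions show which source words each summary token attended to.

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tok = AutoTokenizer.from_pretrained("t5-small")
    model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

    inputs = tok("summarize: The quick brown fox jumps over the lazy dog.", return_tensors="pt")
    labels = tok("A fox jumps over a dog.", return_tensors="pt").input_ids

    out = model(**inputs, labels=labels, output_attentions=True)
    # One tensor per decoder layer, shape (batch, heads, target_len, source_len):
    # which input tokens each summary token attended to.
    print(out.cross_attentions[0].shape)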
0
votes
1 answer

How to generate vision transformer attention maps for 3D grayscale MRI data

How can I generate attention maps for 3D grayscale MRI data after training with vision transformer for a classification problem? My data shape is (120,120,120) and the model is 3D ViT. For example: img = nib.load() img = torch.from_numpy(img) model…
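A hedged sketch of turning ViT attention weights into a volume, assuming a patch size of 8 (so a 15 x 15 x 15 patch grid for a 120^3 scan) and a CLS token in the first position; the random tensor stands in for the real last-layer attention weights of the asker's 3D ViT.

    import torch
    import torch.nn.functional as F

    patch_grid = (15, 15, 15)                              # 120 / 8 per axis (assumed patch size 8)
    n_patches = 15 * 15 * 15
    attn = torch.rand(1, 8, 1 + n_patches, 1 + n_patches)  # stand-in for the real attention weights
    attn = attn / attn.sum(-1, keepdim=True)

    cls_attn = attn[0].mean(0)[0, 1:]                      # average heads, CLS token -> patches
    volume = cls_attn.reshape(1, 1, *patch_grid)           # back onto the 3D patch grid
    heatmap = F.interpolate(volume, size=(120, 120, 120),
                            mode="trilinear", align_corners=False)   # overlay on the original MRI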
0
votes
0 answers

sparse attention and its relation with attention mask

Can anyone please explain clearly what the mask is used for in sparse attention? I just cannot see how masking tokens (I do not mean pad tokens here) can make attention faster, for example as described in sparse attention…
Arij Aladel • 356 • 1 • 3 • 10
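A small sketch of a sliding-window (local) sparsity pattern, one of the masks used in sparse attention. The dense boolean mask itself does not speed anything up; the speed-up comes from implementations that only compute the score entries inside the band instead of the full T x T matrix. The window radius w is an assumption.

    import torch

    T, w = 12, 2
    idx = torch.arange(T)
    local_mask = (idx[None, :] - idx[:, None]).abs() <= w   # (T, T) boolean band
    print(local_mask.int())
    # Dense attention does O(T^2) work; with the band pattern only
    # O(T * (2w + 1)) score entries actually need to be computed.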
0
votes
0 answers

How to get inter-stock relationships using deep learning?

I'm trying to get the relationship between stock companies based on their historical closing prices. Cross-correlation or other similarity matrices can perform this task, but I want to use deep learning methods (RNN/attention) to extract the relationship…
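A hedged sketch of one possible design: encode each stock's price history with a shared GRU, let the stock embeddings attend to each other, and read the attention weights as a learned stock-to-stock relationship matrix. All sizes, and the GRU/attention choice itself, are assumptions.

    import torch
    import torch.nn as nn

    n_stocks, n_days = 20, 250
    prices = torch.randn(n_stocks, n_days, 1)               # one closing-price series per stock

    gru = nn.GRU(input_size=1, hidden_size=32, batch_first=True)
    _, h = gru(prices)                                       # shared encoder over all stocks
    stock_emb = h[-1].unsqueeze(0)                           # (1, n_stocks, 32)

    attn = nn.MultiheadAttention(embed_dim=32, num_heads=1, batch_first=True)
    _, relation = attn(stock_emb, stock_emb, stock_emb)      # relation: (1, n_stocks, n_stocks)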
0
votes
0 answers

Confusion regarding num_heads & key_dim in keras.layers.MultiHeadAttention in the transformer tutorial

In the tf.keras tutorial: https://colab.research.google.com/github/tensorflow/text/blob/master/docs/tutorials/transformer.ipynb, class EncoderLayer(tf.keras.layers.Layer): def __init__(self,*, d_model, # Input/output…
kawingkelvin • 3,649 • 2 • 30 • 50
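A shape sketch of the point at issue: in keras.layers.MultiHeadAttention, each head projects tokens down to key_dim dimensions, and a final dense layer maps num_heads * key_dim back to the query's feature size, so d_model does not have to equal num_heads * key_dim.

    import tensorflow as tf

    d_model, num_heads, key_dim = 512, 8, 64
    mha = tf.keras.layers.MultiHeadAttention(num_heads=num_heads, key_dim=key_dim)

    x = tf.random.normal((1, 10, d_model))
    print(mha(x, x).shape)          # (1, 10, 512): projected back to d_model
    for w in mha.weights:
        print(w.name, w.shape)      # query/key/value kernels: (512, 8, 64); output kernel: (8, 64, 512)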