Questions tagged [encoder-decoder]
184 questions
1
vote
0 answers
Attention mechanism with different layer size?
I have two layers with different sizes (hidden states). How can I perform encoder-decoder attention over them when the sizes differ, given that I need to take a dot product?
Suppose I have two layers:
lstm1 = LSTM(20,…
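One common way to handle mismatched hidden sizes is to place a learned projection between the two states (the "general" multiplicative score from Luong-style attention), so that the dot product is taken in a shared dimension. A minimal sketch in plain Python follows. The sizes (4 for the encoder states, 3 for the decoder state), the function names, and the fixed matrix standing in for trainable weights are all assumptions for illustration and do not come from the question.

```python
import math

def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def general_attention(decoder_state, encoder_states, projection):
    """Score each encoder state against a decoder state of a different size.

    projection has shape (decoder_dim x encoder_dim): it maps every encoder
    state into the decoder's dimension so that a dot product is well defined.
    """
    projected = [matvec(projection, h) for h in encoder_states]
    scores = [sum(d * p for d, p in zip(decoder_state, proj)) for proj in projected]
    weights = softmax(scores)
    encoder_dim = len(encoder_states[0])
    # The context vector is the attention-weighted sum of the encoder states.
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(encoder_dim)
    ]
    return weights, context

# Hypothetical sizes: 2 encoder time steps of size 4, one decoder state of size 3.
encoder_states = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
decoder_state = [1.0, 0.0, 0.0]
# Stand-in for a trainable weight matrix of shape (3 x 4).
projection = [
    [2.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
weights, context = general_attention(decoder_state, encoder_states, projection)
```

In a real model the projection would be a trainable layer (for example a dense layer without bias) rather than a fixed matrix.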

Josef Souza
- 11
- 1
1
vote
1 answer
ValueError: bytes must be in range(0, 256) while decoding input tensor using transformer AutoTokenizer (MT5ForConditionalGeneration Model)
Relevant code:
from transformers import (
    AdamW,
    MT5ForConditionalGeneration,
    AutoTokenizer,
    get_linear_schedule_with_warmup
)
tokenizer = AutoTokenizer.from_pretrained('google/byt5-small',…

iamabhaykmr
- 1,803
- 3
- 24
- 49
1
vote
0 answers
PyTorch Forecasting - Temporal Fusion Transformer calculate_prediction_actual_by_variable() plots empty
Referring to the tutorial (https://pytorch-forecasting.readthedocs.io/en/stable/tutorials/stallion.html) provided by PyTorch Forecasting on their implementation of the Temporal Fusion Transformer, I'm trying to use their…

Francesco Mangia
- 11
- 2
1
vote
0 answers
Masked self-attention in transformer's decoder
I'm writing my thesis about attention mechanisms. In the paragraph where I explain the transformer's decoder, I wrote this:
The first sub-layer is called masked self-attention, in which the masking operation consists in preventing the decoder…
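As an illustration of the masking the excerpt refers to, here is a minimal sketch in plain Python (the function names are hypothetical and not taken from the question). Position i may attend only to positions j <= i, which is enforced by setting the disallowed scores to negative infinity before the softmax so that they receive zero weight.

```python
import math

def causal_mask(length):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(length)] for i in range(length)]

def masked_softmax(scores, mask):
    """Row-wise softmax in which masked-out entries receive zero weight."""
    result = []
    for score_row, mask_row in zip(scores, mask):
        # Disallowed positions are replaced by -inf, so exp() maps them to 0.
        masked = [s if keep else -math.inf for s, keep in zip(score_row, mask_row)]
        peak = max(masked)  # finite, because the diagonal is always allowed
        exps = [math.exp(s - peak) for s in masked]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result

# Uniform raw scores for a sequence of length 3, to make the mask's effect visible.
scores = [[0.0, 0.0, 0.0] for _ in range(3)]
attention = masked_softmax(scores, causal_mask(3))
```

With uniform scores, the first row puts all weight on position 0, the second row splits it evenly over positions 0 and 1, and only the last row spreads it over all three positions.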

CarlaDP
- 11
- 1
1
vote
0 answers
No gradients provided for any variable, Tensorflow error on TPU
I am trying to train my text summarization model on a Colab TPU, because training it on the Colab CPU is very slow, but I get a "No gradients provided for any variable" error. This error does not appear when training on CPU with or without using…

Peter Austin
- 23
- 5
1
vote
2 answers
a mismatch between the current graph and the graph
I am trying to train an encoder-decoder model on multispectral images with 9 channels, but the code I am running downloads pretrained ResNet-101 weights that were trained on 3-channel images.
Input given by me:
net_input =…

user3449214
- 53
- 2
- 10
1
vote
0 answers
ValueError: Shapes (None, 16) and (None, 16, 16) are incompatible (LSTMs)
I am building an English-to-Hindi translation model and I keep getting this error. I am still new to this, so I couldn't figure out my mistake. I used the encoder-decoder model and I still have to build the inference model for the decoder. I referred my…

Akshat Mittu
- 11
- 1
- 1
1
vote
0 answers
Calculate F-score for GEC
I am working on a sequence-to-sequence encoder-decoder model with a bidirectional GRU for grammatical error detection and correction in Arabic. I want to calculate the F0.5 score for my model.
This is how my data is divided:
train_data,…
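For reference, F0.5 is the F-beta score with beta = 0.5, which weights precision more heavily than recall. A minimal sketch follows, assuming edit-level counts of true positives, false positives, and false negatives are already available. How the question's model produces those counts is not shown in the excerpt, and the function name is hypothetical.

```python
def f_beta(true_pos, false_pos, false_neg, beta=0.5):
    """F-beta score computed from raw counts; beta < 1 favours precision."""
    predicted = true_pos + false_pos   # edits the system proposed
    actual = true_pos + false_neg      # edits in the reference
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)

# Example with made-up counts: precision 0.8, recall 0.5.
score = f_beta(true_pos=8, false_pos=2, false_neg=8)
```

Because beta is below 1, the resulting score sits closer to the precision than to the recall.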

Moodhi
- 45
- 3
1
vote
0 answers
TensorFlow -- Invalid argument: assertion failed: [Condition x == y did not hold element-wise:]
I am trying to create an encoder-decoder RNN that adds sequence_lengths as an input to the model, to tell the model to ignore padding (essentially masking). The problem is when I do this, I get a really weird error message that I can't make sense…

jda5
- 1,390
- 5
- 17
1
vote
0 answers
Trained E2E speech recognition model does not recognize even training data correctly
I trained an E2E speech recognition model using a Conformer encoder and a Transformer decoder with hybrid CTC/attention, but it does not recognize even the training data correctly.
I trained this model for about 20 epochs. However, it did not drop from…

Lightning
- 11
- 1
1
vote
0 answers
Dimensional error in the non-linguistic dataset as input to LSTM based Encoder-decoder model using attention
I'm trying to implement an attention-based LSTM encoder-decoder model for multi-class classification. The dataset is non-linguistic in nature.
Characteristics of my dataset:
x_train.shape = (930,5)
y_train.shape = (405,5)
x_test.shape =…

Sukhmani Kaur Thethi
- 167
- 1
- 10
1
vote
0 answers
Split Encoder and Decoder part in AUTOENCODER with skip connections - KERAS
I am trying to implement an autoencoder with skip connections and split the encoder and decoder parts, so that I can work on the latent space and the decoder part.
Below is the Keras architecture I used:
## ENCODER
inputs_encoder = Input(shape = (256,…
1
vote
0 answers
Run Hugging Face BART model decoder with a BART encoder which has had the attention heads overwritten
I am looking to build a pipeline that applies the Hugging Face BART model step by step. Once I have built the pipeline, I will be looking to substitute the encoder attention heads with a pre-trained / pre-defined encoder attention head.
The pipeline…

78282219
- 159
- 1
- 12
1
vote
0 answers
Deep Learning, NLP, Valuenet: TypeError: string indices must be integers
I'm getting this error while trying to run the Valuenet project (https://github.com/brunnurs/valuenet) on my laptop, and I don't understand what exactly it means.
I get this error when I try to train the model using: python…

Cj krn
- 11
- 1
1
vote
1 answer
How to average the two images, feed them as input to a network and output the two separate images that were used in the average input?
Given a pair of images from CIFAR10, average the two images, feed them as input to a network and output the two separate images that were used in the average input.
I am currently using a conditional GAN and autoencoders to achieve this task. But have…

Ammar N. Abbas
- 43
- 4