Questions tagged [encoder-decoder]

184 questions
1
vote
0 answers

Attention mechanism with different layer size?

I have two layers with different sizes (hidden states). How can I perform encoder-decoder style attention between these layers when the sizes differ, given that I need to take a dot product? Consider I have two layers: lstm1 = LSTM(20,…
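One common way around the size mismatch is a "general" (multiplicative) score: project the encoder states through a weight matrix so they match the decoder size, then take the dot product. The sketch below is only an illustration in plain Python, not the asker's code. The encoder size of 20 follows the excerpt; the decoder size of 32, the number of time steps, and the random matrix W (which would be learned in a real model) are all assumptions.

```python
import math
import random


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def general_attention(decoder_state, encoder_states, W):
    """Score each encoder state against the decoder state via W.

    W has shape (dec_dim, enc_dim), so W @ h_enc lives in decoder space
    and the dot product with the decoder state is well defined.
    """
    projected = [matvec(W, h) for h in encoder_states]
    scores = [sum(d * p for d, p in zip(decoder_state, proj)) for proj in projected]
    weights = softmax(scores)
    enc_dim = len(encoder_states[0])
    # Context vector: attention-weighted sum of the original encoder states.
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(enc_dim)
    ]
    return weights, context


random.seed(0)
enc_dim, dec_dim, steps = 20, 32, 5  # dec_dim and steps are hypothetical
encoder_states = [[random.gauss(0, 1) for _ in range(enc_dim)] for _ in range(steps)]
decoder_state = [random.gauss(0, 1) for _ in range(dec_dim)]
# Stand-in for a trainable projection; random here purely for illustration.
W = [[random.gauss(0, 0.1) for _ in range(enc_dim)] for _ in range(dec_dim)]

weights, context = general_attention(decoder_state, encoder_states, W)
```

Here weights holds one attention weight per encoder time step and context has the encoder's size. Additive (Bahdanau-style) attention is another option that also tolerates different sizes.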
1
vote
1 answer

ValueError: bytes must be in range(0, 256) while decoding input tensor using transformer AutoTokenizer (MT5ForConditionalGeneration Model)

Relevant Code : from transformers import ( AdamW, MT5ForConditionalGeneration, AutoTokenizer, get_linear_schedule_with_warmup ) tokenizer = AutoTokenizer.from_pretrained('google/byt5-small',…
1
vote
0 answers

PyTorch Forecasting - Temporal Fusion Transformer calculate_prediction_actual_by_variable() plots empty

Referring to the tutorial (https://pytorch-forecasting.readthedocs.io/en/stable/tutorials/stallion.html) provided by PyTorch about their implementation of the Temporal Fusion Transformer, I'm trying to use their…
1
vote
0 answers

Masked self-attention in transformer's decoder

I'm writing my thesis about attention mechanisms. In the paragraph in which I explain the decoder of the transformer, I wrote this: The first sub-layer is called masked self-attention, in which the masking operation consists in preventing the decoder…
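For reference, the masking in a standard Transformer decoder stops each position from attending to later positions. A minimal sketch in plain Python, assuming a square matrix of raw attention scores (the function names and the all-zero scores are illustrative, not taken from the question):

```python
import math


def causal_mask(n):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def masked_softmax(scores, mask):
    """Row-wise softmax where masked-out entries get zero weight."""
    result = []
    for row, mask_row in zip(scores, mask):
        # Blocked positions are set to -inf so exp() maps them to 0.
        masked = [s if keep else -math.inf for s, keep in zip(row, mask_row)]
        m = max(masked)  # finite, because the diagonal is always allowed
        exps = [math.exp(s - m) for s in masked]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


# Uniform raw scores make the effect of the mask easy to see.
scores = [[0.0] * 4 for _ in range(4)]
weights = masked_softmax(scores, causal_mask(4))
```

With uniform scores, row i spreads its weight evenly over positions 0 to i and gives exactly zero weight to everything after i.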
1
vote
0 answers

No gradients provided for any variable, Tensorflow error on TPU

I am trying to train my text summarization model on a Colab TPU because training on the Colab CPU is very slow, but I get a "No gradients provided for any variable" error. This error does not appear when training on CPU with or without using…
1
vote
2 answers

a mismatch between the current graph and the graph

I am trying to train an encoder-decoder model on multispectral images with 9 channels, but the code I am running downloads pretrained resnet101 weights that were trained on 3-channel images. Input given by me: net_input =…
user3449214
1
vote
0 answers

ValueError: Shapes (None, 16) and (None, 16, 16) are incompatible (LSTMs)

I am building an English-to-Hindi translation model and I keep getting this error. I am still new to this, so I couldn't figure out my mistake. I used the encoder-decoder model and I still have to build the inference model for the decoder. I referred my…
1
vote
0 answers

Calculate F-score for GEC

I am working on a sequence-to-sequence encoder-decoder model with a bidirectional GRU for grammatical error detection and correction in Arabic. I want to calculate the F0.5 score for my model. This is how my data is divided: train_data,…
Moodhi
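As a reminder of the formula involved, F-beta combines precision P and recall R as (1 + beta^2) * P * R / (beta^2 * P + R), and beta = 0.5 weights precision more heavily than recall. The sketch below assumes edit-level counts of true positives, false positives, and false negatives are already available. The counts are made up for illustration, and how edits are extracted and matched for GEC is outside its scope.

```python
def f_beta(tp, fp, fn, beta=0.5):
    """Compute F-beta from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    denominator = beta ** 2 * precision + recall
    if denominator == 0.0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Hypothetical counts: 6 correct edits, 2 spurious edits, 4 missed edits.
score = f_beta(tp=6, fp=2, fn=4)
```

With these counts precision is 0.75 and recall is 0.6, so the score comes out at 5/7, about 0.714.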
1
vote
0 answers

TensorFlow -- Invalid argument: assertion failed: [Condition x == y did not hold element-wise:]

I am trying to create an encoder-decoder RNN that adds sequence_lengths as an input to the model, to tell the model to ignore padding (essentially masking). The problem is when I do this, I get a really weird error message that I can't make sense…
1
vote
0 answers

Trained E2E speech recognition model does not recognize even training data correctly

I trained an E2E speech recognition model using Conformer encoder and Transformer decoder with Hybrid CTC/Attention, but it does not recognize even the training data correctly. I trained about 20 epochs of this model. However, it did not drop from…
1
vote
0 answers

Dimensional error in the non-linguistic dataset as input to LSTM based Encoder-decoder model using attention

I'm trying to implement an attention-LSTM based encoder-decoder model for multi-class classification. The dataset is non-linguistic in nature. Characteristics of my dataset: x_train.shape = (930,5) y_train.shape = (405,5) x_test.shape =…
1
vote
0 answers

Split Encoder and Decoder part in AUTOENCODER with skip connections - KERAS

I am trying to implement an autoencoder with skip connections and split the encoder and decoder parts so that I can work on the latent space and the decoder part. Below is the architecture in Keras I used: ## ENCODER inputs_encoder = Input(shape = (256,…
1
vote
0 answers

Run hugging-face BART model decoder with a BART encoder which has had the attention heads overwritten

I am looking to build a pipeline that applies the hugging-face BART model step-by-step. Once I have built the pipeline, I will be looking to substitute the encoder attention heads with a pre-trained / pre-defined encoder attention head. The pipeline…
1
vote
0 answers

Deep Learning, NLP, Valuenet: TypeError: string indices must be integers

I'm getting this error while trying to implement the Valuenet project (https://github.com/brunnurs/valuenet) on my laptop, and I don't understand what exactly it means. I get this error when I try to train the model using: python…
1
vote
1 answer

How to average the two images, feed them as input to a network and output the two separate images that were used in the average input?

Given a pair of images from CIFAR10, average the two images, feed them as input to a network and output the two separate images that were used in the average input. I am currently using conditional GAN and autoencoders to achieve this task. But have…