Seq2Seq is a sequence-to-sequence learning add-on for the Python deep learning library.
Questions tagged [seq2seq]
318 questions
2
votes
0 answers
What is the correct way to run inference with a transformer model?
I'm a beginner learning to build a standard transformer model in PyTorch to solve a univariate sequence-to-sequence regression problem. The code is written following the PyTorch tutorial, but it turns out the training/validation error…

Haoran Li
- 21
- 3
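For this first question, the usual stumbling block is that training can use teacher forcing while inference has to be autoregressive: the model is fed its own previous outputs one step at a time. A minimal sketch of such a loop for torch.nn.Transformer, assuming the source has already been projected to d_model and leaving out the final output projection; all names here are illustrative, not the asker's code.
import torch
import torch.nn as nn

@torch.no_grad()
def greedy_decode(model: nn.Transformer, src: torch.Tensor,
                  start_value: float, steps: int) -> torch.Tensor:
    """Autoregressive inference: feed each new prediction back as decoder input.

    src: (src_len, batch, d_model) source sequence, already embedded/projected.
    Returns (steps, batch, d_model) of generated decoder outputs.
    """
    model.eval()
    memory = model.encoder(src)                          # encode the source once
    ys = torch.full((1, src.size(1), src.size(2)), start_value)
    for _ in range(steps):
        # causal mask so position i cannot attend to later positions
        L = ys.size(0)
        tgt_mask = torch.triu(torch.full((L, L), float("-inf")), diagonal=1)
        out = model.decoder(ys, memory, tgt_mask=tgt_mask)
        ys = torch.cat([ys, out[-1:]], dim=0)            # append only the newest step
    return ys[1:]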
2
votes
1 answer
Output directory is empty in Trainer
With my script the model trains correctly and the results are printed, but the results directory is empty. Why is that? What is missing? I think I should have the files described in this answer.
training_args = Seq2SeqTrainingArguments(
…

zest16
- 455
- 3
- 7
- 20
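On the empty-output-directory question: Seq2SeqTrainer only writes files to output_dir when a save is actually triggered, either by the save strategy during training or by an explicit save_model call. A hedged sketch with Hugging Face transformers; the model and datasets are assumed to exist already and are placeholders here.
from transformers import Seq2SeqTrainingArguments, Seq2SeqTrainer

training_args = Seq2SeqTrainingArguments(
    output_dir="./results",        # checkpoints and the final model go here
    num_train_epochs=3,
    save_strategy="epoch",         # write a checkpoint at the end of every epoch
)

trainer = Seq2SeqTrainer(
    model=model,                   # assumed: an already-loaded seq2seq model
    args=training_args,
    train_dataset=train_dataset,   # assumed: tokenized datasets
    eval_dataset=eval_dataset,
)

trainer.train()
trainer.save_model("./results/final")  # explicitly write the model weights and config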
2
votes
1 answer
Simple Transformers producing nothing?
I have a Simple Transformers script that looks like this.
from simpletransformers.seq2seq import Seq2SeqModel, Seq2SeqArgs
args = Seq2SeqArgs()
args.num_train_epoch=5
model = Seq2SeqModel(
"roberta",
"roberta-base",
…

DevDog
- 111
- 2
- 9
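A hedged guess for the snippet above: Seq2SeqArgs expects num_train_epochs (plural); assigning num_train_epoch just creates an unused attribute, so training silently runs with the default number of epochs. A sketch of the corrected setup; the decoder argument is cut off in the excerpt, so roberta-base is only a placeholder here.
from simpletransformers.seq2seq import Seq2SeqModel, Seq2SeqArgs

args = Seq2SeqArgs()
args.num_train_epochs = 5          # plural; num_train_epoch is silently ignored
args.overwrite_output_dir = True

model = Seq2SeqModel(
    "roberta",                     # encoder type
    "roberta-base",                # encoder name
    "roberta-base",                # decoder name (placeholder; elided in the excerpt)
    args=args,
)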
2
votes
0 answers
ValueError: Input 0 of layer lstm_12 is incompatible with the layer: expected ndim=3, found ndim=4
I am working on a seq2seq model, and I want to use the embedding layer described in the Keras blog's bonus FAQ. Here is my code, where num_encoder_tokens is 67 and num_decoder_tokens is 11.
I am getting the error shown in the figure.
Can anyone help me with…

Masoom Raj
- 71
- 5
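On the ndim=3 vs ndim=4 error: with an Embedding layer the model must receive integer token ids of shape (batch, timesteps); feeding one-hot 3D arrays makes the embedded input 4D and the LSTM rejects it. A hedged sketch of the blog-FAQ pattern with the question's vocabulary sizes; latent_dim is an assumed hyperparameter.
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense
from tensorflow.keras.models import Model

num_encoder_tokens = 67
num_decoder_tokens = 11
latent_dim = 256  # assumed

# Both inputs are integer token ids, shape (batch, timesteps) -- not one-hot
encoder_inputs = Input(shape=(None,), dtype="int32")
enc_emb = Embedding(num_encoder_tokens, latent_dim)(encoder_inputs)
_, state_h, state_c = LSTM(latent_dim, return_state=True)(enc_emb)

decoder_inputs = Input(shape=(None,), dtype="int32")
dec_emb = Embedding(num_decoder_tokens, latent_dim)(decoder_inputs)
decoder_seq, _, _ = LSTM(latent_dim, return_sequences=True,
                         return_state=True)(dec_emb, initial_state=[state_h, state_c])
decoder_outputs = Dense(num_decoder_tokens, activation="softmax")(decoder_seq)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy")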
2
votes
0 answers
Adding 'decoder_start_token_id' with SimpleTransformers
I am training MBART in Seq2Seq with SimpleTransformers but getting an error I am not seeing with BART:
TypeError: shift_tokens_right() missing 1 required positional argument: 'decoder_start_token_id'
So far I've tried various combinations…

LeOverflow
- 301
- 1
- 2
- 16
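On the decoder_start_token_id error: this often turns out to be a version mismatch between simpletransformers and transformers around MBART's shift_tokens_right, and MBART (unlike BART) also needs decoder_start_token_id set to the target-language code. I have not verified how Simple Transformers exposes this; with the underlying Hugging Face objects the idea looks roughly like the sketch below, where model.model and the language code are assumptions.
from transformers import MBartTokenizer

tokenizer = MBartTokenizer.from_pretrained("facebook/mbart-large-cc25")

# MBART starts decoding from the target-language token, e.g. Romanian here
start_id = tokenizer.lang_code_to_id["ro_RO"]

# Assumption: Seq2SeqModel keeps the underlying Hugging Face model on `model.model`
model.model.config.decoder_start_token_id = start_id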
2
votes
1 answer
Adding an attention layer to a Keras seq2seq model
I have seen that Keras now comes with an Attention layer. However, I am having some trouble using it in my seq2seq model.
This is the working seq2seq model without attention:
latent_dim = 300
embedding_dim = 200
clear_session()
# Encoder
encoder_inputs =…

BlueMango
- 463
- 7
- 21
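One way to wire in the built-in layer, as a hedged sketch: apply tf.keras.layers.Attention to the decoder outputs (query) and encoder outputs (value), then concatenate the context with the decoder outputs before the final Dense. The latent_dim/embedding_dim values follow the excerpt above; the vocabulary sizes are assumptions.
from tensorflow.keras.layers import (Input, Embedding, LSTM, Dense,
                                     Attention, Concatenate)
from tensorflow.keras.models import Model

latent_dim = 300
embedding_dim = 200
num_encoder_tokens = 5000   # assumed vocabulary sizes
num_decoder_tokens = 5000

# Encoder (return_sequences=True so every timestep is available to attention)
encoder_inputs = Input(shape=(None,))
enc_emb = Embedding(num_encoder_tokens, embedding_dim)(encoder_inputs)
encoder_outputs, state_h, state_c = LSTM(latent_dim, return_sequences=True,
                                         return_state=True)(enc_emb)

# Decoder
decoder_inputs = Input(shape=(None,))
dec_emb = Embedding(num_decoder_tokens, embedding_dim)(decoder_inputs)
decoder_outputs, _, _ = LSTM(latent_dim, return_sequences=True,
                             return_state=True)(dec_emb,
                                                initial_state=[state_h, state_c])

# Attention: query = decoder states, value = encoder states
context = Attention()([decoder_outputs, encoder_outputs])
decoder_concat = Concatenate(axis=-1)([decoder_outputs, context])

outputs = Dense(num_decoder_tokens, activation="softmax")(decoder_concat)
model = Model([encoder_inputs, decoder_inputs], outputs)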
2
votes
1 answer
Are Seq2Seq models used for time series only?
Can we use a seq2seq model with input data that has no temporal relation (not a time series)? For example, I have a list of image regions that I would like to feed to my seq2seq model, and the model should predict a description (output is time…

mousa alsulaimi
- 316
- 1
- 14
2
votes
0 answers
Predict a sequence of tuples using a Transformer model
I am fairly new to seq2seq models and transformers.
Basically, I am working on a sequence generation problem and I want to use the transformer. I am using Python and PyTorch.
I know how the transformer model works for sequence generation, e.g. given…

afsana mimi
- 53
- 1
- 5
2
votes
0 answers
Is there a way to build a closed-domain chatbot using seq2seq, generative modeling, or other methods like RNNs?
Let's say I have a closed-domain chatbot, and its knowledge base is in finance. Thus, of course, I want the chatbot to answer questions that the user might have, like "What is the best way to save money?" or "What are my spending habits like the…

jeff-ridgeway
- 171
- 16
2
votes
1 answer
Average of BLEU scores on two subsets of data is not the same as overall score
For evaluating a sequence generation model, I'm using BLEU-1 through BLEU-4. I split the test set into two sets and calculated the scores on each set separately, as well as on the whole test set. Surprisingly, the result I get from the whole test set is…

forough
- 47
- 1
- 1
- 5
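This behaviour is expected: corpus-level BLEU is not a linear statistic. The modified n-gram precisions and the brevity penalty are computed over the pooled corpus and then combined in a geometric mean, so averaging two subset scores generally differs from the whole-set score. A small sketch with NLTK that makes the discrepancy visible; the sentences are toy data, purely illustrative.
from nltk.translate.bleu_score import corpus_bleu

# Toy data: each hypothesis has a list of reference token lists
refs_a = [[["the", "cat", "sat", "on", "the", "mat"]]]
hyps_a = [["the", "cat", "sat", "on", "mat"]]

refs_b = [[["there", "is", "a", "dog", "in", "the", "garden"]]]
hyps_b = [["there", "is", "a", "dog", "in", "the", "house"]]

weights = (0.25, 0.25, 0.25, 0.25)      # standard BLEU-4
bleu_a = corpus_bleu(refs_a, hyps_a, weights=weights)
bleu_b = corpus_bleu(refs_b, hyps_b, weights=weights)
bleu_all = corpus_bleu(refs_a + refs_b, hyps_a + hyps_b, weights=weights)

# The pooled score is generally NOT the mean of the subset scores
print(bleu_a, bleu_b, (bleu_a + bleu_b) / 2, bleu_all)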
2
votes
2 answers
What is the equivalent of tf.contrib.seq2seq.prepare_attention in TensorFlow 2?
I've recently been working on some code written in TensorFlow 1.0.1 and I want to make it run on TensorFlow 2.
I am not very familiar with seq2seq.
Thank you very much.
(attention_keys,
attention_values,
attention_score_fn,
attention_construct_fn) =…

victor zhao
- 73
- 7
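tf.contrib was removed in TensorFlow 2, and the seq2seq attention helpers moved to TensorFlow Addons. As a hedged sketch, the rough moral equivalent of prepare_attention's (keys, values, score_fn, construct_fn) bundle is to create an attention mechanism over the encoder outputs and wrap the decoder cell with it; this does not reproduce the old contrib behaviour exactly, and the shapes below are illustrative.
import tensorflow as tf
import tensorflow_addons as tfa

units = 128
batch_size = 4
max_src_len = 10

# encoder_outputs stands in for the TF1 attention "keys/values": (batch, src_len, units)
encoder_outputs = tf.random.normal((batch_size, max_src_len, units))

# The score function (Bahdanau or Luong) replaces attention_score_fn
attention_mechanism = tfa.seq2seq.BahdanauAttention(
    units=units,
    memory=encoder_outputs,
    memory_sequence_length=tf.fill([batch_size], max_src_len),
)

# AttentionWrapper replaces attention_construct_fn: it mixes attention into the cell
decoder_cell = tfa.seq2seq.AttentionWrapper(
    tf.keras.layers.LSTMCell(units),
    attention_mechanism,
    attention_layer_size=units,
)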
2
votes
1 answer
Difference between testing and inference in seq2seq models
I am studying the transformer model with this tutorial and I am confused about the difference between evaluation and inference. In my understanding, evaluation happens after the model is trained, by only giving it the source and asking it to predict the target one…

Natalia
- 153
- 1
- 8
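The distinction usually comes down to what the decoder is allowed to see: during teacher-forced evaluation the gold target (shifted by one) is fed as the decoder input and only loss/metrics are computed, while at inference the model must feed its own predictions back one token at a time. A schematic sketch of the two loops, assuming a hypothetical model callable as model(src, tgt) that returns per-position vocabulary logits with time as the first dimension and batch size 1.
import torch

@torch.no_grad()
def evaluate(model, src, tgt, criterion):
    """Teacher-forced evaluation: the decoder sees the gold prefix."""
    logits = model(src, tgt[:-1])
    return criterion(logits.reshape(-1, logits.size(-1)), tgt[1:].reshape(-1))

@torch.no_grad()
def infer(model, src, bos_id, eos_id, max_len=50):
    """Inference: the decoder only ever sees its own previous predictions."""
    ys = torch.tensor([[bos_id]])                     # (tgt_len=1, batch=1)
    for _ in range(max_len):
        logits = model(src, ys)
        next_id = logits[-1].argmax(dim=-1, keepdim=True)
        ys = torch.cat([ys, next_id], dim=0)
        if next_id.item() == eos_id:
            break
    return ys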
2
votes
1 answer
How to use the TimeDistributed layer for predicting sequences of dynamic length? Python 3
So I am trying to build an LSTM-based autoencoder, which I want to use for time series data. These are split up into sequences of different lengths. The input to the model thus has shape [None, None, n_features], where the first None stands for…

pikachu
- 690
- 1
- 6
- 17
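For variable-length sequences, RepeatVector (which needs a fixed length) is usually the obstacle; one hedged workaround is to keep return_sequences=True throughout and let TimeDistributed(Dense(...)) map every timestep, since a None timestep dimension is allowed. This sidesteps the fixed-length bottleneck rather than reproducing it; n_features and the latent size below are assumed values.
from tensorflow.keras.layers import Input, LSTM, TimeDistributed, Dense
from tensorflow.keras.models import Model

n_features = 8      # assumed
latent_dim = 64     # assumed

inputs = Input(shape=(None, n_features))               # None = dynamic sequence length
encoded = LSTM(latent_dim, return_sequences=True)(inputs)
decoded = LSTM(latent_dim, return_sequences=True)(encoded)
outputs = TimeDistributed(Dense(n_features))(decoded)  # one prediction per timestep

autoencoder = Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")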
2
votes
1 answer
BERT embeddings for abstractive text summarisation in Keras using encoder-decoder model
I am working on a text summarization task using an encoder-decoder architecture in Keras. I would like to test the model's performance using different word embeddings such as GloVe and BERT. I already tested it with GloVe embeddings but could not…

skaistt
- 95
- 5
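One hedged way to try this is to use a frozen pretrained BERT encoder from Hugging Face's TF classes to produce contextual embeddings and feed its last hidden state into a plain Keras decoder. The model name, decoder wiring, and vocabulary/width numbers below are illustrative, not a recommended architecture.
import tensorflow as tf
from transformers import TFAutoModel

bert = TFAutoModel.from_pretrained("bert-base-uncased")
bert.trainable = False                          # use BERT purely as an embedding provider

input_ids = tf.keras.Input(shape=(None,), dtype=tf.int32, name="input_ids")
attention_mask = tf.keras.Input(shape=(None,), dtype=tf.int32, name="attention_mask")

# (batch, src_len, 768) contextual embeddings replacing the GloVe lookup
encoder_outputs = bert(input_ids, attention_mask=attention_mask).last_hidden_state
enc_proj = tf.keras.layers.Dense(256)(encoder_outputs)   # match the decoder width

# Decoder: plain LSTM over summary token ids (vocabulary size assumed)
decoder_inputs = tf.keras.Input(shape=(None,), dtype=tf.int32)
dec_emb = tf.keras.layers.Embedding(30000, 256)(decoder_inputs)
decoder_seq = tf.keras.layers.LSTM(256, return_sequences=True)(dec_emb)

context = tf.keras.layers.Attention()([decoder_seq, enc_proj])
merged = tf.keras.layers.Concatenate()([decoder_seq, context])
outputs = tf.keras.layers.Dense(30000, activation="softmax")(merged)

model = tf.keras.Model([input_ids, attention_mask, decoder_inputs], outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")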
2
votes
1 answer
Input 0 of layer lstm_35 is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 1966, 7059, 256]
I am creating a seq2seq model with word-level embeddings for text summarisation and I am facing a data shape issue. Please help, thanks.
encoder_input=Input(shape=(max_encoder_seq_length,))
…

Hammad Asif
- 33
- 7
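A hedged reading of the reported shape: 1966 looks like the sequence length, 7059 like the vocabulary size, and 256 like the embedding width, which suggests one-hot encoded 3D arrays are being fed into a model whose Embedding layer adds a fourth dimension. The fix is usually on the data side: with an Embedding layer the encoder expects integer token ids of shape (batch, max_encoder_seq_length), for example:
import numpy as np

max_encoder_seq_length = 1966
num_encoder_tokens = 7059   # vocabulary size

# WRONG for an Embedding-based encoder: one-hot data of shape (batch, 1966, 7059)
one_hot = np.zeros((2, max_encoder_seq_length, num_encoder_tokens), dtype="float32")

# RIGHT: integer token ids of shape (batch, 1966); the Embedding layer adds the 256 dim
encoder_input_data = np.zeros((2, max_encoder_seq_length), dtype="int32")
encoder_input_data[0, :3] = [5, 42, 7]   # illustrative token ids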