Seq2Seq is a sequence-to-sequence learning add-on for the Python deep learning library.
Questions tagged [seq2seq]
318 questions
2
votes
0 answers
What is the correct way to run inference with a transformer model?
I'm a beginner learning to build a standard transformer model in PyTorch to solve a univariate sequence-to-sequence regression problem. The code is written following the PyTorch tutorial, but it turns out the training/validation error…

Haoran Li
- 21
- 3
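For this first question, the usual stumbling block is that training can use teacher forcing while inference has to be autoregressive: the model is fed its own previous outputs one step at a time. A minimal sketch of such a loop for torch.nn.Transformer, assuming the source has already been projected to d_model and leaving out the final output projection; all names here are illustrative, not the asker's code.
import torch
import torch.nn as nn

@torch.no_grad()
def greedy_decode(model: nn.Transformer, src: torch.Tensor,
                  start_value: float, steps: int) -> torch.Tensor:
    """Autoregressive inference: feed each new prediction back as decoder input.

    src: (src_len, batch, d_model) source sequence, already embedded/projected.
    Returns (steps, batch, d_model) of generated decoder outputs.
    """
    model.eval()
    memory = model.encoder(src)                          # encode the source once
    ys = torch.full((1, src.size(1), src.size(2)), start_value)
    for _ in range(steps):
        # causal mask so position i cannot attend to later positions
        L = ys.size(0)
        tgt_mask = torch.triu(torch.full((L, L), float("-inf")), diagonal=1)
        out = model.decoder(ys, memory, tgt_mask=tgt_mask)
        ys = torch.cat([ys, out[-1:]], dim=0)            # append only the newest step
    return ys[1:]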
2
votes
1 answer
Output directory is empty in Trainer
With my script the model trains correctly and the results are printed, but the results directory is empty. Why is that? What is missing? I think I should have the files described in this answer.
training_args = Seq2SeqTrainingArguments(
…

zest16
- 455
- 3
- 7
- 20
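On the empty-output-directory question: Seq2SeqTrainer only writes files to output_dir when a save is actually triggered, either by the save strategy during training or by an explicit save_model call. A hedged sketch with Hugging Face transformers; the model and datasets are assumed to exist already and are placeholders here.
from transformers import Seq2SeqTrainingArguments, Seq2SeqTrainer

training_args = Seq2SeqTrainingArguments(
    output_dir="./results",        # checkpoints and the final model go here
    num_train_epochs=3,
    save_strategy="epoch",         # write a checkpoint at the end of every epoch
)

trainer = Seq2SeqTrainer(
    model=model,                   # assumed: an already-loaded seq2seq model
    args=training_args,
    train_dataset=train_dataset,   # assumed: tokenized datasets
    eval_dataset=eval_dataset,
)

trainer.train()
trainer.save_model("./results/final")  # explicitly write the model weights and config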
2
votes
1 answer
Simple Transformers producing nothing?
I have a Simple Transformers script that looks like this.
from simpletransformers.seq2seq import Seq2SeqModel, Seq2SeqArgs
args = Seq2SeqArgs()
args.num_train_epoch=5
model = Seq2SeqModel(
"roberta",
"roberta-base",
…

DevDog
- 111
- 2
- 9
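A hedged guess for the snippet above: Seq2SeqArgs expects num_train_epochs (plural); assigning num_train_epoch just creates an unused attribute, so training silently runs with the default number of epochs. A sketch of the corrected setup; the decoder argument is cut off in the excerpt, so roberta-base is only a placeholder here.
from simpletransformers.seq2seq import Seq2SeqModel, Seq2SeqArgs

args = Seq2SeqArgs()
args.num_train_epochs = 5          # plural; num_train_epoch is silently ignored
args.overwrite_output_dir = True

model = Seq2SeqModel(
    "roberta",                     # encoder type
    "roberta-base",                # encoder name
    "roberta-base",                # decoder name (placeholder; elided in the excerpt)
    args=args,
)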
2
votes
0 answers
ValueError: Input 0 of layer lstm_12 is incompatible with the layer: expected ndim=3, found ndim=4
I am working on a seq2seq model, and I want to use the embedding layer described in the Keras blog's bonus FAQ. Here is my code, where num_encoder_tokens is 67 and num_decoder_tokens is 11.
I am getting the error shown in the figure.
Can anyone help me with…

Masoom Raj
- 71
- 5
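On the ndim=3 vs ndim=4 error: with an Embedding layer the model must receive integer token ids of shape (batch, timesteps); feeding one-hot 3D arrays makes the embedded input 4D and the LSTM rejects it. A hedged sketch of the blog-FAQ pattern with the question's vocabulary sizes; latent_dim is an assumed hyperparameter.
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense
from tensorflow.keras.models import Model

num_encoder_tokens = 67
num_decoder_tokens = 11
latent_dim = 256  # assumed

# Both inputs are integer token ids, shape (batch, timesteps) -- not one-hot
encoder_inputs = Input(shape=(None,), dtype="int32")
enc_emb = Embedding(num_encoder_tokens, latent_dim)(encoder_inputs)
_, state_h, state_c = LSTM(latent_dim, return_state=True)(enc_emb)

decoder_inputs = Input(shape=(None,), dtype="int32")
dec_emb = Embedding(num_decoder_tokens, latent_dim)(decoder_inputs)
decoder_seq, _, _ = LSTM(latent_dim, return_sequences=True,
                         return_state=True)(dec_emb, initial_state=[state_h, state_c])
decoder_outputs = Dense(num_decoder_tokens, activation="softmax")(decoder_seq)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy")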
2
votes
0 answers
Adding 'decoder_start_token_id' with SimpleTransformers
I am training MBART in Seq2Seq with SimpleTransformers but getting an error I am not seeing with BART:
TypeError: shift_tokens_right() missing 1 required positional argument: 'decoder_start_token_id'
So far I've tried various combinations…

LeOverflow
- 301
- 1
- 2
- 16
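On the decoder_start_token_id error: this often turns out to be a version mismatch between simpletransformers and transformers around MBART's shift_tokens_right, and MBART (unlike BART) also needs decoder_start_token_id set to the target-language code. I have not verified how Simple Transformers exposes this; with the underlying Hugging Face objects the idea looks roughly like the sketch below, where model.model and the language code are assumptions.
from transformers import MBartTokenizer

tokenizer = MBartTokenizer.from_pretrained("facebook/mbart-large-cc25")

# MBART starts decoding from the target-language token, e.g. Romanian here
start_id = tokenizer.lang_code_to_id["ro_RO"]

# Assumption: Seq2SeqModel keeps the underlying Hugging Face model on `model.model`
model.model.config.decoder_start_token_id = start_id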
2
votes
1 answer
Adding an attention layer to a Keras seq2seq model
I have seen that Keras now comes with an Attention layer. However, I am having some trouble using it in my seq2seq model.
This is the working seq2seq model without attention:
latent_dim = 300
embedding_dim = 200
clear_session()
# Encoder
encoder_inputs =…

BlueMango
- 463
- 7
- 21
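One way to wire in the built-in layer, as a hedged sketch: apply tf.keras.layers.Attention to the decoder outputs (query) and encoder outputs (value), then concatenate the context with the decoder outputs before the final Dense. The latent_dim/embedding_dim values follow the excerpt above; the vocabulary sizes are assumptions.
from tensorflow.keras.layers import (Input, Embedding, LSTM, Dense,
                                     Attention, Concatenate)
from tensorflow.keras.models import Model

latent_dim = 300
embedding_dim = 200
num_encoder_tokens = 5000   # assumed vocabulary sizes
num_decoder_tokens = 5000

# Encoder (return_sequences=True so every timestep is available to attention)
encoder_inputs = Input(shape=(None,))
enc_emb = Embedding(num_encoder_tokens, embedding_dim)(encoder_inputs)
encoder_outputs, state_h, state_c = LSTM(latent_dim, return_sequences=True,
                                         return_state=True)(enc_emb)

# Decoder
decoder_inputs = Input(shape=(None,))
dec_emb = Embedding(num_decoder_tokens, embedding_dim)(decoder_inputs)
decoder_outputs, _, _ = LSTM(latent_dim, return_sequences=True,
                             return_state=True)(dec_emb,
                                                initial_state=[state_h, state_c])

# Attention: query = decoder states, value = encoder states
context = Attention()([decoder_outputs, encoder_outputs])
decoder_concat = Concatenate(axis=-1)([decoder_outputs, context])

outputs = Dense(num_decoder_tokens, activation="softmax")(decoder_concat)
model = Model([encoder_inputs, decoder_inputs], outputs)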
2
votes
1 answer
Are Seq2Seq models used for time series only?
Can we use a seq2seq model with input data that has no temporal relation (not a time series)? For example, I have a list of image regions that I would like to feed to my seq2seq model, and the model should predict a description (output is time…

mousa alsulaimi
- 316
- 1
- 14
2
votes
0 answers
Predict a sequence of tuples using a Transformer model
I am fairly new to seq2seq models and transformers.
Basically, I am working on a sequence generation problem and I want to use the transformer. I am using Python and PyTorch.
I know how the transformer model works for sequence generation, e.g. given…

afsana mimi
- 53
- 1
- 5
2
votes
0 answers
Is there a way to build a closed-domain chatbot using seq2seq, generative modeling, or other methods like RNNs?
Let's say I have a closed-domain chatbot, and its knowledge base is in finance. Thus, of course, I want the chatbot to answer questions that the user might have, like "What is the best way to save money?" or "What are my spending habits like the…

jeff-ridgeway
- 171
- 16
2
votes
1 answer
Average of BLEU scores on two subsets of data is not the same as overall score
For evaluating a sequence generation model, I'm using BLEU-1 through BLEU-4. I split the test set into two sets and calculated the scores on each set separately, as well as on the whole test set. Surprisingly, the result I get from the whole test set is…

forough
- 47
- 1
- 1
- 5
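This behaviour is expected: corpus-level BLEU is not a linear statistic. The modified n-gram precisions and the brevity penalty are computed over the pooled corpus and then combined in a geometric mean, so averaging two subset scores generally differs from the whole-set score. A small sketch with NLTK that makes the discrepancy visible; the sentences are toy data, purely illustrative.
from nltk.translate.bleu_score import corpus_bleu

# Toy data: each hypothesis has a list of reference token lists
refs_a = [[["the", "cat", "sat", "on", "the", "mat"]]]
hyps_a = [["the", "cat", "sat", "on", "mat"]]

refs_b = [[["there", "is", "a", "dog", "in", "the", "garden"]]]
hyps_b = [["there", "is", "a", "dog", "in", "the", "house"]]

weights = (0.25, 0.25, 0.25, 0.25)      # standard BLEU-4
bleu_a = corpus_bleu(refs_a, hyps_a, weights=weights)
bleu_b = corpus_bleu(refs_b, hyps_b, weights=weights)
bleu_all = corpus_bleu(refs_a + refs_b, hyps_a + hyps_b, weights=weights)

# The pooled score is generally NOT the mean of the subset scores
print(bleu_a, bleu_b, (bleu_a + bleu_b) / 2, bleu_all)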
2
votes
2 answers
What is the equivalent of tf.contrib.seq2seq.prepare_attention in TensorFlow 2?
I've recently been working on some code written in TensorFlow 1.0.1 and I want to make it run on TensorFlow 2.
I am not very familiar with seq2seq.
Thank you very much.
(attention_keys,
attention_values,
attention_score_fn,
attention_construct_fn) =…

victor zhao
- 73
- 7
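tf.contrib was removed in TensorFlow 2, and the seq2seq attention helpers moved to TensorFlow Addons. As a hedged sketch, the rough moral equivalent of prepare_attention's (keys, values, score_fn, construct_fn) bundle is to create an attention mechanism over the encoder outputs and wrap the decoder cell with it; this does not reproduce the old contrib behaviour exactly, and the shapes below are illustrative.
import tensorflow as tf
import tensorflow_addons as tfa

units = 128
batch_size = 4
max_src_len = 10

# encoder_outputs stands in for the TF1 attention "keys/values": (batch, src_len, units)
encoder_outputs = tf.random.normal((batch_size, max_src_len, units))

# The score function (Bahdanau or Luong) replaces attention_score_fn
attention_mechanism = tfa.seq2seq.BahdanauAttention(
    units=units,
    memory=encoder_outputs,
    memory_sequence_length=tf.fill([batch_size], max_src_len),
)

# AttentionWrapper replaces attention_construct_fn: it mixes attention into the cell
decoder_cell = tfa.seq2seq.AttentionWrapper(
    tf.keras.layers.LSTMCell(units),
    attention_mechanism,
    attention_layer_size=units,
)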
2
votes
1 answer
Difference between testing and inference in seq2seq models
I am studying the transformer model with this tutorial and I am confused about the difference between evaluation and inference. In my understanding, evaluation happens after the model is trained, by only giving it the source and asking it to predict the target one…

Natalia
- 153
- 1
- 8
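The distinction usually comes down to what the decoder is allowed to see: during teacher-forced evaluation the gold target (shifted by one) is fed as the decoder input and only loss/metrics are computed, while at inference the model must feed its own predictions back one token at a time. A schematic sketch of the two loops, assuming a hypothetical model callable as model(src, tgt) that returns per-position vocabulary logits with time as the first dimension and batch size 1.
import torch

@torch.no_grad()
def evaluate(model, src, tgt, criterion):
    """Teacher-forced evaluation: the decoder sees the gold prefix."""
    logits = model(src, tgt[:-1])
    return criterion(logits.reshape(-1, logits.size(-1)), tgt[1:].reshape(-1))

@torch.no_grad()
def infer(model, src, bos_id, eos_id, max_len=50):
    """Inference: the decoder only ever sees its own previous predictions."""
    ys = torch.tensor([[bos_id]])                     # (tgt_len=1, batch=1)
    for _ in range(max_len):
        logits = model(src, ys)
        next_id = logits[-1].argmax(dim=-1, keepdim=True)
        ys = torch.cat([ys, next_id], dim=0)
        if next_id.item() == eos_id:
            break
    return ys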
2
votes
1 answer
How to use the TimeDistributed layer for predicting sequences of dynamic length? Python 3
So I am trying to build an LSTM-based autoencoder, which I want to use for time series data. These are split up into sequences of different lengths. The input to the model thus has shape [None, None, n_features], where the first None stands for…

pikachu
- 690
- 1
- 6
- 17
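For variable-length sequences, RepeatVector (which needs a fixed length) is usually the obstacle; one hedged workaround is to keep return_sequences=True throughout and let TimeDistributed(Dense(...)) map every timestep, since a None timestep dimension is allowed. This sidesteps the fixed-length bottleneck rather than reproducing it; n_features and the latent size below are assumed values.
from tensorflow.keras.layers import Input, LSTM, TimeDistributed, Dense
from tensorflow.keras.models import Model

n_features = 8      # assumed
latent_dim = 64     # assumed

inputs = Input(shape=(None, n_features))               # None = dynamic sequence length
encoded = LSTM(latent_dim, return_sequences=True)(inputs)
decoded = LSTM(latent_dim, return_sequences=True)(encoded)
outputs = TimeDistributed(Dense(n_features))(decoded)  # one prediction per timestep

autoencoder = Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")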
2
votes
1 answer
BERT embeddings for abstractive text summarisation in Keras using encoder-decoder model
I am working on a text summarization task using an encoder-decoder architecture in Keras. I would like to test the model's performance using different word embeddings such as GloVe and BERT. I already tested it with GloVe embeddings but could not…

skaistt
- 95
- 5
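One hedged way to try this is to use a frozen pretrained BERT encoder from Hugging Face's TF classes to produce contextual embeddings and feed its last hidden state into a plain Keras decoder. The model name, decoder wiring, and vocabulary/width numbers below are illustrative, not a recommended architecture.
import tensorflow as tf
from transformers import TFAutoModel

bert = TFAutoModel.from_pretrained("bert-base-uncased")
bert.trainable = False                          # use BERT purely as an embedding provider

input_ids = tf.keras.Input(shape=(None,), dtype=tf.int32, name="input_ids")
attention_mask = tf.keras.Input(shape=(None,), dtype=tf.int32, name="attention_mask")

# (batch, src_len, 768) contextual embeddings replacing the GloVe lookup
encoder_outputs = bert(input_ids, attention_mask=attention_mask).last_hidden_state
enc_proj = tf.keras.layers.Dense(256)(encoder_outputs)   # match the decoder width

# Decoder: plain LSTM over summary token ids (vocabulary size assumed)
decoder_inputs = tf.keras.Input(shape=(None,), dtype=tf.int32)
dec_emb = tf.keras.layers.Embedding(30000, 256)(decoder_inputs)
decoder_seq = tf.keras.layers.LSTM(256, return_sequences=True)(dec_emb)

context = tf.keras.layers.Attention()([decoder_seq, enc_proj])
merged = tf.keras.layers.Concatenate()([decoder_seq, context])
outputs = tf.keras.layers.Dense(30000, activation="softmax")(merged)

model = tf.keras.Model([input_ids, attention_mask, decoder_inputs], outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")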
2
votes
1 answer
Input 0 of layer lstm_35 is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 1966, 7059, 256]
I am creating a seq2seq model with word-level embeddings for text summarisation and I am facing a data shape issue. Please help, thanks.
encoder_input=Input(shape=(max_encoder_seq_length,))
…

Hammad Asif
- 33
- 7
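A hedged reading of the reported shape: 1966 looks like the sequence length, 7059 like the vocabulary size, and 256 like the embedding width, which suggests one-hot encoded 3D arrays are being fed into a model whose Embedding layer adds a fourth dimension. The fix is usually on the data side: with an Embedding layer the encoder expects integer token ids of shape (batch, max_encoder_seq_length), for example:
import numpy as np

max_encoder_seq_length = 1966
num_encoder_tokens = 7059   # vocabulary size

# WRONG for an Embedding-based encoder: one-hot data of shape (batch, 1966, 7059)
one_hot = np.zeros((2, max_encoder_seq_length, num_encoder_tokens), dtype="float32")

# RIGHT: integer token ids of shape (batch, 1966); the Embedding layer adds the 256 dim
encoder_input_data = np.zeros((2, max_encoder_seq_length), dtype="int32")
encoder_input_data[0, :3] = [5, 42, 7]   # illustrative token ids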