
Is there a way to pass extra feature tokens along with the existing word token (the training features / source-file vocabulary) and feed them to the encoder RNN of a seq2seq model? Currently, it accepts only one word token from the sentence at a time.

Let me put this more concretely. Consider the example of machine translation/NMT: say I have two more feature columns for the corresponding source vocabulary set (Feature1 here). For example:

+---------+----------+----------+
|Feature1 | Feature2 | Feature3 | 
+---------+----------+----------+
|word1    |    x     |     a    |
|word2    |    y     |     b    |
|word3    |    y     |     c    |
|.        |          |          |
|.        |          |          |
+---------+----------+----------+

To summarise: currently the seq2seq dataset is a parallel data corpus with a one-to-one mapping between the source feature (the vocabulary, i.e. Feature1 alone) and the target (label/vocabulary). I'm looking for a way to map more than one feature (i.e. Feature1, Feature2, Feature3) to the target (label/vocabulary).
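Roughly, what I am imagining is giving each feature column its own embedding and concatenating the embeddings per time step before the encoder RNN. A minimal PyTorch sketch of that idea (the vocabulary sizes, embedding dimensions, and variable names here are placeholders, not from any actual setup):

```python
import torch
import torch.nn as nn

# Placeholder vocabulary sizes and dimensions (hypothetical values).
WORD_VOCAB, FEAT2_VOCAB, FEAT3_VOCAB = 1000, 50, 50
WORD_DIM, FEAT_DIM, HIDDEN = 128, 16, 256

# One embedding table per feature column.
word_emb = nn.Embedding(WORD_VOCAB, WORD_DIM)
feat2_emb = nn.Embedding(FEAT2_VOCAB, FEAT_DIM)
feat3_emb = nn.Embedding(FEAT3_VOCAB, FEAT_DIM)

# The encoder's input size is the sum of the embedding sizes.
encoder = nn.GRU(input_size=WORD_DIM + 2 * FEAT_DIM,
                 hidden_size=HIDDEN, batch_first=True)

# A batch of 4 sentences, 7 tokens each: one index per feature column
# per time step (random indices here, just to show the shapes).
words = torch.randint(0, WORD_VOCAB, (4, 7))
feat2 = torch.randint(0, FEAT2_VOCAB, (4, 7))
feat3 = torch.randint(0, FEAT3_VOCAB, (4, 7))

# Concatenate the three embeddings along the feature dimension,
# so each time step carries word + Feature2 + Feature3 information.
inputs = torch.cat([word_emb(words), feat2_emb(feat2), feat3_emb(feat3)],
                   dim=-1)                      # (4, 7, 160)
outputs, hidden = encoder(inputs)
print(outputs.shape)  # torch.Size([4, 7, 256])
```

The decoder side would stay unchanged, since only the encoder's input representation is widened.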

Moreover, I believe this is glossed over in the seq2seq PyTorch tutorial (https://github.com/spro/practical-pytorch/blob/master/seq2seq-translation/seq2seq-translation.ipynb), as quoted below:

When using a single RNN, there is a one-to-one relationship between inputs and outputs. We would quickly run into problems with different sequence orders and lengths that are common during translation… With the seq2seq model, by encoding many inputs into one vector, and decoding from one vector into many outputs, we are freed from the constraints of sequence order and length. The encoded sequence is represented by a single vector, a single point in some N dimensional space of sequences. In an ideal case, this point can be considered the "meaning" of the sequence.

Furthermore, I tried TensorFlow; it took me a lot of time to debug and make the appropriate changes, and I got nowhere. I heard from my colleagues that PyTorch would have the flexibility to do this and would be worth checking out.

Please share your thoughts on how to achieve this in TensorFlow or PyTorch. It would be great if anyone could explain how to practically implement this. Thanks in advance.

  • I'm not sure I understand your question yet. For your words, are you using dense word_embeddings? If so, can you not just append your additional features as extra dimensions to the existing word embeddings? If you're using one_hot embeddings, the same approach should work too in principle. Even though the computation may be a little tedious, depending on how you create and store your one-hot embeddings. – mbpaulus Jul 23 '17 at 16:27
  • Can't you simply concat the features before feeding them to the encoder? – timbmg Jul 24 '17 at 14:59
  • What are your additional features? – finbarr Jul 24 '17 at 21:44
  • @FinbarrTimbers Categorical word features from a different vocab set of size <50 – siv Jul 25 '17 at 07:42
  • @siv did you find a solution to this? I am facing exactly the same issue. – MrfksIV Jul 13 '18 at 19:19
  • @MrfksIV I wrote a blog post on this here--https://iamsiva11.github.io/extra-features-seq2seq/ – siv Jul 16 '18 at 04:30
  • Hey, what if I had multiple outputs for a single input? Ex: Lets say for every english sentence I have 3 possible french translations for better generalization. How would I train this? Will passing in 3 different translations at each timestep in the decoder work? or should I merge all these 3 sentences into a single sentence and iterate? This is literally the one2many seq2seq problem – karthikeyan Oct 08 '19 at 13:13

0 Answers