
Based on this article, I wrote this model:

from keras.layers import Input, LSTM, Dense
from keras.models import Model

enc_in = Input(shape=(None, in_alphabet_len))
lstm = LSTM(lstm_dim, return_sequences=True, return_state=True, use_bias=False)
enc_out, h, c = lstm(enc_in)                      # encoder outputs and final states

dec_in = Input(shape=(None, in_alphabet_len))
decoder, _, _ = LSTM(decoder_dim, return_sequences=True, return_state=True)(
    dec_in, initial_state=[h, c])                 # decoder initialised with encoder states
decoder = Dense(units=in_alphabet_len, activation='softmax')(decoder)

model = Model([enc_in, dec_in], decoder)

How can I add an attention layer to this model, before the decoder?

wonea
Osm

1 Answer


You can use the keras-self-attention repo:

  1. Install the package: pip install keras-self-attention
  2. Import the layer: from keras_self_attention import SeqSelfAttention
    • If you want to use tf.keras rather than keras, set os.environ['TF_KERAS'] = '1' before the import.
    • If you are using plain keras, omit that flag, as setting it will cause inconsistencies.
  3. Since you are using the Keras functional API, apply the attention layer to the encoder outputs before they reach the decoder (a fuller sketch follows after this list):

    enc_out, h, c = lstm(enc_in)                    # lstm is the encoder LSTM layer from your model
    att = SeqSelfAttention()(enc_out)               # self-attention over the encoder outputs
    dec_in = Input(shape=(None, in_alphabet_len))   # the decoder input stays a plain Input, not a call on att
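
To make that concrete, here is a minimal end-to-end sketch. Assumptions: tf.keras is used, the sizes are made up, attention_activation='sigmoid' follows the keras-self-attention README, and the second LSTM that turns the attended encoder outputs into the decoder's initial states is just one possible wiring, not something the repo prescribes.

    import os
    os.environ['TF_KERAS'] = '1'   # needed because we import from tensorflow.keras below; omit for plain keras

    from tensorflow.keras.layers import Input, LSTM, Dense
    from tensorflow.keras.models import Model
    from keras_self_attention import SeqSelfAttention

    in_alphabet_len, lstm_dim = 50, 128            # hypothetical sizes, for illustration only

    enc_in = Input(shape=(None, in_alphabet_len))
    enc_out = LSTM(lstm_dim, return_sequences=True)(enc_in)

    # self-attention re-weights the encoder outputs
    att = SeqSelfAttention(attention_activation='sigmoid')(enc_out)

    # a second LSTM summarises the attended sequence into states for the decoder
    _, h, c = LSTM(lstm_dim, return_sequences=True, return_state=True)(att)

    dec_in = Input(shape=(None, in_alphabet_len))
    dec_out, _, _ = LSTM(lstm_dim, return_sequences=True, return_state=True)(
        dec_in, initial_state=[h, c])              # decoder units must match the state size
    out = Dense(in_alphabet_len, activation='softmax')(dec_out)

    model = Model([enc_in, dec_in], out)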
    

    I hope this answers your question and helps future readers.

ElSheikh
  • Well, this is self-attention. However, for seq2seq you will normally want attention between the encoder and decoder states. – lwi May 07 '20 at 12:23
  • So, what do you suggest? – ElSheikh May 07 '20 at 14:19
  • 1
    I don't know if there is a Keras wrapper for Bahdanau or Luong attention, yet there is a neat TensorFlow 2.0 tutorial for seq2seq translation with attention. https://www.tensorflow.org/tutorials/text/nmt_with_attention – lwi May 08 '20 at 08:08
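
Following lwi's suggestion, here is a minimal sketch of Luong-style (dot-product) attention between decoder and encoder states, using the built-in tf.keras.layers.Attention layer rather than the tutorial's custom Bahdanau layer; the names and sizes are assumptions:

    from tensorflow.keras.layers import Input, LSTM, Dense, Attention, Concatenate
    from tensorflow.keras.models import Model

    in_alphabet_len, lstm_dim = 50, 128            # hypothetical sizes, for illustration only

    enc_in = Input(shape=(None, in_alphabet_len))
    enc_out, h, c = LSTM(lstm_dim, return_sequences=True, return_state=True)(enc_in)

    dec_in = Input(shape=(None, in_alphabet_len))
    dec_out, _, _ = LSTM(lstm_dim, return_sequences=True, return_state=True)(
        dec_in, initial_state=[h, c])              # decoder units must match encoder units here

    # dot-product attention: query = decoder states, value (and key) = encoder states
    context = Attention()([dec_out, enc_out])      # shape: (batch, dec_steps, lstm_dim)

    # combine each decoder state with its attention context before the softmax
    combined = Concatenate()([dec_out, context])
    out = Dense(in_alphabet_len, activation='softmax')(combined)

    model = Model([enc_in, dec_in], out)

tf.keras.layers.AdditiveAttention is a drop-in alternative if you prefer Bahdanau-style (additive) scoring.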