I have just started using the CRF layer provided by the keras-contrib library for an NER (named entity recognition) task.
The problem I faced is that when training the model with default parameters, the loss becomes nan by the end of the first epoch and never changes afterwards.
What helped me was changing the learn_mode parameter of the CRF layer to 'marginal'.
Could anyone please explain the difference between the 'join' and 'marginal' learn_mode? Why does 'join' mode lead to a nan loss in my case (an NER problem), while 'marginal' works?
from keras.layers import (Input, Embedding, LSTM, Bidirectional, TimeDistributed,
                          Dense, SpatialDropout1D, concatenate)
from keras_contrib.layers import CRF

# input and embedding for words
word_in = Input(shape=(max_len_doc,))
emb_word = Embedding(input_dim=n_words + 2, output_dim=50,
                     input_length=max_len_doc, mask_zero=True)(word_in)

# input and embeddings for characters
char_in = Input(shape=(max_len_doc, max_len_word,))
emb_char = TimeDistributed(Embedding(input_dim=n_chars + 2, output_dim=10,
                                     input_length=max_len_word, mask_zero=True))(char_in)

# character LSTM to get word encodings by characters
char_enc = TimeDistributed(LSTM(units=50, return_sequences=False,
                                recurrent_dropout=0.5))(emb_char)

# main BiLSTM stack over the concatenated word + character representations
model_crf = concatenate([emb_word, char_enc])
model_crf = SpatialDropout1D(0.3)(model_crf)
model_crf = Bidirectional(LSTM(units=128, return_sequences=True, recurrent_dropout=0.6))(model_crf)
model_crf = Bidirectional(LSTM(units=128, return_sequences=True, recurrent_dropout=0.3))(model_crf)

# per-timestep projection to tag space, followed by the CRF layer
model_crf = TimeDistributed(Dense(n_tags, activation="relu"))(model_crf)
crf = CRF(n_tags)  # crf = CRF(n_tags, learn_mode='marginal') avoids the nan loss
out = crf(model_crf)
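
Below is a minimal sketch of how the model is then built, compiled and trained, assuming the standard keras-contrib pattern of taking the loss and metric from the CRF layer itself (crf.loss_function / crf.accuracy); the optimizer, batch size and training-array names are placeholders, not necessarily the exact setup that triggers the problem:

from keras.models import Model

model = Model(inputs=[word_in, char_in], outputs=out)

# crf.loss_function dispatches on learn_mode: the CRF negative log-likelihood
# for 'join', categorical crossentropy over the marginal probabilities for 'marginal'
model.compile(optimizer="rmsprop", loss=crf.loss_function, metrics=[crf.accuracy])

# X_word_tr, X_char_tr, y_tr are placeholder names for the padded training arrays
model.fit([X_word_tr, X_char_tr], y_tr, batch_size=32, epochs=5, validation_split=0.1)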