Hi everyone, I'm trying to solve the TIMIT task by applying CNN + Dense + CTC.

So basically here is my model:

1) Some Conv2D layers.

2) Transformation of shape

3) Dense

4) CTC

The transformation is:

After the CNNs I get an output of shape (Batch_size, number_of_feature_maps, 41, sequence_length), where 41 is the number of Mel filter bank / energy features.

I turn it into (Batch_size, sequence_length, 41*number_of_feature_maps) to get a 3-D tensor.

Notice that sequence_length is None, since it varies for each mini-batch, so we have something like (None, None, X).
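For readers who want to see the shape transformation concretely, here is a minimal NumPy sketch of it. The sizes (2 utterances, 3 feature maps, 7 frames) are made up for illustration, and this is not the original code from the question, which was posted as an image. In Keras the same two steps are typically done with a Permute layer followed by a Reshape layer, but that is an assumption about how it was implemented.

```python
import numpy as np

# Hypothetical sizes, for illustration only.
batch, n_maps, n_mel, seq_len = 2, 3, 41, 7

# Stand-in for the CNN output: (batch, feature_maps, 41, sequence_length).
cnn_out = np.arange(batch * n_maps * n_mel * seq_len, dtype=np.float32)
cnn_out = cnn_out.reshape(batch, n_maps, n_mel, seq_len)

# Move the time axis next to the batch axis first, then merge the last two axes.
# A bare reshape without the transpose would mix values from different frames.
time_major = np.transpose(cnn_out, (0, 3, 1, 2))  # (batch, seq_len, n_maps, 41)
features = time_major.reshape(batch, seq_len, n_maps * n_mel)

print(features.shape)  # (2, 7, 123)
```

Each row `features[b, t]` then holds all feature-map values for frame `t` of utterance `b`, which is the layout a per-frame Dense layer and CTC expect.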

I then tried two things: a Dense layer wrapped in TimeDistributed, and a plain Dense layer applied directly to the 3-D tensor.

I basically don't understand the behavior of these two methods. The first one, with TimeDistributed, just works: the loss and Phoneme Error Rate decrease. The problem is that the second works too! What does the Dense layer do on (None, None, X) tensors?

Thanks!

1 Answer

Check out: Keras LSTM dense layer multidimensional input

In Keras < 2.0, you need the TimeDistributed wrapper to apply a Dense layer to each timestep of a sequence. In Keras >= 2.0, a Dense layer is applied along the last axis by default, so each timestep is transformed independently with the same weights.
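To make the equivalence concrete, here is a small NumPy sketch (not Keras code) of the two computations. It assumes Dense on a 3-D input contracts the last axis with a single kernel, and that TimeDistributed(Dense) applies the same 2-D Dense to every frame. The sizes and the names `kernel` and `bias` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 2 utterances, 7 frames, 123 features per frame, 62 output units.
batch, seq_len, in_dim, units = 2, 7, 123, 62
x = rng.standard_normal((batch, seq_len, in_dim))
kernel = rng.standard_normal((in_dim, units))  # one weight matrix, shared by all frames
bias = rng.standard_normal(units)

# Dense on a 3-D input: contract the last axis with the kernel.
dense_out = x @ kernel + bias  # (batch, seq_len, units)

# TimeDistributed(Dense): apply the same 2-D Dense to each frame in turn.
td_out = np.stack(
    [x[:, t, :] @ kernel + bias for t in range(seq_len)], axis=1
)

print(dense_out.shape)                 # (2, 7, 62)
print(np.allclose(dense_out, td_out))  # True
```

Because the kernel only touches the last axis, the sequence length never enters the weight shapes, which is why the layer can be built even when that dimension is None.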

BGraf
OK, so basically with Keras >= 2.0 a Dense layer on an (X, Y, Z) input behaves the same as a TimeDistributed(Dense) on (X, Y, Z)? Weird, haha, thanks. – Titouan Parcollet Mar 11 '18 at 21:23