
Why does a GravesLSTM layer cell have 11 weights, and what is their purpose?

The example below generates the weight list:

import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.GravesLSTM;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.nn.weights.WeightInit;

MultiLayerNetwork model = new MultiLayerNetwork(new NeuralNetConfiguration.Builder()
        .list()
        .layer(0, new GravesLSTM.Builder()
                .nIn(1)                      // one input unit
                .nOut(1)                     // one LSTM cell
                .activation("sigmoid")
                .weightInit(WeightInit.ZERO) // zero-initialise so the printout is readable
                .build())
        .build());
model.init();

System.out.println("Weights: " + model.paramTable());

Output:

Weights: {0_W=[0.00, 0.00, 0.00, 0.00], 0_RW=[0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00], 0_b=[0.00, 1.00, 0.00, 0.00]}

(11 weights + 4 biases)

In contrast, here is the output when using a DenseLayer instead of GravesLSTM:

Weights: {0_W=0.00, 0_b=0.00}

(1 weight + 1 bias; this is clear.)

Daniel Hári
  • Please join the DL4J community on Gitter. A lot of people there can answer your question: https://gitter.im/deeplearning4j/deeplearning4j – racknuf Oct 31 '16 at 03:29
  • We answered this in the community already. The extra weights are related to the extra gates an LSTM has (eg: the forget gate) – Adam Gibson Oct 31 '16 at 05:31
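For reference, here is a sketch of how those counts appear to break down. The shapes are inferred from the paramTable() output above and from the Graves-style LSTM formulation with peephole connections; the class and variable names below are illustrative, not part of DL4J:

// Parameter-count sketch for a GravesLSTM cell (shapes inferred from
// the paramTable() output above; names here are hypothetical):
//   0_W  : input weights,     nIn  x (4 * nOut)     -> 1 x 4 = 4
//   0_RW : recurrent weights, nOut x (4 * nOut + 3) -> 1 x 7 = 7
//          (4 gate/cell recurrent connections per unit, plus 3 peephole
//           weights for the input, forget, and output gates)
//   0_b  : biases,            4 * nOut              -> 4
public class GravesLstmParamCount {
    public static void main(String[] args) {
        int nIn = 1, nOut = 1;
        int inputWeights     = nIn * 4 * nOut;        // 4
        int recurrentWeights = nOut * (4 * nOut + 3); // 7, incl. 3 peepholes
        int biases           = 4 * nOut;              // 4
        System.out.println("weights = " + (inputWeights + recurrentWeights)
                + ", biases = " + biases);            // weights = 11, biases = 4
    }
}

With nIn = 1 and nOut = 1 this gives 4 + 7 = 11 weights and 4 biases, matching the output above. The single 1.00 in 0_b is presumably the forget-gate bias, which DL4J initialises to 1.0 by default so the cell does not forget everything early in training.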

0 Answers