The code below comes from https://deeplearning4j.org. I don't quite understand the nIn and nOut parameters. Does the definition below create two layers, or three layers with one hidden layer of 1000 neurons? And what would happen if the nOut of layer 0 did not match the nIn of layer 1? Do these two always have to be the same number (in this case 1000)?
.layer(0, new DenseLayer.Builder()
        .nIn(numRows * numColumns) // Number of input datapoints.
        .nOut(1000) // Number of output datapoints.
        .activation("relu") // Activation function.
        .weightInit(WeightInit.XAVIER) // Weight initialization.
        .build())
.layer(1, new OutputLayer.Builder(LossFunction.NEGATIVELOGLIKELIHOOD)
        .nIn(1000)
        .nOut(outputNum)
        .activation("softmax")
        .weightInit(WeightInit.XAVIER)
        .build())
.pretrain(false).backprop(true)
.build();
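
To make the nIn/nOut question concrete, here is a minimal plain-Java sketch of the shape arithmetic behind two stacked fully connected layers. It does not use Deeplearning4j at all: the class `LayerShapes`, the helper `dense`, and the values 28 * 28 and 10 are assumptions chosen for illustration, and bias and activation are left out. It only shows that the output of one layer is the input of the next, so in this sketch the two sizes have to agree; how Deeplearning4j itself reports a mismatch is not shown here.

```java
public class LayerShapes {

    // One dense layer without bias or activation: out[j] = sum_i in[i] * w[i][j].
    // The weight matrix w has nIn rows and nOut columns.
    public static double[] dense(double[] in, double[][] w) {
        int nIn = w.length;
        int nOut = w[0].length;
        if (in.length != nIn) {
            throw new IllegalArgumentException(
                "layer expects " + nIn + " inputs but got " + in.length);
        }
        double[] out = new double[nOut];
        for (int i = 0; i < nIn; i++) {
            for (int j = 0; j < nOut; j++) {
                out[j] += in[i] * w[i][j];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int numInputs = 28 * 28; // assumed value of numRows * numColumns
        int hidden = 1000;       // nOut of layer 0 and nIn of layer 1
        int outputNum = 10;      // assumed number of classes

        double[] input = new double[numInputs];
        double[][] w0 = new double[numInputs][hidden]; // weights of layer 0
        double[][] w1 = new double[hidden][outputNum]; // weights of layer 1

        double[] h = dense(input, w0); // 1000 hidden values
        double[] y = dense(h, w1);     // outputNum values
        System.out.println(h.length + " -> " + y.length);

        // Second layer deliberately built with 500 inputs instead of 1000.
        double[][] bad = new double[500][outputNum];
        try {
            dense(h, bad);
        } catch (IllegalArgumentException e) {
            System.out.println("mismatch: " + e.getMessage());
        }
    }
}
```

Read this way, the snippet in the question describes an input of size numRows * numColumns, one hidden layer of 1000 units (layer 0), and an output layer of outputNum units (layer 1). Whether that counts as two or three layers depends on whether the input is counted as a layer.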