
I have a bunch of sensors and I really just want to reconstruct the input.

So what I want is this:

  1. after I have trained my model, I will pass in my feature matrix
  2. get the reconstructed feature matrix back
  3. investigate which sensor values are completely different from the reconstructed values (roughly as sketched below)
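
In code, the comparison in step 3 would be roughly this (just to illustrate what I want; it assumes I already have the reconstructed matrix and uses nd4j's Transforms.abs, which I believe is available):

// original features vs. whatever the model reconstructs for them:
// large entries in the absolute difference mark the sensors that deviate most
static void printDeviation(INDArray original, INDArray reconstructed) {
    INDArray deviation = Transforms.abs(original.sub(reconstructed));
    System.out.println(deviation);
}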

Therefore I thought an RBM would be the right choice, and since I am used to Java, I have tried to use deeplearning4j. But I got stuck very early: when I run the code below, I face two problems.

  1. The result is far from a correct reconstruction; most of the rows are simply [1.00, 1.00, 1.00].

  2. I would expect to get back 4 values per sample (the number of inputs that should be reconstructed), not 3.

So what do I have to tune to a) get a better result and b) get the reconstructed inputs back?

public static void main(String[] args) {
    // Customizing params
    Nd4j.MAX_SLICES_TO_PRINT = -1;
    Nd4j.MAX_ELEMENTS_PER_SLICE = -1;
    Nd4j.ENFORCE_NUMERICAL_STABILITY = true;
    final int numRows = 4;
    final int numColumns = 1;
    int outputNum = 3;
    int numSamples = 150;
    int batchSize = 150;
    int iterations = 100;
    int seed = 123;
    int listenerFreq = iterations/5;

    DataSetIterator iter = new IrisDataSetIterator(batchSize, numSamples);

    // Loads data into generator and format consumable for NN
    DataSet iris = iter.next();
    iris.normalize();
    //iris.scale();
    System.out.println(iris.getFeatureMatrix());

    NeuralNetConfiguration conf = new NeuralNetConfiguration.Builder()
            // Single RBM layer that should reconstruct the 4 input features
            .layer(new RBM.Builder()
                    .nIn(numRows * numColumns) // Input nodes
                    .nOut(outputNum) // Output nodes
                    .activation("tanh") // Activation function type
                    .weightInit(WeightInit.XAVIER) // Weight initialization
                    .lossFunction(LossFunctions.LossFunction.XENT)
                    .updater(Updater.NESTEROVS)
                    .build())
            .seed(seed) // Locks in weight initialization for tuning
            .iterations(iterations)
            .learningRate(1e-1f) // Step size for the gradient updates
            .momentum(0.5) // Momentum coefficient used by the Nesterovs updater
            .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT) // How the gradients are applied
            .build();

    Layer model = LayerFactories.getFactory(conf.getLayer()).create(conf);
    model.setListeners(Arrays.asList((IterationListener) new ScoreIterationListener(listenerFreq)));

    model.fit(iris.getFeatureMatrix());
    System.out.println(model.activate(iris.getFeatureMatrix(), false));
}
  • please join us on Gitter where the community is active. We can help you there: https://gitter.im/deeplearning4j/deeplearning4j – racknuf Oct 11 '15 at 00:04

1 Answer


For b): when you call activate(), you get a list of "nLayers" arrays, one per layer. Each array has one row per input vector and one column per neuron in that layer, so each entry is the activation of one neuron for one observation. Once all layers have been activated with some input, you can get the reconstruction with the RBM.propDown() method.
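
With the single RBM layer built in the question, that looks roughly like this (a sketch only; note that the cast is to the layer implementation class org.deeplearning4j.nn.layers.feedforward.rbm.RBM, not the RBM.Builder configuration class used in the question, and the exact method names may differ between versions):

// hidden activations: one row per sample, one column per hidden unit
INDArray hidden = model.activate(iris.getFeatureMatrix(), false);
// propagate the hidden activations back down to the visible layer:
// one row per sample, one column per input (4 for Iris)
INDArray reconstructed = ((RBM) model).propDown(hidden);
System.out.println(reconstructed);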

As for a), I'm afraid it is very tricky to train an RBM correctly. So you really want to play with every parameter and, more importantly, monitor various metrics during training that give you a hint about whether training is going well. Personally, I like to plot:

  • The score() on the training corpus, i.e. the reconstruction error after every gradient update; check that it decreases (see the small listener sketch after this list).
  • The score() on a separate development corpus, which is useful to be warned when overfitting occurs.
  • The norm of the parameter vector, since it has a large impact on the score.
  • Both activation maps (an XY rectangular plot of the activated neurons of one layer over the corpus), just after initialization and again after N steps: this helps to detect unreliable training (e.g. when everything is black/white, or when a large fraction of the neurons are never activated).
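
For the first point, here is a minimal sketch of a listener that records the training score after every update (it reuses the IterationListener mechanism already present in the question's code; the class name ScoreRecorder is just illustrative, and the exact interface may vary with the dl4j version):

// Records model.score() (the reconstruction error on the current batch) after each
// gradient update, so the training curve can be plotted afterwards.
class ScoreRecorder implements IterationListener {
    private final List<Double> scores = new ArrayList<>();
    private boolean invoked = false;

    public boolean invoked() { return invoked; }
    public void invoke() { invoked = true; }

    public void iterationDone(Model model, int iteration) {
        invoke();
        scores.add(model.score()); // should decrease if training behaves
    }

    public List<Double> getScores() { return scores; }
}

Register it next to the existing listener, e.g. model.setListeners(Arrays.asList((IterationListener) new ScoreIterationListener(listenerFreq), new ScoreRecorder())), and plot getScores() once fit() has finished.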
xtof54