
My problem is that I am training a feed-forward NN to predict how long a train takes to travel from one station to another. It has two hidden layers (128 and 64) and uses TANH activations. The first thing that doesn't make sense to me is why my model predicts better on the validation dataset than on the training dataset. And at one point, the loss starts oscillating.

[Plot: RMSE per epoch for the training and validation datasets]

I checked my data and the rows are different, no duplicates. Maybe it is because the data are very similar, e.g. same routes, same train types, and that's the reason for this behavior?

I am using DL4J, and the validation dataset is 10% of the training set. My dataset contains more than 130 000 rows (for this particular example).
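For context, the 90/10 split is done roughly like this (a simplified sketch, not my exact loading code; loadAllRows() and batchSize are placeholders):

import java.util.List;

import org.deeplearning4j.datasets.iterator.impl.ListDataSetIterator;
import org.nd4j.linalg.dataset.DataSet;
import org.nd4j.linalg.dataset.SplitTestAndTrain;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;

// Load all ~130 000 rows as a single DataSet (loading code omitted; loadAllRows() is a placeholder).
DataSet allData = loadAllRows();
allData.shuffle();

// 90% training, 10% validation.
SplitTestAndTrain split = allData.splitTestAndTrain(0.9);
DataSet trainData = split.getTrain();
DataSet validationData = split.getTest();

// Wrap each part in an iterator that serves mini-batches of batchSize examples.
List<DataSet> trainList = trainData.asList();
List<DataSet> validationList = validationData.asList();
DataSetIterator trainingSetIterator = new ListDataSetIterator<>(trainList, batchSize);
DataSetIterator validationIterator = new ListDataSetIterator<>(validationList, batchSize);

Edit: Here are the values I am plotting: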

for (int i = 0; i < nEpochs; i++) {
    trainingSetIterator.reset();
    model.fit(trainingSetIterator);

    // Evaluate RMSE once per epoch on each set and reuse the values
    // for both the console output and the plot arrays.
    double validationRmse = model.evaluateRegression(validationIterator).averagerootMeanSquaredError();
    double trainRmse = model.evaluateRegression(trainingSetIterator).averagerootMeanSquaredError();

    System.out.println(i + ": " + validationRmse + " || " + trainRmse);

    validationValues[i] = validationRmse;
    trainValues[i] = trainRmse;
}
PlotRMSE plot = new PlotRMSE(trainValues, validationValues);

And here is my configuration for the neural net:

MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .seed(seed)
        .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT) // added
        .weightInit(WeightInit.XAVIER)
        .dropOut(0.6)
        .updater(new Adam(learningRate))
        .list()
        .layer(new DenseLayer.Builder().nIn(numInputs).nOut(numHiddenNodes1)
                .activation(Activation.TANH).build())
        .layer(new DenseLayer.Builder().nIn(numHiddenNodes1).nOut(numHiddenNodes2)
                .activation(Activation.TANH).build())
        .layer(new DenseLayer.Builder().nIn(numHiddenNodes2).nOut(numHiddenNodes2)
                .activation(Activation.TANH).build())
        .layer(new OutputLayer.Builder(LossFunctions.LossFunction.MSE)
                .activation(Activation.IDENTITY)
                .nIn(numHiddenNodes2).nOut(numOutputs).build())
        .backpropType(BackpropType.Standard)
        .build();
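
For completeness, the model used in the training loop above is built from this configuration in the usual DL4J way (a minimal sketch; the ScoreIterationListener is only for monitoring and is not part of my original code):

MultiLayerNetwork model = new MultiLayerNetwork(conf);
model.init();
// Print the minibatch score every 100 iterations to watch for the oscillation.
model.setListeners(new ScoreIterationListener(100));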
  • Are you sure you are comparing the right things? It looks like even without training, your test set performs better than the training set. – Paul Dubs Apr 30 '20 at 14:25
  • I think so. It is RMSE after each epoch for validation and training dataset. – Matej Cajka Apr 30 '20 at 14:57
  • Can you share some code to reproduce that behavior? What you are showing is *very* improbable to happen without some kind of bug in the setup. – Paul Dubs Apr 30 '20 at 15:39
  • @PaulDubs I've added it. – Matej Cajka Apr 30 '20 at 15:54
  • Unfortunately, that isn't even close to enough to reproduce that behavior. How are your iterators set up? And is the output you are seeing on console, in tune with what your plot shows? – Paul Dubs May 01 '20 at 07:38
  • @PaulDubs Yep, it does fit. Well, I've done some checking: in most of my datasets it's as expected, validation loss is bigger than training loss, but in some they are very close or training is bigger. – Matej Cajka May 02 '20 at 08:30

0 Answers