I trained a neural network with Encog 3.3: an MLP trained with resilient propagation (chosen because backpropagation's learning rate and momentum are hard to tune), 10 columns per row (9 inputs plus the ideal value), 1 hidden layer with 7 neurons, 1 output neuron, and sigmoid activation. The training set has about 80k rows and the testing set about 96 rows. I created 2 models that differ only in their target training error, 0.01 and 0.007; all the other settings mentioned above are the same. I also applied min-max normalization to the data. However, the network outputs the same predicted value for every test row. Maybe my evaluation code is wrong, or some other part of the code? Any help would be much appreciated.

FULL CODE:

public class ANN
{   
//training
//public final static String SQL = "SELECT load_input, day_of_week, weekend_day, type_of_day, week_num, time, day_date, month, year, ideal_value FROM sample WHERE (year,month,day_date,time) between (2012,4,1,1) and (2014,9,29, 96) ORDER BY ID";
//testing
public final static String SQL = "SELECT load_input, day_of_week, weekend_day, type_of_day, week_num, time, day_date, month, year, ideal_value FROM sample WHERE (year,month,day_date,time) between (2014,9,30,1) and (2014,9,30, 92) ORDER BY ID";
//validation
//public final static String SQL = "SELECT load_input, day_of_week, weekend_day, type_of_day, week_num, time, day_date, month, year, ideal_value FROM sample WHERE (year,month,day_date,time) between (2014,9,30,93) and (2014,9,30, 96) ORDER BY ID";
public final static int INPUT_SIZE = 9;
public final static int IDEAL_SIZE = 1;
public final static String SQL_DRIVER = "org.postgresql.Driver";
public final static String SQL_URL = "jdbc:postgresql://localhost/ANN";
public final static String SQL_UID = "postgres";
public final static String SQL_PWD = "";

public static void main(String args[])
{   
    Mynetwork();
    //train network. will add customizable params later.
    //train(trainingData());
    //evaluate network
    evaluate(trainingData());
    Encog.getInstance().shutdown();
}
public static void evaluate(MLDataSet testSet)
{
    BasicNetwork network = (BasicNetwork)EncogDirectoryPersistence.loadObject(new File("directory"));

    // test the neural network
    System.out.println("Neural Network Results:");
    for(MLDataPair pair: testSet ) {
        final MLData output = network.compute(pair.getInput());
        System.out.println(pair.getInput().getData(0) + "," + pair.getInput().getData(1) + "," + pair.getInput().getData(2) + "," + pair.getInput().getData(3) + "," + pair.getInput().getData(4) + "," + pair.getInput().getData(5) + "," + pair.getInput().getData(6) + "," + pair.getInput().getData(7) + "," + pair.getInput().getData(8) + "," + "Predicted=" + output.getData(0) + ", Actual=" + pair.getIdeal().getData(0));
    }
}
public static BasicNetwork Mynetwork()
{
    //basic neural network template. Inputs shouldn't have activation functions
    //because it affects data coming from the previous layer and there is no previous layer before the input.
    BasicNetwork network = new BasicNetwork();
    //input layer with 9 neurons.
    //The 'true' parameter means that it should have a bias neuron. Bias neuron affects the next layer.
    network.addLayer(new BasicLayer(null , true, 9));
    //hidden layer with 5 neurons
    network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 5));
    //output layer with 1 neuron
    network.addLayer(new BasicLayer(new ActivationSigmoid(), false, 1));
    network.getStructure().finalizeStructure() ;
    network.reset();

    return network;
}
public static void train(MLDataSet trainingSet)
{
    //Backpropagation(network, dataset, learning rate, momentum)
    //final Backpropagation train = new Backpropagation(Mynetwork(), trainingSet, 0.1, 0.9);
    final ResilientPropagation train = new ResilientPropagation(Mynetwork(), trainingSet);
    //final QuickPropagation train = new QuickPropagation(Mynetwork(), trainingSet, 0.9);

    int epoch = 1;

    do {
        train.iteration();
        System.out.println("Epoch #" + epoch + " Error:" + train.getError());
        epoch++;
    } while((train.getError() > 0.01)); 
    System.out.println("Saving network");
    System.out.println("Saving Done");
    EncogDirectoryPersistence.saveObject(new File("directory"), Mynetwork());
}
public static MLDataSet trainingData()
{
    MLDataSet trainingSet = new SQLNeuralDataSet(
            ANN.SQL,
            ANN.INPUT_SIZE,
            ANN.IDEAL_SIZE,
            ANN.SQL_DRIVER,
            ANN.SQL_URL,
            ANN.SQL_UID,
            ANN.SQL_PWD);

    return trainingSet;
}

}

Here is my result:

Predicted=0.4451817588640455, Actual=0.5260616667545941
Predicted=0.4451817588640455, Actual=0.5196499668339777
Predicted=0.4451817588640455, Actual=0.5083828048375548
Predicted=0.4451817588640455, Actual=0.49985462144799725
Predicted=0.4451817588640455, Actual=0.49085956670499675
Predicted=0.4451817588640455, Actual=0.485008112408512
Predicted=0.4451817588640455, Actual=0.47800504210686795
Predicted=0.4451817588640455, Actual=0.4693212349328293
(...and so on with the same "predicted")

Results I'm expecting (I replaced the "predicted" values with random numbers for demonstration purposes, to indicate that the network is actually predicting):

Predicted=0.4451817588640455, Actual=0.5260616667545941
Predicted=0.5123312331212122, Actual=0.5196499668339777
Predicted=0.435234234234254365, Actual=0.5083828048375548
Predicted=0.673424556563455, Actual=0.49985462144799725
Predicted=0.2344673345345544235, Actual=0.49085956670499675
Predicted=0.123346457544324, Actual=0.485008112408512
Predicted=0.5673452342342342, Actual=0.47800504210686795
Predicted=0.678435234423423423, Actual=0.4693212349328293

UPDATE:

Printed inputs, followed by Predicted and Actual (ideal), separated by commas:

0.5386671932975533,1100000.0,0.0,1.0,40.0,1.0,30.0,9.0,2014.0,Predicted=0.4451817588640455, Actual=0.5260616667545941
0.5260616667545941,1100000.0,0.0,1.0,40.0,2.0,30.0,9.0,2014.0,Predicted=0.4451817588640455, Actual=0.5196499668339777
0.5196499668339777,1100000.0,0.0,1.0,40.0,3.0,30.0,9.0,2014.0,Predicted=0.4451817588640455, Actual=0.5083828048375548
0.5083828048375548,1100000.0,0.0,1.0,40.0,4.0,30.0,9.0,2014.0,Predicted=0.4451817588640455, Actual=0.49985462144799725
0.49985462144799725,1100000.0,0.0,1.0,40.0,5.0,30.0,9.0,2014.0,Predicted=0.4451817588640455, Actual=0.49085956670499675
0.49085956670499675,1100000.0,0.0,1.0,40.0,6.0,30.0,9.0,2014.0,Predicted=0.4451817588640455, Actual=0.485008112408512
0.485008112408512,1100000.0,0.0,1.0,40.0,7.0,30.0,9.0,2014.0,Predicted=0.4451817588640455, Actual=0.47800504210686795
  • not sure if it's going to work but just for the sake of convenience, try to remove "final" keyword in front of "MLData output" – mangusta Sep 24 '18 at 14:07
  • @mangusta still doesn't work :/ – codex Sep 24 '18 at 14:12
  • are you sure that the input values of your testSet are not the same? – mangusta Sep 24 '18 at 14:19
  • @mangusta yes i am certain. The "Actual=0...." above is my testSet. – codex Sep 24 '18 at 14:21
  • "Actual=" is the return value of method "getIdeal", how about the other values returned by "getInput"? The testSet itself is a collection of "MLDataPair" objects, which in turn consists of two "MLData" objects each having a double [ ] array (according to spec) – mangusta Sep 24 '18 at 14:22
  • @mangusta I've posted an update for the testSet. – codex Sep 24 '18 at 14:33
  • according to your code, the ideal value is located at index=0 of array inside "pair.getIdeal()". what about input? which index should it be at? while the location of ideal value is not important for model, the location of input value is important, so my suspicion is that the model looks at some specific comma-separated position in the input line provided by you, and since all the values at that position are the same (like 1100000.0 or 0.0 or 1.0 or 40.0 or 30.0 or 9.0 or 2014.0), the model surely generates the same output for them – mangusta Sep 24 '18 at 14:47
  • so you probably need to place your input value into position other than it is currently placed – mangusta Sep 24 '18 at 14:48
  • @mangusta I should probably just post my entire code for clarity. Im taking my inputs from a database. What do you mean by place my input value into position other than it is currently placed? – codex Sep 24 '18 at 15:02
  • @mangusta I've updated the question and posted my entire code. I've also updated the code for printing the inputs so you can see where they are placed at. – codex Sep 24 '18 at 15:06
  • So in your SQL query, you place your input values at position=0, then comes day of week at position=1, then comes weekend day at position=2, and so on. The SQLNeuralDataSet constructor relies on your SQL query only, i.e. it is only you who choose the position of input data, so probably the position is not 0, maybe it should be 1 or 2 or whatever. And by the way, what does "INPUT_SIZE=9" mean? I see 9 values in SQL query + 1 ideal value. So which value out of 9 is used by model to predict ideal value? The first one only? – mangusta Sep 24 '18 at 15:16
  • @mangusta That I am not sure of sorry. I just started Encog a week ago and I assumed that the model itself determines them from the inputs. Thank you for pointing that out. – codex Sep 24 '18 at 15:29
  • Yes, I think that the training and test data sets were incorrectly formatted, i.e. they had input values misplaced. My suggestion is, try to use the following SQL query "SELECT load_input, ideal_value FROM ..." and specify "INPUT_SIZE=1" – mangusta Sep 24 '18 at 15:33
  • @mangusta please dont take offense but I dont think so. From what I know, training the network with the combination of inputs should yield close to the ideal value provided each row. It doesnt matter which variable the network is trying to predict so as long as if the 9 combinations appear, it should be able to predict close to the ideal value provided. – codex Sep 24 '18 at 16:11
  • oh, well, I don't know the specifics of the library you're using, you might be right too. still you may just try to use a single input value for convenience, if that works then the issue is the format of the input data – mangusta Sep 24 '18 at 16:25
  • @mangusta I tried training with the single load value, but it won't converge since it has no other attributes to find sets of combinations. I also tried testing with the single value while the network was trained with 9 inputs, but it won't let me because of a mismatched number of inputs error. – codex Sep 25 '18 at 01:44
  • maybe it makes sense to use another dataset. you're using "MLDataSet", it seems to be a parent of several more specialized datasets, for example "BasicMLDataSet" – mangusta Sep 25 '18 at 09:41
  • @Karl The problem statement is unclear. Please describe the input structure and what you want to calculate with the network. Furthermore: how is the calculation result a function from the input which the network shall approximate? – mtj Sep 26 '18 at 04:55

1 Answer

Fixed it by thoroughly normalizing all the input features. I had assumed it was enough to normalize only the load value I'm trying to predict and leave the factors that affect it as is.
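
As a rough sketch of what normalizing every input feature can look like, the snippet below applies min-max scaling to each column rather than only to the load. The `MinMaxScaler` class and the sample rows are hypothetical illustrations and are not part of Encog; the sketch assumes the per-column minima and maxima are learned from the training rows and then reused unchanged for the test rows.

```java
import java.util.Arrays;

/** Column-wise min-max scaling to [0, 1]. Hypothetical helper, not an Encog class. */
final class MinMaxScaler {
    private final double[] min;
    private final double[] max;

    private MinMaxScaler(double[] min, double[] max) {
        this.min = min;
        this.max = max;
    }

    /** Learns the minimum and maximum of every column from the given rows. */
    static MinMaxScaler fit(double[][] rows) {
        int cols = rows[0].length;
        double[] min = new double[cols];
        double[] max = new double[cols];
        Arrays.fill(min, Double.POSITIVE_INFINITY);
        Arrays.fill(max, Double.NEGATIVE_INFINITY);
        for (double[] row : rows) {
            for (int c = 0; c < cols; c++) {
                min[c] = Math.min(min[c], row[c]);
                max[c] = Math.max(max[c], row[c]);
            }
        }
        return new MinMaxScaler(min, max);
    }

    /** Scales one row using the minima and maxima learned in fit(). */
    double[] transform(double[] row) {
        double[] out = new double[row.length];
        for (int c = 0; c < row.length; c++) {
            double range = max[c] - min[c];
            // A constant column has zero range; map it to 0 instead of dividing by zero.
            out[c] = range == 0.0 ? 0.0 : (row[c] - min[c]) / range;
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up rows in the order: load, time slot, year.
        double[][] train = {
            {0.0, 1.0, 2012.0},
            {1.0, 96.0, 2014.0}
        };
        MinMaxScaler scaler = MinMaxScaler.fit(train);
        double[] scaled = scaler.transform(new double[] {0.5, 48.5, 2013.0});
        System.out.println(Arrays.toString(scaled)); // prints [0.5, 0.5, 0.5]
    }
}
```

With raw values such as 2014.0 or 1100000.0 mixed in with a load between 0 and 1, the large columns dominate the weighted sum feeding the sigmoid, so the small differences in load and time barely move the output. That is consistent with every test row getting the same prediction before the fix.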
