I am trying to implement a neural network and am currently working on the backpropagation part. I don't need help with the coding; I have written the feedforward part so far with great success. My question is more about the dataset I am using.
This is my data set:
http://archive.ics.uci.edu/ml/machine-learning-databases/cpu-performance/machine.data
I have to use 5-fold cross-validation with backpropagation and keep training until the error drops below a threshold of 0.1, 0.01, and 0.001, meaning I have to do 3 trials (one per threshold). Ignore the first 2 values of each data point. I have already normalized the set. The architecture of the network is 7 neurons in the input layer, 3 neurons in 1 hidden layer, and 1 output neuron. A very basic implementation.
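To make the setup concrete, here is a minimal sketch of the 7-3-1 network described above. It is not my actual code: the names `parse_row`, `init_network`, and `feed_forward`, the sigmoid activation, the weight range, and the assumption that each row holds 7 inputs followed by 1 target once the first 2 fields are dropped are all made up for illustration.

```python
import math
import random


def parse_row(line):
    """Drop the first 2 fields, then split the rest into 7 inputs and 1 target.

    Assumption: the remaining 8 values are already normalized numbers.
    """
    values = [float(v) for v in line.strip().split(",")[2:]]
    return values[:7], values[7]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_network(n_in=7, n_hidden=3, seed=0):
    """Each neuron stores one weight per input plus a trailing bias weight."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output


def feed_forward(network, inputs):
    """Return the hidden activations and the single output value."""
    hidden, output = network
    h = [sigmoid(sum(w * x for w, x in zip(ws[:-1], inputs)) + ws[-1])
         for ws in hidden]
    y = sigmoid(sum(w * a for w, a in zip(output[:-1], h)) + output[-1])
    return h, y
```

With a made-up row such as `"vendor,model,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8"`, `parse_row` would give 7 inputs and the target `0.8`, and `feed_forward(init_network(), inputs)` would give 3 hidden activations and one output between 0 and 1.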
So my questions are:
I have to break the set up into 5 smaller subsets, right? And keep one for testing and the rest for validation and training, right?
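Here is a minimal sketch of the split I have in mind, assuming the usual k-fold scheme where each fold is held out once for testing and the other four are used for training. The function name `k_fold_splits`, the shuffling seed, and the row count of 209 (which matches about 42 rows per fold) are assumptions for illustration only.

```python
import random


def k_fold_splits(data, k=5, seed=0):
    """Yield (train, test) pairs; every fold is the test set exactly once."""
    rows = list(data)
    random.Random(seed).shuffle(rows)       # shuffle once, before splitting
    folds = [rows[i::k] for i in range(k)]  # k folds of nearly equal size
    for i in range(k):
        test = folds[i]
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield train, test


# Stand-in for the real rows: just 209 row indices.
for train, test in k_fold_splits(range(209)):
    print(len(train), len(test))
```

If a separate validation set is needed, it would have to be carved out of `train` in each round; this sketch does not do that.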
How long do I train? Say I use 1 fold (about 42 data points per fold) and I reach my desired error threshold. Do I stop and use the test data? Or do I load up the next fold and keep training? What if I run out of data before I reach my desired error threshold?
Also, should I follow something like this?
- use input values
- feed forward, then compare error
- backpropagate and adjust weights
- repeat the three steps above until I reach the error threshold and finish all data in the current fold?
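The steps above could be sketched roughly as follows. This is only an illustration under assumptions of mine, not a definitive implementation: the sigmoid activation, the learning rate, using the mean squared error of an epoch as "the error", and the `max_epochs` cap (so training stops even if the threshold is never reached) are all choices made for the sketch.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(w_hid, w_out, x):
    """Feed one input vector through the hidden layer and the output neuron."""
    h = [sigmoid(sum(w * v for w, v in zip(ws[:-1], x)) + ws[-1])
         for ws in w_hid]
    y = sigmoid(sum(w * a for w, a in zip(w_out[:-1], h)) + w_out[-1])
    return h, y


def train(train_set, threshold, lr=0.5, max_epochs=10000, seed=0):
    """Online backprop on (inputs, target) pairs with 3 hidden neurons.

    Stops when the mean squared error seen during an epoch drops below
    `threshold`, or after `max_epochs` passes over the training data.
    """
    rng = random.Random(seed)
    n_in = len(train_set[0][0])
    # Last weight of every neuron is its bias.
    w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
             for _ in range(3)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(4)]
    mse = float("inf")
    for epoch in range(1, max_epochs + 1):
        total = 0.0
        for x, t in train_set:
            h, y = forward(w_hid, w_out, x)    # feed forward
            err = t - y                        # compare with the target
            total += err * err
            # Backpropagate: deltas use the weights from before the update.
            d_out = err * y * (1.0 - y)
            d_hid = [d_out * w_out[j] * h[j] * (1.0 - h[j]) for j in range(3)]
            # Adjust the weights.
            for j in range(3):
                w_out[j] += lr * d_out * h[j]
            w_out[3] += lr * d_out
            for j in range(3):
                for i in range(n_in):
                    w_hid[j][i] += lr * d_hid[j] * x[i]
                w_hid[j][n_in] += lr * d_hid[j]
        mse = total / len(train_set)
        if mse < threshold:                    # threshold reached: stop
            break
    return (w_hid, w_out), epoch, mse
```

Under this reading, one trial would call `train` on the training folds with a threshold of 0.1, 0.01, or 0.001, and only afterwards run the held-out fold through `forward` to measure the test error. Whether that is the intended procedure is exactly what I am asking.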
Thank you for your time and response. I am really trying to understand how to use the dataset. I will update you guys once I have written the code.