
I am very new to Python and PyBrain, but I found some code online and ran my own data through it. When I look at the Python shell, all I see is:

Total error: 0.119794950183
Total error: 0.120078064472
Total error: 0.119334171755
Total error: 0.119215954708
Total error: 0.119876371059
Total error: 0.119621091587
Total error: 0.119983293587
Total error: 0.119849213731
Total error: 0.119638354788
Total error: 0.119574076444
Total error: 0.119634911454
Total error: 0.119601510912
Total error: 0.119665039573
Total error: 0.11944303853
Total error: 0.11950617361
Total error: 0.120088611572
Total error: 0.119774446939
Total error: 0.120016814426
Total error: 0.119605678505
Total error: 0.119998864263
Total error: 0.120071472045
Total error: 0.11973079242
Total error: 0.119790825048
Total error: 0.119558913137
Total error: 0.12024443015
Total error: 0.119525196587
Total error: 0.12008456943
Total error: 0.119641361568
Total error: 0.119745707444
Total error: 0.120065315199

1) What does "Total error" mean, and what is it doing?

Here is the code:

# Only the imports this script actually uses
from pybrain.datasets import SupervisedDataSet
from pybrain.tools.shortcuts import buildNetwork
from pybrain.supervised.trainers import BackpropTrainer

ds = SupervisedDataSet(2,1)

# Each row of weather.csv: two inputs followed by one 0/1 target
with open('weather.csv', 'r') as tf:
    for line in tf:
        try:
            data = [float(x) for x in line.strip().split(',') if x != '']
            indata = tuple(data[:2])
            outdata = tuple(data[2:])
            ds.addSample(indata, outdata)
        except ValueError as e:
            print("error %s on line" % e)

n = buildNetwork(ds.indim,8,8,ds.outdim,recurrent=True)
t = BackpropTrainer(n,learningrate=0.01,momentum=0.5,verbose=True)
t.trainOnDataset(ds,5000)
t.testOnData(verbose=True)
ben olsen
  • So where is your code? – Mazdak Sep 21 '14 at 13:35
  • Are you fitting a model of some kind with PyBrain? For example, if you are fitting a regression model, this might be the sum of squared errors, which is sometimes called the total error. It's probably an error term that is common to whatever model class you are trying to fit. – ely Sep 21 '14 at 13:44
  • Also, the weather.csv file contains only 1s and 0s. There are 2 inputs and 1 output; the output is whether it will rain or not, e.g. rain = 1, no rain = 0. – ben olsen Sep 21 '14 at 13:55
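
As a rough illustration of the "sum of squared errors" idea mentioned in the comments: the sketch below assumes (this is an assumption about BackpropTrainer, so check the trainer source for your PyBrain version) that the reported total error is half the summed squared error divided by the number of target values. The function name and the sample numbers are made up for illustration.

```python
def half_mean_squared_error(outputs, targets):
    """Half the summed squared error, divided by the number of target values.

    Assumption: this mirrors what the trainer prints as "Total error".
    """
    total = 0.0
    count = 0
    for out_row, target_row in zip(outputs, targets):
        for out, target in zip(out_row, target_row):
            total += 0.5 * (target - out) ** 2
            count += 1
    return total / count


# Hypothetical case: a network that answers 0.5 for every rain / no-rain sample.
outputs = [[0.5], [0.5], [0.5], [0.5]]
targets = [[1.0], [0.0], [1.0], [0.0]]
error = half_mean_squared_error(outputs, targets)
print(error)
```

If that assumption holds, a network whose output never moves away from 0.5 on 0/1 targets would score 0.125, so it may be worth comparing a value near 0.12 against the per-sample test output before trusting the network.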

1 Answer


To answer your question about "What does it tell us": looking at the total error can give you a decent guess as to whether your network could do better if it were given longer to train. If the total error changed a lot from one iteration to the next, that would be a sign that it hadn't settled down to a final state. It might be worth looking at this link:

http://pybrain.org/docs/api/supervised/trainers.html

There you will see references to trainEpochs and trainUntilConvergence. The total error you're seeing suggests (although it doesn't prove) that your network has converged on a final state and wouldn't improve much with additional training.

In summary, if the total error you're seeing looks pretty stable by the time training stops, which in your case it does, you probably don't have to worry about it. Just look at the test output and decide whether your network is doing a job adequate for your purpose.
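
One way to make "looks pretty stable" concrete is to compare the spread of the most recent errors with their mean. This is only a sketch: has_plateaued, the window of 10, and the 2% tolerance are arbitrary choices for illustration, not anything PyBrain provides. The logged values are the last ten lines of the output in the question; the second list is invented to show the contrast.

```python
def has_plateaued(errors, window=10, tolerance=0.02):
    """Return True if the last `window` errors vary by less than
    `tolerance`, measured relative to their mean."""
    if len(errors) < window:
        return False  # not enough history to judge
    recent = errors[-window:]
    mean = sum(recent) / len(recent)
    if mean == 0:
        return True  # all recent errors are zero, nothing left to improve
    return (max(recent) - min(recent)) / mean < tolerance


# Last ten "Total error" values from the question's output.
logged = [0.120071472045, 0.11973079242, 0.119790825048, 0.119558913137,
          0.12024443015, 0.119525196587, 0.12008456943, 0.119641361568,
          0.119745707444, 0.120065315199]

# Made-up series from a network that is still improving.
still_learning = [0.50, 0.42, 0.35, 0.29, 0.24, 0.20, 0.17, 0.15, 0.13, 0.12]

print(has_plateaued(logged))
print(has_plateaued(still_learning))
```

A flat error only says the training has stopped making progress, not that the result is good, so the test output still needs a look.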

rossdavidh
  • Is there a way to code into Python to find the percentage correct? For example, the output can be either 1 or 0. If the ANN predicts .58 and the actual answer is 1, I want Python to say that the ANN DID get it correct because it's above .50. So whenever the ANN predicts above .50 and the actual answer is 1, it should say the ANN was correct, and vice versa. I hope this makes sense. – ben olsen Sep 21 '14 at 19:37
    Well, I suppose you could use this in a loop: " train() Train the associated module for one epoch." ...and then check all cases to see if you're done. But you could also try modeling it as a classification network (with two outputs for two classes, which we interpret as 1 or 0). Here's an example of that: http://www.pybrain.org/docs/_sources/tutorial/fnn.txt – rossdavidh Sep 22 '14 at 14:10
    Thank you for your help Rossdavidh :) – ben olsen Sep 22 '14 at 16:41
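
For the percentage-correct question in the comments above, a plain-Python sketch could look like the following. It assumes the predictions have already been collected into a list of floats, with the actual 0/1 answers in a second list of the same length; percent_correct and the sample numbers are hypothetical, not part of PyBrain.

```python
def percent_correct(predictions, targets, threshold=0.5):
    """Percentage of predictions that land on the same side of
    `threshold` as the actual 0/1 target."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not predictions:
        raise ValueError("need at least one prediction")
    hits = 0
    for predicted, actual in zip(predictions, targets):
        predicted_class = 1 if predicted > threshold else 0
        if predicted_class == int(actual):
            hits += 1
    return 100.0 * hits / len(targets)


# Hypothetical network outputs and the actual rain (1) / no rain (0) answers.
predictions = [0.58, 0.31, 0.72, 0.44]
targets = [1, 0, 0, 1]
score = percent_correct(predictions, targets)
print(score)
```

With the network from the question, the predictions would presumably come from calling n.activate on each pair of inputs and taking the single output value.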