
I'm trying to program a neural network with backpropagation in Python.

It usually converges to 1. To the left of the image there are some delta values. They are very small; should they be larger? Do you know a reason why this convergence could happen?

Sometimes it goes up in the direction of the point and then goes down again.

Here is the complete code: http://pastebin.com/9BiwhWrD. The backpropagation code starts at line 146 (the root stuff in line 165 does nothing; I was just trying out some ideas).

Any ideas of what could be wrong? Have you ever seen a behaviour like this?

Thank you very much.

sezanzeb
  • I wish I could show you an image of some cases where it does not converge immediately but creates some fancy 3D rainbow patterns. In those cases the delta values are between 1 and 0.1, but I can't post more than 2 links. – sezanzeb Jan 07 '17 at 13:33
  • You should add more information: What does the point mean? What do the colors mean? What is your topology / model? – Martin Thoma Jan 07 '17 at 16:54
  • Please [create a MCVE](http://stackoverflow.com/help/mcve) and post the code here. – Martin Thoma Jan 07 '17 at 16:55
  • http://pastebin.com/FQ4mkyKF – sezanzeb Jan 07 '17 at 20:22
  • It is for sure more minimal than before. It now always has one hidden layer with 3 nodes and it only supports one training input and output. The black line is the starting point with randomized weights. It then continues in red over yellow, green, blue, purple, red, yellow etc. to show how the function changes over time when backprop is applied. I followed this tutorial for the formulas: https://www.youtube.com/watch?v=zpykfC4VnpM – sezanzeb Jan 07 '17 at 20:23
  • I solved the problem, @MartinThoma. See answer – sezanzeb Jan 09 '17 at 21:52

1 Answer

The reason this happened is that the input data was too large. The sigmoid activation function converges to f(x) = 1 for x → ∞, so every output ended up at 1. I had to normalize the data,

e.g.:

    import numpy as np
    a = np.array([1, 2, 3, 4, 5], dtype=float)  # float dtype, so in-place division works
    a /= a.max()                                # scales the values into [0, 1]

or avoid generating unnormalized data in the first place.
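
To illustrate the saturation (just a rough sketch, not taken from my actual code): for large inputs the sigmoid output is essentially 1 and its gradient is essentially 0, which is why the deltas stayed so small:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    raw = np.array([5.0, 50.0, 500.0])        # unnormalized inputs
    print(sigmoid(raw))                       # all roughly 1.0 -> the output saturates
    print(sigmoid(raw) * (1 - sigmoid(raw)))  # gradients roughly 0 -> tiny deltas

    normalized = raw / raw.max()              # scaled into [0, 1]
    print(sigmoid(normalized))                # no longer saturated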

Also, the intermediate value was updated BEFORE the sigmoid was applied. But the derivative of the sigmoid looks like this: y'(x) = y(x) * (1 - y(x)). In my case it effectively became y'(x) = x * (1 - x), applied to the raw value instead of the sigmoid output.
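
In code that means storing the value after the sigmoid and feeding that into the derivative, roughly like this (the variable names and numbers are made up for illustration, this is not the pastebin code):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # hypothetical values just to make the snippet runnable
    weights = np.array([[0.2, -0.4], [0.7, 0.1]])
    inputs = np.array([0.5, 0.9])
    error = np.array([0.1, -0.3])

    raw = weights @ inputs  # value BEFORE the sigmoid
    out = sigmoid(raw)      # value AFTER the sigmoid -- this is what has to be stored

    delta = error * out * (1.0 - out)    # correct: y'(x) = y(x) * (1 - y(x))
    # delta = error * raw * (1.0 - raw)  # my bug: derivative applied to the raw value x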

There were also errors in how I updated the weights after calculating the deltas. I rewrote the whole loop using a tutorial on neural networks with Python and then it worked.

It still does not support a bias, but it can do classification. For regression it is not precise enough, but I guess this has to do with the missing bias.

Here is the code: http://pastebin.com/hRCKe1dK
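
For illustration, a stripped-down version of such a training loop (one hidden layer, no bias, a single training sample; the names are mine and not from the pastebin) could look like this:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    x = np.array([0.1, 0.5, 0.9])          # one normalized training input
    t = np.array([0.0, 1.0])               # one training target

    w1 = np.random.uniform(-1, 1, (3, 3))  # input -> hidden (3 nodes), no bias
    w2 = np.random.uniform(-1, 1, (2, 3))  # hidden -> output, no bias
    lr = 0.5

    for _ in range(1000):
        # forward pass, storing the post-sigmoid activations
        hidden = sigmoid(w1 @ x)
        out = sigmoid(w2 @ hidden)

        # backward pass: the derivative uses the stored sigmoid outputs
        delta_out = (out - t) * out * (1.0 - out)
        delta_hidden = (w2.T @ delta_out) * hidden * (1.0 - hidden)

        # update the weights only after all deltas have been computed
        w2 -= lr * np.outer(delta_out, hidden)
        w1 -= lr * np.outer(delta_hidden, x)

    print(out)                             # should end up close to the target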

Someone suggested that I should put my training data into a neural-network framework and see if it works. It didn't, so it was kind of clear that the problem had to do with the data, and that gave me the idea that it should be between -1 and 1.

sezanzeb