
I have built my own neural net and I have a weird problem with it.

The net is quite a simple feed-forward 1-N-1 net with backpropagation learning. Sigmoid is used as the activation function.

My training set is generated with random values in [-PI, PI] and their [0,1]-scaled sine values (this is because the "sigmoid net" produces only values in [0,1], while the unscaled sine function produces values in [-1,1]).
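Roughly, the training set is built like this (a simplified Python sketch, not my exact code):

```python
import math
import random

def make_training_set(n=1000):
    """Random inputs in [-PI, PI] paired with sine values rescaled into [0, 1]."""
    samples = []
    for _ in range(n):
        x = random.uniform(-math.pi, math.pi)
        target = (math.sin(x) + 1.0) / 2.0  # map [-1, 1] onto [0, 1] for the sigmoid output
        samples.append((x, target))
    return samples
```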

With that training set, and the net set to 1-10-1 with a learning rate of 0.5, everything works great and the net learns the sine function as it should. BUT... if I do everything exactly the same way for the COSINE function, the net won't learn it. Not with any hidden-layer size or learning rate.

Any ideas? Am I missing something?

EDIT: My problem seems to be similar to what can be seen with this applet. It won't seem to learn the sine function unless something "easier" is taught to the weights first (like 1400 cycles of a quadratic function). All the other settings in the applet can be left as they initially are. So in the case of sine or cosine it seems that the weights need some boosting in at least partially the right direction before a solution can be found. Why is this?

Simo Erkinheimo
  • What does it learn instead? That may help us guess where the problem lies. – Phil H Nov 19 '12 at 12:10
  • I noticed that sin(x + 0.5*PI) (which is equal to cos x) doesn't work either. Power functions (^2, ^3, etc.) don't seem to work either. Linear functions do work. There are probably some issues in the code and the success with sin on [-PI, PI] is just a weird side effect. – Simo Erkinheimo Nov 19 '12 at 12:48
  • posting some code might help us help you. – g19fanatic Nov 19 '12 at 13:07

2 Answers


I'm struggling to see how this could work.

You have, as far as I can see, 1 input, N nodes in 1 hidden layer, then 1 output. So there is no difference between any of the nodes in the hidden layer of the net. Suppose you have an input x and a set of weights w_i. Then the output node y will have the value:

y = Σ_i w_i · x = x · Σ_i w_i

So this is always linear.
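A tiny sketch of that collapse (illustrative weights, and assuming the hidden nodes really are plain weighted pass-throughs as described above):

```python
weights = [0.3, -0.7, 1.2, 0.05]  # illustrative hidden-node weights

def output(x, w):
    # each hidden node passes w_i * x to the output, which sums them
    return sum(wi * x for wi in w)

x = 2.0
assert abs(output(x, weights) - x * sum(weights)) < 1e-12  # always just a scaled copy of x
```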

In order for the nodes to be able to learn differently, they must be wired differently and/or have access to different inputs. So you could supply inputs of the value, the square root of the value (giving some effect of scale), etc and wire different hidden layer nodes to different inputs, and I suspect you'll need at least one more hidden layer anyway.

The neural net is not magic. It produces a set of specific weights for a weighted sum. Since you can derive a set of weights to approximate a sine or cosine function, that must inform your idea of what inputs the neural net will need in order to have some chance of succeeding.

An explicit example: the Taylor series of the exponential function is:

exp(x) = 1 + x/1! + x^2/2! + x^3/3! + x^4/4! + ...

So if you supplied 6 input nodes with 1, x, x^2, etc., then a neural net that just passed each input to one corresponding node, multiplied it by its weight, and then fed all those outputs to the output node would be capable of the 6-term Taylor expansion of the exponential:

in     hid     out

1 ---- h0 -\
x   -- h1 --\
x^2 -- h2 ---\
x^3 -- h3 ----- y
x^4 -- h4 ---/
x^5 -- h5 --/

Not much of a neural net, but you get the point.
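For illustration, here is the 6-term expansion evaluated directly (a Python sketch of the fixed "weights" 1/k! that such a wiring would need to learn):

```python
import math

def exp_taylor6(x):
    # six input "nodes" supply x**0 .. x**5; the weights are 1/k!
    return sum(x**k / math.factorial(k) for k in range(6))

print(exp_taylor6(1.0))  # ~2.7167, versus math.exp(1.0) ~2.7183
```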

Further down the Wikipedia page on Taylor series, there are expansions for sin and cos, which are given in terms of odd powers of x and even powers of x respectively (think about it: sin is odd, cos is even, and yes, it is that straightforward). So if you supply all the powers of x, I would guess that the sin and cos versions will look pretty similar, with alternating zero weights (sin: 0, 1, 0, -1/6, ...; cos: 1, 0, -1/2, ...).
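As a rough illustration (this uses a NumPy least-squares fit rather than a trained net, so it is only a sketch of the idea), fitting weights for the powers of x to sin and cos does show that alternating pattern:

```python
import numpy as np

x = np.linspace(-np.pi, np.pi, 200)
features = np.vstack([x**k for k in range(6)]).T  # columns: 1, x, x^2, ..., x^5

w_sin, *_ = np.linalg.lstsq(features, np.sin(x), rcond=None)
w_cos, *_ = np.linalg.lstsq(features, np.cos(x), rcond=None)

print(np.round(w_sin, 3))  # even-power weights come out ~0; the odd powers carry the fit
print(np.round(w_cos, 3))  # odd-power weights come out ~0; the even powers carry the fit
```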

Phil H
  • I'm not too good with NNs just yet, but I must disagree. The backpropagation learning method adjusts all the weights of the net so that some "node paths" for the current input are adjusted more towards the correct value than other paths. Doing this for multiple samples will eventually lead some hidden nodes to be more receptive to some input values and other hidden nodes to others. The sum of those nodes plus the activation function at the output can then learn to be any value (within the range of the activation function) for any input, assuming there are enough nodes in the hidden layer. – Simo Erkinheimo Nov 19 '12 at 18:57

I think you can always compute sine and then compute cosine externally. I think your concern here is why the neural net is not learning the cosine function when it can learn the sine function. Assuming that this artifact is not because of your code, I would suggest the following:

  1. It definitely looks like an error in the learning algorithm. It could be because of your starting point. Try starting with weights that give the correct result for the first input and then march forward.
  2. Check if there is a heavy bias in your learning (more positive than negative).
  3. Since cosine can be computed as the sine of (90° minus the angle), you could find the weights for sine and then recompute the weights for cosine in one step (see the sketch below).
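A tiny sketch of the identity behind point 3 (Python, value illustrative):

```python
import math

x = 1.2
# cos(x) == sin(pi/2 - x), so a net that has learned sine can serve for cosine
assert abs(math.cos(x) - math.sin(math.pi / 2 - x)) < 1e-12
```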
Tim Malone