
I'm trying to teach a neural network to decide where to go based on its life-level input. The network always receives three inputs [x, y, life]. If life >= 0.2, it should output the angle from [x, y] to (1, 1). If life < 0.2, it should output the angle from [x, y] to (0, 0).

As the inputs and outputs of the neurons should be between 0 and 1, I divide the angle by 2 * Math.PI.
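For clarity, this is the mapping I want the network to learn, written out as a plain function (desiredOutput is just an illustrative name; angleToPoint is the helper defined in the jsfiddle):

// Reference implementation of the behaviour the network should learn
// (not the network itself): returns the normalized angle for one input.
function desiredOutput(x, y, life) {
  var target = (life >= 0.2) ? [1, 1] : [0, 0];
  return angleToPoint(x, y, target[0], target[1]) / (2 * Math.PI);
}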

Here is the code:

var network = new synaptic.Architect.Perceptron(3,4,1);

for(var i = 0; i < 50000; i++){
  var x = Math.random();
  var y = Math.random();
  var angle1 = angleToPoint(x, y, 0, 0) / (2 * Math.PI);
  var angle2 = angleToPoint(x, y, 1, 1) / (2 * Math.PI);
  for(var j = 0; j < 100; j++){
    network.activate([x,y,j/100]);
    if(j < 20){
      network.propagate(0.3, [angle1]);
    } else {
      network.propagate(0.3, [angle2]);
    }
  }
}

Try it out here: jsfiddle

So when I enter the input [0, 1, 0.19], I expect the neural network to output something close to [0.75] (1.5π / 2π). But my results are completely inconsistent and show no correlation with the inputs at all.
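For reference, the expected value checks out when computed directly with the angleToPoint helper from the fiddle:

// angle from (0, 1) to (0, 0) is 1.5 * PI, which normalizes to 0.75
console.log(angleToPoint(0, 1, 0, 0) / (2 * Math.PI)); // 0.75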

What mistake am I making in teaching my Neural network?

I have managed to teach a neural network to output 1 for input [a, b, c] with c >= 0.2 and 0 for input [a, b, c] with c < 0.2. I have also managed to teach it to output the angle to a certain location based on an [x, y] input; however, I can't seem to combine the two.


As requested, I have written some code that uses two neural networks to get the desired output. The first network converts the life level to a 0 or a 1, and the second network outputs an angle depending on whether it receives a 0 or a 1 from the first. This is the code:

// This network outputs 1 when life => 0.2, otherwise 0
var network1 = new synaptic.Architect.Perceptron(3,3,1);
// This network outputs the angle to a certain point based on life
var network2 = new synaptic.Architect.Perceptron(3,3,1);

for (var i = 0; i < 50000; i++){
  var x = Math.random();
  var y = Math.random();
  var angle1 = angleToPoint(x, y, 0, 0) / (2 * Math.PI);
  var angle2 = angleToPoint(x, y, 1, 1) / (2 * Math.PI);

  for(var j = 0; j < 100; j++){
    network1.activate([x,y,j/100]);
    if(j < 20){
      network1.propagate(0.1, [0]);
    } else {
      network1.propagate(0.1, [1]);
    }
    network2.activate([x,y,0]);
    network2.propagate(0.1, [angle1]);
    network2.activate([x,y,1]);
    network2.propagate(0.1, [angle2]);
  }
}

Try it out here: jsfiddle

As you can see in this example, it gets quite close to the desired output, and adding more iterations brings it even closer.
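For completeness, this is roughly how I chain the two trained networks at run time (just a sketch; getAngle is an illustrative name, and I round network1's output to snap it to 0 or 1):

// Combine the two trained networks: network1 classifies the life level,
// network2 turns that class plus the position into a normalized angle.
function getAngle(x, y, life) {
  var lifeClass = Math.round(network1.activate([x, y, life])[0]); // ~0 or ~1
  return network2.activate([x, y, lifeClass])[0] * (2 * Math.PI); // de-normalize
}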

Thomas Wagenaar
  • I suggest adding more neurons to hidden layer. – cdm Feb 03 '17 at 10:57
  • @cdm I tried, but did not make a difference. I'm trying to make my own network now by configuring individual layers. – Thomas Wagenaar Feb 03 '17 at 11:14
  • either there is a problem with your layers, or using multiple neural networks is a better solution. – Walfrat Feb 03 '17 at 11:45
  • @Walfrat Hmm, that seems to be the only solution. I'll try projecting the networks to each other. – Thomas Wagenaar Feb 03 '17 at 11:50
  • Otherwise you can check whether your JavaScript library provides something other than neural networks; for a 0/1 result, a support vector machine (SVM) is much lighter than a neural network. – Walfrat Feb 03 '17 at 12:08
  • What is `angleToPoint`? – Richard Feb 04 '17 at 18:36
  • @Richard I had it in the question earlier, but I'm almost certain the mistake doesn't lie there, as I am able to do the job with 2 separate neural networks. Check out the function at the jsfiddle link! – Thomas Wagenaar Feb 04 '17 at 18:38
  • @ThomasW: Questions on SO should, as much as feasible, contain MWEs (minimum working examples) for the code which demonstrate the problems to be solved. This makes for quick reads and prevents important elements of a question from being lost when external links die. I'm happy to try my hand at solving your question, but won't do it through jsFiddle. – Richard Feb 04 '17 at 18:42
  • I'd recommend that you change your training set. It is unbalanced and biased towards `life > 0.2`. Probably randomizing it also helps, because if you train sequentially your network ends up forgetting the initial inputs. – villasv Feb 05 '17 at 11:01
  • It would also be easier for everybody else if you shared both your networks that work on the lesser tasks you said you solved. – villasv Feb 05 '17 at 12:22
  • @ThomasW btw, "angle to (0,0)" is not defined. Did you mean "angle to (1,0)" as the angle to the `x` axis? – villasv Feb 05 '17 at 13:12
  • @VillasV I have updated my post with a working example that uses 2 Neural Networks. I don't completely understand your comment about the angles, could you explain a little more? Angle to (0,0) is just the angle from the inputted (x,y) to (0,0) calculated by angleToPoint. – Thomas Wagenaar Feb 05 '17 at 13:15
  • @Richard thank you for the feedback. I have added an MWE. – Thomas Wagenaar Feb 05 '17 at 13:15
  • But there's no such thing as "angle to (0,0)". That function returns something, yes, but it can't mean what you describe. It returns the angle between (p1-p2) and the `x` axis. – villasv Feb 05 '17 at 13:16
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/134898/discussion-between-villasv-and-thomas-w). – villasv Feb 05 '17 at 13:19

1 Answer


Observations

  1. Skewed Distribution sampled as Training set

    Your training set chooses the life parameter inside for(var j = 0; j < 100; j++), which is heavily biased towards j >= 20 and consequently life >= 0.2: that subset gets 4 times as much training data, so the training prioritizes it.

  2. Non-shuffled training data

    You are training sequentially against the life parameter, which can be harmful. Your network will end up paying more attention to the larger j values, since those are the most recent propagations it has seen. You should shuffle your training set to avoid this bias.

    This will stack with the previous point, because you're again giving more attention to some subset of life values.

  3. You should measure your training performance as well

    Despite the previous observations, your network was not really that bad: its training error was not nearly as large as its test error. That discrepancy usually means you are training and testing on different sample distributions.

    You could say that you have two classes of data points: those with life > 0.2 and the rest. But because you introduced a discontinuity in the angleToPoint function, I'd recommend separating them into three classes: keep one class for life < 0.2 (where the function behaves continuously) and split life > 0.2 into "above the vector (1,1)" and "below the vector (1,1)".

  4. Network complexity

    You could successfully train a network for each task separately, and now you want to stack them. This is precisely the point of deep learning: each layer builds on the concepts learned by the previous layer, increasing the complexity of the concepts the network can learn.

    So instead of using 20 nodes in a single layer, I'd recommend that you use 2 layers of 10 nodes. This matches the class hierarchy I mentioned in the previous point.

The Code

Running this code I had a training/testing error of 0.0004/0.0002.

https://jsfiddle.net/hekqj5jq/11/

var network = new synaptic.Architect.Perceptron(3,10,10,1);
var trainer = new synaptic.Trainer(network);
var trainingSet = [];

for(var i = 0; i < 50000; i++){
  // 1st category: above vector (1,1), measure against (1,1)
  var x = getRandom(0.0, 1.0);
  var y = getRandom(x, 1.0);
  var z = getRandom(0.2, 1);
  var angle = angleToPoint(x, y, 1, 1) / (2 * Math.PI);
  trainingSet.push({input: [x,y,z], output: [angle]});
  // 2nd category: below vector (1,1), measure against (1,1)
  var x = getRandom(0.0, 1.0);
  var y = getRandom(0.0, x);
  var z = getRandom(0.2, 1);
  var angle = angleToPoint(x, y, 1, 1) / (2 * Math.PI);
  trainingSet.push({input: [x,y,z], output: [angle]});
  // 3rd category: above/below vector (1,1), measure against (0,0)
  var x = getRandom(0.0, 1.0);
  var y = getRandom(0.0, 1.0);
  var z = getRandom(0.0, 0.2);
  var angle = angleToPoint(x, y, 0, 0) / (2 * Math.PI);
  trainingSet.push({input: [x,y,z], output: [angle]});
}

trainer.train(trainingSet, {
    rate: 0.1,
    error: 0.0001,
    iterations: 50,
    shuffle: true,
    log: 1,
    cost: synaptic.Trainer.cost.MSE
});

var testSet = [
    {input: [0,1,0.25], output: [angleToPoint(0, 1, 1, 1) / (2 * Math.PI)]},
    {input: [1,0,0.35], output: [angleToPoint(1, 0, 1, 1) / (2 * Math.PI)]},
    {input: [0,1,0.10], output: [angleToPoint(0, 1, 0, 0) / (2 * Math.PI)]},
    {input: [1,0,0.15], output: [angleToPoint(1, 0, 0, 0) / (2 * Math.PI)]}
];

$('html').append('<p>Train:</p> ' + JSON.stringify(trainer.test(trainingSet)));
$('html').append('<p>Tests:</p> ' + JSON.stringify(trainer.test(testSet)));

$('html').append('<p>1st:</p> ')
$('html').append('<p>Expect:</p> ' + angleToPoint(0, 1, 1, 1) / (2 * Math.PI));
$('html').append('<p>Received: </p> ' + network.activate([0, 1, 0.25]));

$('html').append('<p>2nd:</p> ')
$('html').append('<p>Expect:</p> ' + angleToPoint(1, 0, 1, 1) / (2 * Math.PI));
$('html').append('<p>Received: </p> ' + network.activate([1, 0, 0.25]));

$('html').append('<p>3rd:</p> ')
$('html').append('<p>Expect:</p> ' + angleToPoint(0, 1, 0, 0) / (2 * Math.PI));
$('html').append('<p>Received: </p> ' + network.activate([0, 1, 0.15]));

$('html').append('<p>4th:</p> ')
$('html').append('<p>Expect:</p> ' + angleToPoint(1, 0, 0, 0) / (2 * Math.PI));
$('html').append('<p>Received: </p> ' + network.activate([1, 0, 0.15]));

function angleToPoint(x1, y1, x2, y2){
  var angle = Math.atan2(y2 - y1, x2 - x1);
  if(angle < 0){
    angle += 2 * Math.PI;
  }
  return angle;
}

function getRandom (min, max) {
    return Math.random() * (max - min) + min;
}

Further Remarks

As I mentioned in the comments and in the chat, there is no such thing as the "angle between (x,y) and (0,0)": the angle between two vectors is usually taken as the difference between their directions, and (0,0) has no direction.

What angleToPoint(x1, y1, x2, y2) actually returns is the direction of the vector pointing from p1 = (x1, y1) to p2 = (x2, y2). For p2 = (0,0) that is the direction from p1 back toward the origin, which happens to be what you want here, but it is not an "angle between" two points: for p1 = (1,1) and p2 = (1,0) it will not return 45 degrees, and for p1 = p2 the direction is undefined rather than zero.
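A quick numeric check, using the same angleToPoint as in the code above:

// angleToPoint returns the direction from (x1, y1) to (x2, y2), in radians
angleToPoint(1, 1, 1, 0) * 180 / Math.PI; // 270, not the 45 one might expect
angleToPoint(1, 1, 0, 0) * 180 / Math.PI; // 225: direction from (1,1) back to the origin
angleToPoint(1, 1, 1, 1) * 180 / Math.PI; // 0, although the direction is really undefined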

villasv
  • btw, I think this should be moved to the Cross Validated network, and there you might get even better answers and a few corrections to my analysis. – villasv Feb 05 '17 at 14:40