
I've just created my first neural net, which uses gradient descent with the backpropagation learning algorithm and the hyperbolic tangent as its activation function. The code is well unit tested, so I had high hopes that the net would actually work. Then I decided to write an integration test and try to teach the net to solve some very simple functions. Basically, I'm testing whether the weight improves (there's only one weight, as this is a very small net: one input plus one neuron).

    // Combinations of negative sign for values greater than 1
    [TestCase(8, 4)] // FAIL reason 1
    [TestCase(-8, 4)] // FAIL reason 1
    [TestCase(8, -4)] // FAIL reason 1
    [TestCase(-8, -4)] // FAIL reason 1
    // Combinations of negative sign for values less than 1
    [TestCase(.8, .4)] // OK
    [TestCase(-.8, .4)] // FAIL reason 2
    [TestCase(.8, -.4)] // FAIL reason 2
    [TestCase(-.8, -.4)] // OK
    // Combinations of negative sign for one value greater than 1 and the other value less than 1
    [TestCase(-.8, 4)] // FAIL reason 2
    [TestCase(8, -.4)] // FAIL reason 2
    // Combinations of one value greater than 1 and the other value less than 1
    [TestCase(.8, 4)] // OK
    [TestCase(8, .4)] // FAIL reason 1
    public void ShouldImproveLearnDataSetWithNegativeExpectedValues(double expectedOutput, double x)
    {
        var sut = _netBuilder.Build(1, 1); // one input, only one layer with one output
        sut.NetSpeedCoefficient = .9;

        for (int i = 0; i < 400; i++)
        {
            sut.Feed(new[] { x }, new[] { expectedOutput });
        }

        var postFeedOutput = sut.Ask(new[] { x }).First();
        var postFeedDifference = Math.Abs(postFeedOutput - expectedOutput);
        postFeedOutput.Should().NotBe(double.NaN);
        postFeedDifference.Should().BeLessThan(1e-5);
    }

I was very disappointed because most of the test cases failed (only the three marked with // OK passed). I dug into the code and found some interesting facts.

  1. The hyperbolic tangent's maximum value is 1. So no matter how big the sum of weight * input is, the absolute value of the neuron's output will always be <= 1. In other words, the net can never learn a function whose return value has an absolute value greater than 1 (see the sketch after this list). That explains all the failures of the test cases with expected outputs of 8 or -8.
  2. In the test cases where exactly one of the numbers is negative, the final weight should also be negative. At first the weight decreases, but it never becomes negative: it either stops around 0 or jumps back and forth around 0.
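Point 1 can be seen with a one-off sketch (not part of the test project; the input values are arbitrary):

    // The hyperbolic tangent saturates at +/-1, so no weighted sum can make
    // a single tanh neuron output 8 or -8.
    using System;

    class TanhBoundSketch
    {
        static void Main()
        {
            foreach (double sum in new[] { 0.5, 3.0, 10.0, 1000.0 })
            {
                // Even for huge weighted sums the output never exceeds 1 in absolute value.
                Console.WriteLine($"tanh({sum}) = {Math.Tanh(sum)}");
            }
        }
    }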

Is a neural net only capable of solving problems with 0..1 input values and 0..1 expected output values, or is there something wrong with my implementation?

Andrzej Gis

1 Answer


You can have other outputs from a NN. If you want a discrete output (classification), use softmax regression. If instead you want a continuous output (regression), then you have to create a bijective map between the range of your output (min, max) and (0, 1). In most cases the map f: (min, max) -> (0, 1), f(x) = (x - min) / (max - min), is sufficient.
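A minimal sketch of that mapping (the range -8..8 is only illustrative, taken from the test cases; this is not the asker's NetBuilder API): scale the targets before training and map the net's answer back afterwards.

    using System;

    class TargetScalingSketch
    {
        const double Min = -8, Max = 8;   // illustrative output range covering the test cases

        static double Scale(double v)   => (v - Min) / (Max - Min);   // (min, max) -> (0, 1)
        static double Unscale(double v) => v * (Max - Min) + Min;     // (0, 1) -> (min, max)

        static void Main()
        {
            Console.WriteLine(Scale(4));        // 0.75 - now inside the activation's range
            Console.WriteLine(Unscale(0.75));   // 4 - reading a prediction back out
        }
    }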

In test cases where one of the numbers is negative the final weight should also be negative

Why should the final weight also be negative?

You can have any numbers as input. (Though it is good practice to normalize the features to smaller ranges, usually by making them have mean 0 and standard deviation 1.)
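A minimal sketch of that normalization, assuming the features arrive as a plain array (the values are just the ones from the test cases):

    using System;
    using System.Linq;

    class StandardizeSketch
    {
        static void Main()
        {
            double[] inputs = { 8, -8, 4, -4, .8, -.8, .4, -.4 };

            double mean = inputs.Average();
            double std  = Math.Sqrt(inputs.Average(v => (v - mean) * (v - mean)));

            // Shift to mean 0 and rescale to standard deviation 1.
            double[] standardized = inputs.Select(v => (v - mean) / std).ToArray();

            Console.WriteLine(string.Join(", ", standardized.Select(v => v.ToString("F2"))));
        }
    }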

Corei13
  • I mean that if I only have one weight (this is a very small net), then the only way to turn a negative input into a positive output is for the weight to become negative. In my net the weight fluctuates around 0, but never goes significantly negative. – Andrzej Gis Jul 21 '14 at 08:38
  • What are the defined ranges of your input and output values, and what is the mapping rule? (I mean, if the input is 8, what should the output be? I don't understand.) – Ranic Jul 21 '14 at 14:19
  • There is one neuron in this net and it has one weight to the only input. First the weight gets initialized with a random 0..1 number - let's call it 'X'. Then I try to teach the net to output -0.8 when I put 0.4 on the input. At first the output is 0.4*X (remember, X is positive). Over the iterations the weight X constantly decreases (which is good) until it reaches 0. Then it fluctuates (which is bad) between the positive and negative side of zero, and it never gets near -2 (0.4 * (-2) = -0.8). I'm using this algorithm: http://home.agh.edu.pl/~vlsi/AI/backp_t_en/backprop.html – Andrzej Gis Jul 21 '14 at 18:58
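For reference, a minimal sketch of a single tanh neuron trained by gradient descent on the failing case (x = 0.4, target = -0.8). This is not the asker's NetBuilder code; the update rule is the standard one for E = 0.5 * (y - target)^2, and the starting weight and learning rate are assumed. With a correct update the weight does become clearly negative and settles near atanh(-0.8) / 0.4 ≈ -2.75, so a weight that keeps oscillating around 0 points to a bug in the weight-update step rather than a limitation of the net:

    using System;

    class SingleTanhNeuronSketch
    {
        static void Main()
        {
            double x = 0.4, target = -0.8;
            double w = 0.5;      // positive random-ish start, as described in the comment
            double lr = 0.9;     // same value as NetSpeedCoefficient in the test

            for (int i = 0; i < 400; i++)
            {
                double y = Math.Tanh(w * x);
                double gradient = (y - target) * (1 - y * y) * x;   // dE/dw
                w -= lr * gradient;
            }

            // Expected: w is roughly -2.75 and the output is roughly -0.8
            Console.WriteLine($"w = {w:F4}, output = {Math.Tanh(w * x):F4}");
        }
    }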