
This question is with reference to the C# Lunar Lander example from the Encog repository. As in the example, I am using NeuralSimulatedAnnealing to train my multi-layer feedforward network (50 epochs):

BasicNetwork network = CreateNetwork();

IMLTrain train;
train = new NeuralSimulatedAnnealing(network, new PilotScore(), 10, 2, 100);


public static BasicNetwork CreateNetwork() {
    var pattern = new FeedForwardPattern {InputNeurons = 3};
    pattern.AddHiddenLayer(50);
    pattern.OutputNeurons = 1;
    pattern.ActivationFunction = new ActivationTANH();
    var network = (BasicNetwork) pattern.Generate();
    network.Reset();
    return network;
}
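
For reference, the PilotScore passed to the trainer above is the example's fitness function: it flies a NeuralPilot with the candidate network and returns the landing score, which NeuralSimulatedAnnealing then maximizes. A minimal sketch of that idea, assuming the standard Encog ICalculateScore members (the repository version may differ slightly):

public class PilotScore : ICalculateScore
{
    // Fly the candidate network through the lander simulation and use
    // the resulting landing score as its fitness.
    public double CalculateScore(IMLMethod network)
    {
        var pilot = new NeuralPilot((BasicNetwork) network, false);
        return pilot.ScorePilot();
    }

    // Higher scores are better, so the trainer maximizes rather than minimizes.
    public bool ShouldMinimize
    {
        get { return false; }
    }

    // Present in newer Encog versions of ICalculateScore.
    public bool RequireSingleThreaded
    {
        get { return false; }
    }
}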

The example works great and the neural pilot learns exactly how to land the spaceship under the given conditions. However, I want something more out of it!

To do that, I created a globals class as shown below and also modified a line in the LanderSimulator class:

namespace Encog.Examples.Lunar
{
    class globals
    {
        public static int fuelConsumption { get; set; }
    }
}


 public void Turn(bool thrust){
    Seconds++;
    Velocity -= Gravity;
    Altitude += Velocity;

    if (thrust && Fuel > 0)
    {
        Fuel -= globals.fuelConsumption;    // changed from Fuel--;
        Velocity += Thrust;
    }

    Velocity = Math.Max(-TerminalVelocity, Velocity);
    Velocity = Math.Min(TerminalVelocity, Velocity);

    if (Altitude < 0)
        Altitude = 0;
}

So now fuel is consumed on each thrust according to the fuelConsumption variable. I then trained with three different values of fuelConsumption, and the following were the respective best scores of the individual networks:

//NETWORK 1
globals.fuelConsumption = 1;
bestScore: 7986

//NETWORK 2
globals.fuelConsumption = 5;
bestScore: 7422

//NETWORK 3
globals.fuelConsumption = 10;
bestScore: 6921

When I tested these networks on each other (see the sketch after this list for how I cross-scored them), the results were disappointing:

  • Network 1 showed scores of -39591 and -39661 when fuelConsumption was 5 and 10, respectively.
  • Network 2 showed scores of -8832 and -35671 when fuelConsumption was 1 and 10, respectively.
  • Network 3 showed scores of -24510 and -19697 when fuelConsumption was 1 and 5, respectively.
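
For illustration, the cross-scoring was done roughly like this (network1 here is a placeholder for the network trained with fuelConsumption = 1):

// Score the pilot trained at fuelConsumption = 1 under the other two settings.
var pilot1 = new NeuralPilot(network1, false);

globals.fuelConsumption = 5;
Console.WriteLine(@"network 1 @5: " + pilot1.ScorePilot());

globals.fuelConsumption = 10;
Console.WriteLine(@"network 1 @10: " + pilot1.ScorePilot());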

So I tried to train one single network for all three scenarios, as shown below:

int epoch;

epoch = 1;
globals.fuelConsumption = 1;
for (int i = 0; i < 50; i++){
    train.Iteration();
    Console.WriteLine(@"Epoch #" + epoch + @" Score:" + train.Error);
    epoch++;
}
Console.WriteLine("--------------------------------------");

epoch = 1;
globals.fuelConsumption = 5;
for (int i = 0; i < 50; i++){
    train.Iteration();
    Console.WriteLine(@"Epoch #" + epoch + @" Score:" + train.Error);
    epoch++;
}
Console.WriteLine("--------------------------------------");
epoch = 1;
globals.fuelConsumption = 10;
for (int i = 0; i < 50; i++){
    train.Iteration();
    Console.WriteLine(@"Epoch #" + epoch + @" Score:" + train.Error);
    epoch++;
}

Console.WriteLine(@"The score of experienced pilot is:");
network = (BasicNetwork) train.Method;

var pilot = new NeuralPilot(network, false);
globals.fuelConsumption = 1;
Console.WriteLine("@1: " + pilot.ScorePilot());
globals.fuelConsumption = 5;
Console.WriteLine("@5: " + pilot.ScorePilot());
globals.fuelConsumption = 10;
Console.WriteLine("@10: " + pilot.ScorePilot());

But the results are again disappointing:

The score of experienced pilot is:
@1: -27485
@5: -27565
@10: 7448

How do I create a neural pilot that would deliver me the best score in all three scenarios?

1 Answer


In order to solve this puzzle I switched to NEAT networks rather than traditional feedforward or recurrent networks. Here are some of the interesting changes in the code.

NEATPopulation network = CreateNetwork();
TrainEA train = default(TrainEA);


public static NEATPopulation CreateNetwork(){
    int inputNeurons = 3;
    int outputNeurons = 1;
    NEATPopulation network = new NEATPopulation(inputNeurons, outputNeurons, 100);
    network.Reset();
    return network;
}

And then, after tweaking some parameters in the NeuralPilot class:

private readonly NEATNetwork _network;

public NeuralPilot(NEATNetwork network, bool track)

I had to make a change in the ScorePilot function, since NEAT networks use SteepenedSigmoidActivation on the outputs by default rather than the traditional ActivationLinear or ActivationTANH:

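// 'value' is assumed to be the network's single output for the current
// simulator state, computed just above this snippet in ScorePilot.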
bool thrust;

if (value > 0.5){       // changed from: if (value > 0){
    thrust = true;
    if (_track)
        Console.WriteLine(@"THRUST");
}
else
    thrust = false;

So now training one single network looks like this:

OriginalNEATSpeciation speciation = default(OriginalNEATSpeciation);
speciation = new OriginalNEATSpeciation();

int epoch;
double best_1, best_5, best_10;
best_1 = best_5 = best_10 = 0;

train = NEATUtil.ConstructNEATTrainer(network, new PilotScore());
train.Speciation = speciation;

epoch = 1;
globals.fuelConsumption = 1;
for (int i = 0; i < 50; i++){
    train.Iteration();
    Console.WriteLine(@"Epoch #" + epoch + @" Score:" + train.Error);
    best_1 = train.Error;
    epoch++;
}
Console.WriteLine("--------------------------------------");

train = NEATUtil.ConstructNEATTrainer(network, new PilotScore());
train.Speciation = speciation;

epoch = 1;
globals.fuelConsumption = 5;
for (int i = 0; i < 50; i++){
    train.Iteration();
    Console.WriteLine(@"Epoch #" + epoch + @" Score:" + train.Error);
    best_5 = train.Error;
    epoch++;
}
Console.WriteLine("--------------------------------------");

train = NEATUtil.ConstructNEATTrainer(network, new PilotScore());
train.Speciation = speciation;

epoch = 1;
globals.fuelConsumption = 10;
for (int i = 0; i < 50; i++){
    train.Iteration();
    Console.WriteLine(@"Epoch #" + epoch + @" Score:" + train.Error);
    best_10 = train.Error;
    epoch++;
}

Console.WriteLine(@"The score of experienced pilot is:");

NEATNetwork trainedNetwork = default(NEATNetwork);
trainedNetwork = (NEATNetwork)train.CODEC.Decode(network.BestGenome);

var pilot = new NeuralPilot(trainedNetwork, false);
globals.fuelConsumption = 1;
Console.WriteLine("@bestScore of " + best_1.ToString() +" @1: liveScore is " + pilot.ScorePilot());
globals.fuelConsumption = 5;
Console.WriteLine("@bestScore of " + best_5.ToString() + " @5: liveScore is " + pilot.ScorePilot());
globals.fuelConsumption = 10;
Console.WriteLine("@bestScore of " + best_10.ToString() + " @10: liveScore is " + pilot.ScorePilot());

The results are dicey! Following are some results from random test runs:

The score of experienced pilot is:
@bestScore of 5540 @1: liveScore is -4954
@bestScore of 1160 @5: liveScore is 3823
@bestScore of 3196 @10: liveScore is 3196

The score of experienced pilot is:
@bestScore of 7455 @1: liveScore is 8227
@bestScore of 6324 @5: liveScore is 7427
@bestScore of 6427 @10: liveScore is 6427

The score of experienced pilot is:
@bestScore of 5322 @1: liveScore is -4617
@bestScore of 1898 @5: liveScore is 9531
@bestScore of 2086 @10: liveScore is 2086

The score of experienced pilot is:
@bestScore of 7493 @1: liveScore is -3848
@bestScore of 4907 @5: liveScore is -13840
@bestScore of 4954 @10: liveScore is 4954

The score of experienced pilot is:
@bestScore of 6560 @1: liveScore is 4046
@bestScore of 5775 @5: liveScore is 3366
@bestScore of 2516 @10: liveScore is 2516

As you can see, we did manage to get positive scores all the way through in the second run, but there doesn't seem to be any relation between the final network's performance and the initial best-score values. Hence, the issue might be resolved, but not in a satisfactory manner.

  • Note that the same network and speciation are used successively, but with a new trainer each time. Is that logically correct? – dexterslab Feb 15 '16 at 18:26