I'm an ML.NET newbie and want to learn more about ML.NET by solving the XOR problem. This is what I've come up with so far, but the output always appears to be the same (zero), regardless of input.

No doubt I've made a rookie mistake, but what?

using Microsoft.ML.Legacy;
using Microsoft.ML.Legacy.Data;
using Microsoft.ML.Legacy.Models;
using Microsoft.ML.Legacy.Trainers;
using Microsoft.ML.Legacy.Transforms;
using Microsoft.ML.Runtime.Api;
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using Microsoft.ML.Runtime; 

public class Program
{
    static void Main(string[] args)
    {
        MlNet.Solve();
        Console.ReadLine();
    }
}

Am I using a suitable regressor (StochasticDualCoordinateAscentRegressor)?

public class MlNet
{
    public static void Solve()
    {
        var data = new List<Input>
        {
            new Input {Input1 = 0.0f, Input2 = 0.0f, Output = 0.0f},
            new Input {Input1 = 0.0f, Input2 = 1.0f, Output = 1.0f},
            new Input {Input1 = 1.0f, Input2 = 0.0f, Output = 1.0f},
            new Input {Input1 = 1.0f, Input2 = 1.0f, Output = 0.0f}
        };

        var largeSet = Enumerable.Repeat(data, 1000).SelectMany(a => a).ToList();
        var dataSource = CollectionDataSource.Create(largeSet.AsEnumerable());
        var pipeline = new LearningPipeline
        {
            dataSource,
            new ColumnConcatenator("Features", "Input1", "Input2"),
            new StochasticDualCoordinateAscentRegressor
            {
                LossFunction = new SquaredLossSDCARegressionLossFunction(),
                MaxIterations = 500,
                BiasLearningRate = 0.2f,
                Shuffle = true
            }
        };

        var model = pipeline.Train<Input, Prediction>();
        var evaluator = new RegressionEvaluator();
        var metrics = evaluator.Evaluate(model, dataSource);

        Console.WriteLine($"RMS error: {Math.Round(metrics.Rms, 2)}");

        var prediction = model.Predict(new Input { Input1 = 0.0f, Input2 = 1.0f });

        Console.WriteLine($"Prediction: {prediction.Output}");
    }


    [DebuggerDisplay("Input1={Input1}, Input2={Input2}, Output={Output}")]
    public class Input
    {
        [Column("0", "Input1")] public float Input1 { get; set; }

        [Column("1", "Input2")] public float Input2 { get; set; }

        [Column("2", "Label")] public float Output { get; set; }
    }

    public class Prediction
    {
        [ColumnName("Label")] public float Output { get; set; }
    }
}
  • Please provide a Complete, Minimal, Verifiable example. I cannot even get this code to compile, probably because you use very scenario-specific types like `CollectionDataSource`, so I would have to guess which references I even need to add. – Christopher Nov 25 '18 at 23:20
  • My bad, I made the massive assumption that it would be obvious, given this is an ML.NET question, that the NuGet package 'Microsoft.ML' would be needed. I've added the 'usings' to make it clearer. – Andrew Bennett Nov 26 '18 at 07:27
  • Which version of ML.NET are you using? – c-chavez Nov 29 '18 at 13:24
  • @c-chavez 4.7.0 – Andrew Bennett Nov 29 '18 at 21:43

1 Answer

Your Prediction object is retrieving the original Label column, instead of the output of the regressor.

Modify the code to be:

public class Prediction
{
    [ColumnName("Score")] public float Output { get; set; }
}

Also note that, by choosing StochasticDualCoordinateAscentRegressor, you are fitting a linear model, that is, a function of the form b + w1*x1 + w2*x2, to the target y = x1 XOR x2. No linear function comes close to XOR, so I wouldn't be surprised at all if the learner converges to something arbitrary.
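
To make the linearity point concrete, here is a short worked derivation using the same symbols b, w1, w2. It is only a sketch and assumes plain, unregularized least squares on the four distinct rows of the truth table; SDCA with its default regularization may settle on somewhat different values. The first group of equations shows that an exact fit is impossible, the second finds the best squared-loss fit.

```latex
% Exact fit: all four rows of the XOR truth table would have to hold at once.
\begin{aligned}
(x_1,x_2)=(0,0):&\quad b = 0 \\
(x_1,x_2)=(0,1):&\quad b + w_2 = 1 \\
(x_1,x_2)=(1,0):&\quad b + w_1 = 1 \\
(x_1,x_2)=(1,1):&\quad b + w_1 + w_2 = 0
\end{aligned}
% The first three rows force b = 0 and w_1 = w_2 = 1,
% so the fourth row would read 2 = 0: a contradiction.

% Best squared-loss fit: set the gradient of
% \sum_i (b + w_1 x_{1,i} + w_2 x_{2,i} - y_i)^2 to zero.
\begin{aligned}
\partial/\partial b:&\quad 4b + 2w_1 + 2w_2 = 2 \\
\partial/\partial w_1:&\quad 2b + 2w_1 + w_2 = 1 \\
\partial/\partial w_2:&\quad 2b + w_1 + 2w_2 = 1
\end{aligned}
\quad\Longrightarrow\quad
b = \tfrac{1}{2},\qquad w_1 = w_2 = 0 .
```

Under these assumptions the best linear model predicts 0.5 for every input, with an RMS error of 0.5. Repeating the four rows 1000 times, as the question does, scales the equations but does not change this minimizer.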

If, on the other hand, you used FastTreeRegressor, you would be learning an ensemble of decision trees, which will of course have no problem learning XOR.

Zruty
  • Absolutely brilliant. Thank you. I wanted to use a gradient descent approach solely because that's my 'HelloWorld' approach, but was fumbling in the dark because there are so many options with ML.NET. Looks like I picked the wrong one, but as you say, the FastTreeRegressor works well. Thanks again. – Andrew Bennett Dec 01 '18 at 20:58