
I am trying to build my first ML.NET project. I have built the same model before with Azure ML, the Visual Interface, Python and so on, but now I want to do it in C#.

I was following this tutorial, but with a totally different dataset and purpose.

The dataset has a lot of extra columns, but my data model looks like the following (each `LoadColumn` attribute gives the index of that column in the dataset):

using Microsoft.ML.Data;

namespace ML_Net
{

    public class Earthquake
    {
        [LoadColumn(1)]
        public int geo_level_1_id { get; set; }
        [LoadColumn(2)]
        public int geo_level_2_id { get; set; }
        [LoadColumn(3)]
        public int geo_level_3_id { get; set; }
        [LoadColumn(4)]
        public int count_floors_pre_eq { get; set; }
        [LoadColumn(5)]
        public int age { get; set; }
        [LoadColumn(6)]
        public int area { get; set; }
        [LoadColumn(7)]
        public int height { get; set; }
        [LoadColumn(8)]
        public int count_families { get; set; }
        [LoadColumn(26)]
        public int has_secondary_use { get; set; }
        [LoadColumn(27)]
        public double square { get; set; }
        [LoadColumn(39)]
        public double difference { get; set; }
        [LoadColumn(40)]
        public int damage_grade { get; set; }
    }

    public class DamagePrediction
    {
        [ColumnName("PredictedLabel")]
        public int damage_grade;
    }
}

The error comes from the training function:

public static IEstimator<ITransformer> BuildAndTrainModel(IDataView trainingDataView, IEstimator<ITransformer> pipeline)
{
    var trainingPipeline = pipeline
        .Append(_mlContext.MulticlassClassification.Trainers
        .SdcaMaximumEntropy("Label", "Features"))
        .Append(_mlContext.Transforms.Conversion
        .MapKeyToValue("PredictedLabel"));

    _trainedModel = trainingPipeline.Fit(trainingDataView);
    _predEngine = _mlContext.Model
        .CreatePredictionEngine<Earthquake, DamagePrediction>(_trainedModel);

    Earthquake building = new Earthquake()
    {
        geo_level_1_id = 1,
        geo_level_2_id = 42,
        geo_level_3_id = 941,
        count_floors_pre_eq = 2,
        age = 0,
        area = 24,
        height = 4,
        count_families = 2,
        has_secondary_use = 0,
        square = 4.898979485566356,
        difference = 0.8989794855663558
    };

    var prediction = _predEngine.Predict(building);
    Console.WriteLine($"=============== Single Prediction just-trained-model - Result: {prediction.damage_grade} ===============");


    return trainingPipeline;
}

It says:

Exception thrown: 'System.ArgumentOutOfRangeException' in Microsoft.ML.Data.dll
An unhandled exception of type 'System.ArgumentOutOfRangeException' occurred in Microsoft.ML.Data.dll
Schema mismatch for feature column 'Features': expected Vector<Single>, got Vector<Int32>

I cannot work out what the problem is. Could you please help me with some ideas?

I work with numerical data only, which is why I didn't add any transformation or featurization step, but maybe normalization could help, as some of my columns are floating-point.
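The error message asks for a `Features` column of type `Vector<Single>`, so one direction to explore is converting every feature column to `Single` before concatenating them. Below is a minimal sketch of that idea, not a confirmed fix. The class `FeaturePipelineSketch` and the method `BuildFeaturePipeline` are hypothetical names introduced only for illustration. It also assumes that the pipeline passed to `BuildAndTrainModel` is where `Label` and `Features` get created, which is not shown in the code above.

```csharp
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

namespace ML_Net
{
    // Hypothetical helper, not part of the original project.
    public static class FeaturePipelineSketch
    {
        public static IEstimator<ITransformer> BuildFeaturePipeline(MLContext mlContext)
        {
            // Every Earthquake property except the label (damage_grade).
            var featureColumns = new[]
            {
                "geo_level_1_id", "geo_level_2_id", "geo_level_3_id",
                "count_floors_pre_eq", "age", "area", "height",
                "count_families", "has_secondary_use", "square", "difference"
            };

            // Convert each column in place (same input and output name) to Single.
            var conversions = featureColumns
                .Select(name => new InputOutputColumnPair(name))
                .ToArray();

            return mlContext.Transforms.Conversion
                // Multiclass trainers expect a key-typed label column.
                .MapValueToKey("Label", "damage_grade")
                // Int32 and Double columns become Single.
                .Append(mlContext.Transforms.Conversion.ConvertType(conversions, DataKind.Single))
                // All columns now share one type, so they form a Vector<Single>.
                .Append(mlContext.Transforms.Concatenate("Features", featureColumns));
        }
    }
}
```

If this assumption holds, the result of `BuildFeaturePipeline(_mlContext)` could be passed as the `pipeline` argument of `BuildAndTrainModel`. A normalization step such as `NormalizeMinMax("Features")` could be appended afterwards, but it would not change the column type by itself.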

Thank you in advance for all the ideas!

Eve
  • I think the problem is that you've declared some properties in `Earthquake` as `double` when they should be `float` (aka `System.Single`) – grooveplex Oct 17 '19 at 15:53
  • Seriously, that's it? :O I'll give it a try later today, thanks :) – Eve Oct 17 '19 at 16:00
  • The mapping between the dataset and the C# code does not match. Your `LoadColumn` indexes could be wrong. The index probably starts at zero and you are starting at one. – jdweng Oct 17 '19 at 16:01
  • @Eva I'm not sure, it's just a guess – grooveplex Oct 17 '19 at 16:02
  • @jdweng, I do index from 0. I load each column at the position it has in the dataset, whether that is index 0 or something else. I'll double-check that I numbered them correctly, thanks :) – Eve Oct 17 '19 at 19:10
  • @grooveplex they actually are double. When I switch to float, I now get an error saying that it cannot convert from double to float. – Eve Oct 17 '19 at 19:12
  • So the vector could be converted somehow. Now I'm trying to figure out where and how :) – Eve Oct 17 '19 at 19:57
