I'm having a problem building an ML.NET pipeline. I've read through a lot of Microsoft documentation, but I think the problem is that I just don't understand it. I was wondering if I could get some help from this community.
What I'm trying to do is predict when a train will be called. I have gathered a lot of data and put it into a CSV file. The first column is when the train was predicted to be called. The second column is when the train was actually called. Both values are Unix timestamps in seconds. (I can convert the data to C# DateTime format if that's easier.)
Here is a sample of the data:
1682556540,1682571900
1682760480,1682786700
1683057540,1683056460
1683269880,1683274500
1683456840,1683445500
1683612960,1683814800
1684001940,1683975900
1684194420,1684203600
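To show what one row means, here is a minimal sketch (separate from my actual program) that converts the first sample row to dates and computes the gap between the predicted and actual call. The class name `TimestampDemo` and the helper `DelayMinutes` are placeholders I made up for illustration. I'm assuming the timestamps are in seconds and in UTC.

```csharp
using System;

static class TimestampDemo
{
    // Hypothetical helper: minutes between the predicted and the actual call time.
    // A negative result means the train was called earlier than predicted.
    public static double DelayMinutes(long predictedUnix, long actualUnix)
    {
        DateTimeOffset predicted = DateTimeOffset.FromUnixTimeSeconds(predictedUnix);
        DateTimeOffset actual = DateTimeOffset.FromUnixTimeSeconds(actualUnix);
        return (actual - predicted).TotalMinutes;
    }

    static void Main()
    {
        // First sample row: predicted call time, actual call time.
        long predictedUnix = 1682556540;
        long actualUnix = 1682571900;

        // "u" prints a sortable UTC date/time string.
        Console.WriteLine(DateTimeOffset.FromUnixTimeSeconds(predictedUnix).ToString("u"));
        Console.WriteLine(DateTimeOffset.FromUnixTimeSeconds(actualUnix).ToString("u"));

        // 1682571900 - 1682556540 = 15360 seconds = 256 minutes.
        Console.WriteLine($"Delay: {DelayMinutes(predictedUnix, actualUnix)} minutes");
    }
}
```

So in the first row the train was called 256 minutes after the predicted time. In the third row (1683057540,1683056460) it was called 18 minutes before the predicted time.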
This is the code I have so far. All of this code I have copied from various code samples and tutorials I've been looking at. I've been going through the Microsoft documentation to TRY to understand each line. Like I said, the pipeline has me stumped right now.
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

namespace TrainPrediction
{
    // One row of the CSV file
    class TrainData
    {
        [LoadColumn(0)]
        public float PredictedTime;

        [LoadColumn(1)]
        public float ActualTime;
    }

    // Output of the model
    class Prediction
    {
        [ColumnName("Score")]
        public float PredictedTime;
    }

    class Program
    {
        static void Main(string[] args)
        {
            var mlContext = new MLContext();

            // Load the data
            var dataPath = @"d:\temp\aiengine-601.csv";
            var dataView = mlContext.Data.LoadFromTextFile<TrainData>(dataPath, separatorChar: ',');

            // Define the pipeline
            var pipeline = mlContext.Transforms.Conversion.MapValueToKey("Label")
                .Append(mlContext.Transforms.Concatenate("Features", "PredictedTime"))
                .Append(mlContext.Transforms.NormalizeMinMax("Features"))
                .Append(mlContext.Transforms.Conversion.MapKeyToValue("Label"))
                .Append(mlContext.Regression.Trainers.FastTree());

            // Train the model (this is the line that throws)
            var model = pipeline.Fit(dataView);

            // Create a prediction engine
            var predictionEngine = mlContext.Model.CreatePredictionEngine<TrainData, Prediction>(model);

            // Prompt the user for a prediction time
            Console.Write("Enter a prediction time (Unix timestamp): ");
            if (float.TryParse(Console.ReadLine(), out float inputTime))
            {
                var inputData = new TrainData { PredictedTime = inputTime };
                var prediction = predictionEngine.Predict(inputData);

                // Round the predicted value to a whole number of seconds
                var predictedTime = Math.Round(prediction.PredictedTime);
                Console.WriteLine($"ML.NET predicts the train will be called at: {predictedTime}");
            }
            else
            {
                Console.WriteLine("Invalid input!");
            }
        }
    }
}
When I run this code, I get an error when I train the model (at .Fit). It states: "System.ArgumentOutOfRangeException: 'Could not find input column 'Label' (Parameter 'inputSchema')'"
I believe I'm getting this error because my pipeline is not correct.
What I'm asking is if anyone could help me get the correct pipeline, and if you feel really frisky, explain the details of the pipeline.
I'm currently looking online for a "Dummies guide to pipelines" type of explanation.