I want to build an ML model that keeps improving with each incoming input. After some searching, I found that the Python library river is designed for online learning: it learns from one sample at a time, so every incoming example contributes to training. I am experimenting with the standard Iris dataset, which has 150 rows. I trained the model on 145 examples using the learn_one() method of river. For the last 5 examples, I give each one to the model as input and check whether it predicts the correct output. If the prediction is wrong, the model should learn from that example. But my model predicts the wrong class every time, and I do not see any improvement.

Below is the pipeline of my model:

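import numbers

from river import compose, preprocessing, tree


def get_pipeline():
    # Categorical branch: select object-typed features, impute, one-hot encode
    cat = (
        compose.SelectType(object)
        | preprocessing.StatImputer()
        | preprocessing.OneHotEncoder(sparse=True)
    )
    # Numeric branch: select numeric features and impute missing values
    num = compose.SelectType(numbers.Number) | preprocessing.StatImputer()
    classifier = tree.HoeffdingTreeClassifier()

    # Union of both branches feeds the classifier
    return (num + cat) | classifier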

The function below trains on each example of the dataset:

from river import metrics, stream


def train(X, y):
    pipeline = get_pipeline()

    # Initialize metrics
    f1_score = metrics.MicroF1()
    cm = metrics.ConfusionMatrix()

    f1_scores = []

    # Iterate over the dataset
    for xi, yi in stream.iter_pandas(X, y, shuffle=True, seed=1):
        # Predict the new sample
        yi_pred = pipeline.predict_one(xi)

        # Get the score
        if yi_pred is not None:
            f1_score.update(yi, yi_pred)
            f1_scores.append(f1_score.get() * 100)
            cm.update(yi, yi_pred)

        # Train the model with the new sample
        pipeline.learn_one(xi, yi)

    return f1_scores, cm, pipeline


f1_scores, cm, pipeline = train(X, y)

I trained the above model on 145 examples from the dataset, and the accuracy was good, around 95%.

But now, when I give the model individual user input, it predicts the wrong label every time. Below is the code for it:

import pandas as pd

for i in range(5):
    x = input("Enter the sepal length, sepal width, petal length and petal width separted by spaces:  ")
    elements = x.split()
    row = pd.Series(elements)
    y_pred = pipeline.predict_one(row)
    print('The corresponding species predicted by the model is:',y_pred)
    y_act = input("Please enter the correct species:  ")
    if y_act!=y_pred:
        print("Sorry for the wrong prediction I will use this example to improve myself")
    else:
        print("Hurrayy I have been correctly trained")

    pipeline.learn_one(row, y_act)

Below is the output:


Enter the sepal length, sepal width, petal length and petal width separted by spaces:  6.3 2.5 5.0 1.9
The corresponding species predicted by the model is: Iris-versicolor
Please enter the correct species:  Iris-virginica
Sorry for the wrong prediction I will use this example to improve myself
Enter the sepal length, sepal width, petal length and petal width separted by spaces:  4.4 2.9 1.4 0.2
The corresponding species predicted by the model is: Iris-versicolor
Please enter the correct species:  Iris-setosa
Sorry for the wrong prediction I will use this example to improve myself
Enter the sepal length, sepal width, petal length and petal width separted by spaces:  5.0 3.0 1.6 0.2
The corresponding species predicted by the model is: Iris-versicolor
Please enter the correct species:  Iris-setosa
Sorry for the wrong prediction I will use this example to improve myself
Enter the sepal length, sepal width, petal length and petal width separted by spaces:  6.2 2.2 4.5 1.5
The corresponding species predicted by the model is: Iris-setosa
Please enter the correct species:  Iris-versicolor
Sorry for the wrong prediction I will use this example to improve myself
Enter the sepal length, sepal width, petal length and petal width separted by spaces:  7.2 3.6 6.1 2.5
The corresponding species predicted by the model is: Iris-versicolor
Please enter the correct species:  Iris-virginica
Sorry for the wrong prediction I will use this example to improve myself
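
One thing that may be worth checking in the loop above: input() returns text, so x.split() yields strings, and pd.Series(elements) is keyed by the positions 0 to 3, whereas stream.iter_pandas passes each training row as a dict keyed by the column names of X with numeric values. The following is only a minimal sketch of converting the typed input into that shape. The names in FEATURE_NAMES and the helper parse_features are hypothetical placeholders, not part of river; the real keys would be the columns of X.

```python
import numbers

# Hypothetical feature names; replace with list(X.columns) from the training data.
FEATURE_NAMES = ["sepal_length", "sepal_width", "petal_length", "petal_width"]


def parse_features(text, feature_names=FEATURE_NAMES):
    """Turn a space-separated line of numbers into a {feature name: float} dict."""
    parts = text.split()
    if len(parts) != len(feature_names):
        raise ValueError(f"expected {len(feature_names)} values, got {len(parts)}")
    return dict(zip(feature_names, (float(p) for p in parts)))


raw = "6.3 2.5 5.0 1.9".split()
# The raw tokens are strings, so a numeric type check does not match any of them.
print(all(isinstance(v, numbers.Number) for v in raw))

features = parse_features("6.3 2.5 5.0 1.9")
# After conversion every value is a float keyed by a feature name.
print(features)
```

The resulting dict could then be passed to pipeline.predict_one(features) and pipeline.learn_one(features, y_act) in place of row. Whether this alone explains the wrong predictions is an assumption, not something confirmed here.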

I am new to online machine learning. Any suggestions for improving this model's accuracy would be a great help.