I am training an online-learning SVM classifier using SGDClassifier in sklearn. I learnt that this is possible using partial_fit.
My model definition is:
model = SGDClassifier(loss="hinge", penalty="l2", alpha=0.0001, max_iter=3000, tol=1e-3, shuffle=True, verbose=0, learning_rate='invscaling', eta0=0.01, early_stopping=False)
The model is created only once, on the first run.
To test it, I first trained my classifier (model 1) on the entire data using fit and got 87% model accuracy (using model.score(X_test, y_test)). Then, to demonstrate online training, I broke the same data into 4 sets and fed the 4 parts in 4 separate runs using partial_fit. This was model 2.
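A minimal sketch of the two experiments described above, for reference. It is not the original code: the synthetic data from make_classification, the train/test split, the random_state values, and the helper name make_model are all assumptions added so the example runs on its own.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption: the real dataset is not shown).
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)


def make_model():
    # Same hyperparameters as the model definition in the question;
    # random_state is added here only for reproducibility (assumption).
    return SGDClassifier(
        loss="hinge", penalty="l2", alpha=0.0001, max_iter=3000, tol=1e-3,
        shuffle=True, verbose=0, learning_rate="invscaling", eta0=0.01,
        early_stopping=False, random_state=0,
    )


# Model 1: a single call to fit on the entire training data.
model_1 = make_model()
model_1.fit(X_train, y_train)
full_fit_accuracy = model_1.score(X_test, y_test)

# Model 2: created once, then updated with partial_fit on 4 chunks.
model_2 = make_model()
classes = np.unique(y_train)  # partial_fit needs all labels on the first call
chunk_accuracies = []
for X_chunk, y_chunk in zip(np.array_split(X_train, 4),
                            np.array_split(y_train, 4)):
    model_2.partial_fit(X_chunk, y_chunk, classes=classes)
    chunk_accuracies.append(model_2.score(X_test, y_test))

print("fit accuracy:", full_fit_accuracy)
print("partial_fit accuracy after each chunk:", chunk_accuracies)
```

Note that fit may pass over the data many times (up to max_iter epochs), whereas each partial_fit call here makes a single pass over its chunk.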
But in this case, the accuracy dropped over the four runs: 87.9 -> 98.89 -> 47.7 -> 29.4.
What could be the cause of this?