I wrote a simple program to classify a set of linearly separable random 2D points. I used a perceptron and trained it with the fit method. Now I'd like to train the perceptron one point at a time, plotting the hyperplane (a line in this case) after each update using the new weights. What I want to obtain is an animation that shows how the line becomes more and more precise at dividing the two sets. The fit method takes the entire training set; what about partial_fit? Can I write a loop that feeds the method a single new input/output pair each time and reads coef_ and intercept_ after every call?
I read the documentation here http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDClassifier.html but I have some doubts about how to implement it.
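To make the idea concrete, here is a minimal sketch of such a loop, not a definitive implementation. It assumes a recent scikit-learn, uses toy data generated like the points described above, and passes each sample as a one-row slice so that X is 2D and y is 1D. The names snapshots and boundary_line are hypothetical helpers introduced only for this illustration.

```python
import numpy as np
from sklearn.linear_model import Perceptron

# Toy data (an assumption for this sketch): points labelled by the line y = -2x + 3
rng = np.random.RandomState(0)
X = rng.rand(50, 2) * 20 - 10
y = np.where(X[:, 1] >= -2 * X[:, 0] + 3, 1, -1)

ppn = Perceptron(eta0=0.6, random_state=0)
classes = np.unique(y)  # every class must be declared on the first partial_fit call

snapshots = []  # one (weights, bias) pair per training step
for i in range(X.shape[0]):
    # X[i:i + 1] has shape (1, 2) and y[i:i + 1] has shape (1,)
    ppn.partial_fit(X[i:i + 1], y[i:i + 1], classes=classes)
    snapshots.append((ppn.coef_[0].copy(), ppn.intercept_[0]))


def boundary_line(w, b):
    """Return (slope, intercept) of the line w[0]*x + w[1]*y + b = 0.

    Returns None when w[1] == 0, because the line is then vertical (or undefined).
    """
    if w[1] == 0:
        return None
    return -w[0] / w[1], -b / w[1]
```

Each entry of snapshots can then be passed to boundary_line to get the line to draw for that frame of the animation.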
EDIT 1
Thanks to Vivek Kumar, I added the partial_fit method to my code. The program creates a set of random 2D points and, for each point, produces an output that is 1 if the point is on or above a line and -1 if it is below the line. The code works with the fit method, but this version has some problems with the data shape. I tried to use reshape on the X data without any improvement.
import numpy as np
import matplotlib.pyplot as plt
def createLinearSet(nCamp, mTest, qTest):
    y_ = []
    X_ = np.random.rand(nCamp, 2)*20-10
    for n in range(nCamp):
        if X_[n][1] >= mTest*X_[n][0]+qTest :
            y_.append(1)
        else:
            y_.append(-1)
    return X_, y_
########################################################################
# VARIABLES
iterazioni = 100
eta = 0.6
y = []
error = []
########################################################################
# CREATING DATA SET
m_test = -2
q_test = 3
n_camp = 100
X, y = createLinearSet(n_camp, m_test, q_test)
########################################################################
# 70 % training data and 30 % test data
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 0)
########################################################################
# Data normalization
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
sc.fit(X_train) # Compute the mean and standard deviation of the training samples
X_train_std = sc.transform(X_train) # Standardize the training and test data
X_test_std = sc.transform(X_test) # NB: the training mean and deviation are used for both,
# so that they are comparable
########################################################################
# Perceptron initialization
from sklearn.linear_model import Perceptron
ppn = Perceptron(n_iter = iterazioni, eta0 = eta, random_state = 0)
########################################################################
# Online training
num_samples = X_train_std.shape[0]
classes_y = np.unique(y_train)
X_train_std = X_train_std.reshape(-1, 2)
for i in range(num_samples):
    ppn.partial_fit(X_train_std[i], y_train[i], classes = classes_y )
########################################################################
# Using test data for evaluation
y_pred = ppn.predict(X_test_std)
########################################################################
# Prediction accuracy
from sklearn.metrics import accuracy_score
accuracy = accuracy_score(y_test, y_pred) * 100
print("Accuracy: {} %".format(round(accuracy,2)))
print(ppn.coef_, ppn.intercept_)
As you can see, the problem is in the "Online training" section. The error is:
/usr/local/lib/python3.5/dist-packages/sklearn/utils/validation.py:395: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
According to the documentation, X must be: X : {array-like, sparse matrix}, shape (n_samples, n_features)
If I print a single sample of X, the output is: [-0.25547959 -1.4763508 ]
Where is the error?
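The warning concerns the shape of a single row rather than of the whole array. The small NumPy sketch below illustrates this under the assumption that the training array has shape (n_samples, 2), in which case reshape(-1, 2) on the whole array changes nothing. The two-row array X is made up for the illustration.

```python
import numpy as np

# Hypothetical two-sample array standing in for X_train_std
X = np.array([[-0.25547959, -1.4763508],
              [0.5, 1.2]])

row = X[0]                      # integer indexing drops the sample axis
print(row.shape)                # (2,)  -> a 1D array, which triggers the warning

as_sample = row.reshape(1, -1)  # reshape returns a new array, it does not modify row
print(as_sample.shape)          # (1, 2) -> one sample, two features

sliced = X[0:1]                 # slicing keeps the sample axis
print(sliced.shape)             # (1, 2)
```

Because reshape returns a new array instead of changing the original in place, the reshaped result has to be the value that is passed to partial_fit.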
EDIT 2
Passing X_train_std[i].reshape(1,-1) to partial_fit inside the loop gives me this error:
Traceback (most recent call last):
  File "Perceptron_Retta_Online.py", line 57, in <module>
    ppn.partial_fit(X_train_std[i].reshape(1,-1), y_train[i], classes = classes_y )
  File "/usr/local/lib/python3.5/dist-packages/sklearn/linear_model/stochastic_gradient.py", line 512, in partial_fit
    coef_init=None, intercept_init=None)
  File "/usr/local/lib/python3.5/dist-packages/sklearn/linear_model/stochastic_gradient.py", line 344, in _partial_fit
    X, y = check_X_y(X, y, 'csr', dtype=np.float64, order="C")
  File "/usr/local/lib/python3.5/dist-packages/sklearn/utils/validation.py", line 526, in check_X_y
    y = column_or_1d(y, warn=True)
  File "/usr/local/lib/python3.5/dist-packages/sklearn/utils/validation.py", line 562, in column_or_1d
    raise ValueError("bad input shape {0}".format(shape))
ValueError: bad input shape ()
ValueError: bad input shape ()
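The traceback shows the exception being raised while y is validated in column_or_1d, and the reported shape () is the shape of a scalar. The sketch below illustrates that a single label taken from a list has that shape, and shows two ways to turn it into a 1D array of length one. The short y_train list is made up for the illustration.

```python
import numpy as np

# Hypothetical labels standing in for y_train (a plain Python list)
y_train = [1, -1, 1]

label = y_train[0]           # a bare int, not an array
print(np.shape(label))       # ()   -> the shape named in the ValueError

as_batch = [label]           # wrap the label in a list
print(np.shape(as_batch))    # (1,) -> one label for one sample

sliced = y_train[0:1]        # slicing a list also yields a length-1 list
print(np.shape(sliced))      # (1,)
```

With X reshaped to (1, 2), passing [y_train[i]] or y_train[i:i+1] as y gives partial_fit one label per sample, which is the shape the validation step expects.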