I am training a face recognition model with a linear SVC on a dataset of shape 870x22. I have 30 frames for each of 29 different people, and I use 22 pixel values from each image as features. When I call train_test_split(), it gives me an X_test of size 218x22 and a y_test of size 218. Once I have trained the classifier and try to run it on images of a new face (a 30x22 matrix), I get this error:
ValueError: Found input variables with inconsistent numbers of samples: [218, 30]
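As a sanity check on the sizes quoted above, here is a minimal sketch that reproduces the 218-sample test split. The zero-filled `X` is a placeholder standing in for the real pixel values (an assumption, not the actual dataset); only the shapes matter here.

```python
import numpy as np
from sklearn.model_selection import train_test_split

img_amount = 30   # frames per person, as stated in the question
n_people = 29     # number of people, as stated in the question

# Placeholder features: zeros stand in for the real 22 pixel values per frame.
X = np.zeros((img_amount * n_people, 22))
# Labels 1..29, each repeated 30 times (one label per frame).
y = np.repeat(np.arange(1, n_people + 1), img_amount)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Print the shapes of the resulting splits.
print(X_train.shape, X_test.shape, y_test.shape)
```

With test_size=0.25 on 870 rows, the test split has 218 rows, which is the first number in the error message.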
Here's the code:
import numpy as np
import sklearn
from sklearn.model_selection import train_test_split
from sklearn import metrics
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score, f1_score
img_amount = 30
# Labels 1..29, each repeated img_amount times (one label per frame).
target = np.repeat(np.arange(1, 30), img_amount)
# Keep only the first 22 columns as features.
dataset = dataset[:, 0:22]
svc_1 = SVC(kernel='linear', C=0.00005)
X_train, X_test, y_train, y_test = train_test_split(dataset, target, test_size=0.25, random_state=0)
def train(clf, X_train, X_test, y_train, y_test):
    # Fit on the training split, then report metrics on both splits.
    clf.fit(X_train, y_train)
    print ("Accuracy on training set:")
    print (clf.score(X_train, y_train))
    print ("Accuracy on testing set:")
    print (clf.score(X_test, y_test))
    # Predictions on the held-out test split (same length as y_test).
    y_pred = clf.predict(X_test)
    print ("Classification Report:")
    print (metrics.classification_report(y_test, y_pred))
    print ("Confusion Matrix:")
    print (metrics.confusion_matrix(y_test, y_pred))

train(svc_1, X_train, X_test, y_train, y_test)
print ("Classification Report:")
print (metrics.classification_report(y_test, new_face_img))
To avoid cluttering the question, I uploaded the matrix for new_face_img to pastebin: https://pastebin.com/uRbvv5jD
Link for the dataset: Dataset
Both are plain arrays and can be assigned directly to their variables.
The error occurs on these lines, when I try to predict new samples:
predictions = svc_1.predict(new_face_img)
print ("Classification Report:")
print (metrics.classification_report(y_test, predictions))  # <-- the error is raised on this line
predictions = svc_1.predict(michael_ocluded_array)
expected=np.ones(len(michael_ocluded_array))
print ("Confusion Matrix:")
print (metrics.confusion_matrix(expected, predictions))
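The confusion-matrix snippet above builds `expected` with one label per new frame, so both arrays passed to the metric have the same length. Here is a minimal sketch of the same idea applied to `classification_report`. It uses synthetic random data and a hypothetical three-person setup rather than the real dataset, and it assumes the new batch belongs to the person labelled 1.

```python
import numpy as np
from sklearn import metrics
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the training data (assumption): 3 hypothetical people,
# 30 frames each, 22 features per frame, each person centred on a different value.
n_people, img_amount, n_features = 3, 30, 22
X_train = np.vstack([
    rng.normal(loc=5.0 * k, scale=1.0, size=(img_amount, n_features))
    for k in range(1, n_people + 1)
])
y_train = np.repeat(np.arange(1, n_people + 1), img_amount)

clf = SVC(kernel="linear").fit(X_train, y_train)

# A new batch of 30 frames, generated to resemble hypothetical person 1.
new_face_img = rng.normal(loc=5.0, scale=1.0, size=(img_amount, n_features))
predictions = clf.predict(new_face_img)

# One true label per new frame, so expected and predictions both have length 30.
expected = np.full(len(new_face_img), 1)

print(metrics.classification_report(expected, predictions, zero_division=0))
print(metrics.confusion_matrix(expected, predictions))
```

The point of the sketch is only the length bookkeeping: the true labels passed alongside `predictions` describe the new batch, not the earlier test split.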
Confusion Matrix:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
      1 predictions = svc_1.predict(michael_ocluded_array)
      2 print ("Confusion Matrix:")
----> 3 print (metrics.classification_report(y_test, predictions))

C:\ProgramData\Miniconda3\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
     70                           FutureWarning)
     71         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 72         return f(**kwargs)
     73     return inner_f
     74

C:\ProgramData\Miniconda3\lib\site-packages\sklearn\metrics\_classification.py in classification_report(y_true, y_pred, labels, target_names, sample_weight, digits, output_dict, zero_division)
   1927     """
   1928
-> 1929     y_type, y_true, y_pred = _check_targets(y_true, y_pred)
   1930
   1931     labels_given = True

C:\ProgramData\Miniconda3\lib\site-packages\sklearn\metrics\_classification.py in _check_targets(y_true, y_pred)
     79     y_pred : array or indicator matrix
     80     """
---> 81     check_consistent_length(y_true, y_pred)
     82     type_true = type_of_target(y_true)
     83     type_pred = type_of_target(y_pred)

C:\ProgramData\Miniconda3\lib\site-packages\sklearn\utils\validation.py in check_consistent_length(*arrays)
    253     uniques = np.unique(lengths)
    254     if len(uniques) > 1:
--> 255         raise ValueError("Found input variables with inconsistent numbers of"
    256                          " samples: %r" % [int(l) for l in lengths])
    257

ValueError: Found input variables with inconsistent numbers of samples: [218, 30]
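For reference, the same exception can be triggered without any of the face data. This is a minimal sketch with dummy all-ones labels (an assumption, chosen only so the array lengths match the 218 test labels and the 30 new-face predictions):

```python
import numpy as np
from sklearn import metrics

# Dummy labels: only the lengths matter for this check.
y_test = np.ones(218, dtype=int)       # same length as the held-out test labels
predictions = np.ones(30, dtype=int)   # same length as the new-face batch

try:
    metrics.classification_report(y_test, predictions)
    message = None
except ValueError as err:
    # The length check runs before any metric is computed.
    message = str(err)

print(message)
```

The two numbers in the message are the lengths of the first and second arguments, in that order.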