I am using train_test_split to split my data into training and test sets. Dividing the data this way is an interesting concept, but what if I want to load some data that wasn't in the test data (or in the dataset at all)?
My problem is that train_test_split picks the samples randomly, and I'd like to see which label an outside image belongs to.
Currently, I'm extracting 22 features from each image and using those features to train a linear SVC for recognition. With train_test_split I get 94% accuracy on the test set, which is alright. What I want to do is simply test the classifier on an image that wasn't in the dataset. train_test_split takes its data from a previously loaded dataset for training and testing, but I would like to load an image and test it directly.
Reproducible example (3 images with 10 features each):
import numpy as np
import sklearn
from sklearn.model_selection import train_test_split
from sklearn import metrics
from sklearn.svm import SVC

y_target = [1]*1 + [2]*1 + [3]*1  # labels: one image for each of 3 people
data = np.asarray([[152., 236., 228., 168., 236., 224., 70., 223., 175., 195.],
                   [140., 233., 226., 161., 234., 220., 67., 220., 159., 194.],
                   [135., 233., 225., 157., 234., 221., 65., 220., 159., 193.]])
svc_ = SVC(kernel='linear', C=0.00005)
A_train, A_test, b_train, b_test = train_test_split(
    data, y_target, test_size=0.25, random_state=0)

def train(clf, A_train, A_test, b_train, b_test):
    # Fit the classifier, then report accuracy on the data it was trained on
    clf.fit(A_train, b_train)
    print("Accuracy on training set:")
    print(clf.score(A_train, b_train))

train(svc_, A_train, A_test, b_train, b_test)
For instance, how would I test the following image's features?
([[126., 232., 225., 149., 231., 222., 60., 218., 152., 191.]])
So, what I am doing is selecting a specific image and editing it a bit. Then I'd like to see how my classifier does on this edited image, which wasn't used for training and wasn't in the dataset. For instance, if I picked an image from the internet, how would I test it?
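To make the goal concrete, here is a minimal sketch of what I have in mind, assuming the new image's features are extracted in exactly the same way (same order, same number of features) as the training features. It fits the classifier on the three labelled rows from the example and then calls predict on the outside feature vector. The names clf, new_image and predicted_label are only illustrative, and fitting on all the data instead of the train split is my own simplification.

```python
import numpy as np
from sklearn.svm import SVC

# Same toy data as in the example: 3 images, 10 features each, one label per image
data = np.asarray([[152., 236., 228., 168., 236., 224., 70., 223., 175., 195.],
                   [140., 233., 226., 161., 234., 220., 67., 220., 159., 194.],
                   [135., 233., 225., 157., 234., 221., 65., 220., 159., 193.]])
y_target = [1, 2, 3]

# Fit on the labelled data (assumption: no split is needed just to label a new image)
clf = SVC(kernel='linear', C=0.00005)
clf.fit(data, y_target)

# Features of the outside image; must be 2D with shape (n_samples, n_features)
new_image = np.asarray([[126., 232., 225., 149., 231., 222., 60., 218., 152., 191.]])

# predict returns one label per row of new_image
predicted_label = clf.predict(new_image)
print(predicted_label)
```

The same predict call should work on a classifier fitted on A_train and b_train, since train_test_split only decides which rows are used for fitting and scoring.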