I'm trying out some classification problems with sklearn in Python for the first time, and was wondering what the best way is to calculate the error of my classifier (e.g. an SVM) solely on the training data.
My sample code for calculating accuracy and RMSE is as follows:
from math import sqrt
from sklearn import svm
from sklearn.metrics import mean_squared_error, accuracy_score

svc = svm.SVC(kernel='rbf', C=C, decision_function_shape='ovr').fit(X_train, y_train.ravel())
prediction = svc.predict(X_test)
svm_in_accuracy.append(svc.score(X_train, y_train))          # in-sample (training) accuracy
svm_out_rmse.append(sqrt(mean_squared_error(y_test, prediction)))  # y_true comes first
svm_out_accuracy.append(accuracy_score(y_test, prediction))  # same as the manual fraction-correct
I know that 'from sklearn.metrics import mean_squared_error' pretty much gets me the MSE for an out-of-sample comparison. What can I do in sklearn to get an error metric for how well (or badly) my model classified the training data? I ask because I know my data is not perfectly linearly separable (which means the classifier will misclassify some items), and I want the best way to quantify how far off it was. Any help would be appreciated!
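For reference, here's a minimal sketch of what I mean by a training-error metric: predict on the same data used to fit, then score those predictions. The toy dataset here is just a stand-in for my real X_train/y_train, and it's deliberately noisy so it isn't perfectly separable:

```python
# Minimal sketch: training (in-sample) error with sklearn.
# The random toy data below is an assumption standing in for real X_train/y_train.
import numpy as np
from sklearn import svm
from sklearn.metrics import accuracy_score, zero_one_loss, confusion_matrix

rng = np.random.RandomState(0)
X_train = rng.randn(200, 2)
# Noisy labels, so the classes overlap and some training points get misclassified
y_train = (X_train[:, 0] + X_train[:, 1] + rng.randn(200) > 0).astype(int)

svc = svm.SVC(kernel='rbf', C=1.0).fit(X_train, y_train)
train_pred = svc.predict(X_train)  # predict on the SAME data used to fit

train_acc = accuracy_score(y_train, train_pred)  # fraction correct
train_err = zero_one_loss(y_train, train_pred)   # fraction misclassified = 1 - accuracy

print(train_acc, train_err)
print(confusion_matrix(y_train, train_pred))     # per-class breakdown of mistakes
```

My understanding is that `zero_one_loss` gives the misclassification rate directly, and the confusion matrix shows which classes the mistakes fall into, but I'm not sure if there's a more standard metric for this.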