Getting distance to the hyperplane from sklearn's svm.svc

Question

I'm currently using SVC to separate two classes of data (the features below are named data and the labels are condition). After fitting with GridSearchCV I get a classification score of about 0.7, and I'm fairly happy with that number. I then used grid.best_estimator_.decision_function() to get the relative distance from the hyperplane for the data in each class, and plotted those values in a boxplot and a histogram to get a better idea of how much overlap there is. My problem is that in the histogram and the boxplot the classes look perfectly separable, which I know is not the case. I'm sure I'm calling decision_function() incorrectly, but I'm not sure how to do this properly.

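    import matplotlib.pyplot as plt
    import seaborn as sb
    from sklearn.model_selection import GridSearchCV, KFold
    from sklearn.svm import SVC

    # data: feature matrix, condition: class labels (both defined elsewhere)
    cv = KFold(n_splits=4, shuffle=True)
    svc = SVC(kernel='linear', probability=True, decision_function_shape='ovr')
    C_range = [1, .001, .005, .01, .05, .1, .5, 5, 50, 10, 100]
    param_grid = dict(C=C_range)
    # older scikit-learn (< 0.24) also accepted iid=False here
    grid = GridSearchCV(svc, param_grid=param_grid, cv=cv, n_jobs=4, refit=True)
    grid.fit(data, condition)
    print(grid.best_params_)
    print(grid.best_score_)

    # signed decision values of the refit best model, on the data it was trained on
    x = grid.best_estimator_.decision_function(data)
    plt.hist(x)
    plt.figure()  # separate figure so the box/swarm plot does not overlay the histogram
    sb.boxplot(x=condition, y=x)
    sb.swarmplot(x=condition, y=x)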

In the histogram and box plots it looks like almost all of the points have a distance of either exactly +1 or exactly -1, with nothing in between.
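For comparison, here is a minimal sketch of two checks that might help narrow this down. It runs on synthetic data from make_classification, not on the data from the question, and the sample and feature counts are assumptions chosen for illustration. It contrasts decision values computed on the training data with out-of-fold values from cross_val_predict, and it divides by the norm of coef_ to turn the raw decision value (w.x + b) into a geometric distance for the linear kernel. One possible explanation for values clustering at ±1 is that, when features outnumber samples, the training data can be separable and many points end up as support vectors on the margin; whether that applies here is not confirmed by the question.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.svm import SVC

# Synthetic stand-in for `data` and `condition`
# (assumption: few samples, many features).
X, y = make_classification(n_samples=60, n_features=200, n_informative=5,
                           random_state=0)

clf = SVC(kernel='linear', C=1.0)

# In-sample decision values: the model has already seen these points.
clf.fit(X, y)
in_sample = clf.decision_function(X)

# Out-of-fold decision values: each point is scored by a model
# that was fitted without it.
cv = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)
out_of_fold = cross_val_predict(clf, X, y, cv=cv, method='decision_function')

# For a linear kernel, decision_function returns w.x + b.
# Dividing by ||w|| gives the signed geometric distance to the hyperplane.
w_norm = np.linalg.norm(clf.coef_)
in_sample_dist = in_sample / w_norm

print("in-sample decision values (first 5):  ", np.round(in_sample[:5], 3))
print("out-of-fold decision values (first 5):", np.round(out_of_fold[:5], 3))
print("in-sample distances (first 5):        ", np.round(in_sample_dist[:5], 3))
```

Plotting out_of_fold instead of in_sample in the histogram and boxplot would show how well held-out points are separated, which is closer to what the cross-validated score measures.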
