I am trying to learn sklearn, and as a simple exercise I am fitting a linear SVM. The SVC tries to predict the number of bedrooms in a house from the house's value and its area.
I have managed to get something that looks OK, but the template I took from matplotlib's documentation uses a color map, and I don't know exactly which color corresponds to which class.

How could I add a legend that specifies what the color of each scattered point corresponds to, and what each of the SVM's decision regions corresponds to as well?

Also, to make this work I had to apply preprocessing.scale to my features, and the axis ticks now show the scaled values ;( How could I undo the scaling, or retrieve the original values, to use for the tick labels?

Here is the plot:

https://i.stack.imgur.com/bigiR.png (I don't have enough reputation to post directly)

And here is my code:

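import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import style
from sklearn import preprocessing, svm

style.use('ggplot')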

dataset = pd.read_csv('/Path/Paros.csv')
dataset = dataset[dataset['size']<3000]
X = np.array(dataset[['size', 'value']])
y = np.array(dataset[['bedrooms']])
X = preprocessing.scale(X)

h = 0.01  # step size in the mesh
C = 0.01  # SVM regularization parameter
clf = svm.SVC(kernel='linear', C=C).fit(X, y[:,0])

# create a mesh to plot in
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
print("mesh")
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))


Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])

Z = Z.reshape(xx.shape)
plt.contourf(xx, yy, Z, cmap=plt.cm.Paired, alpha=0.8)
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.Paired)
plt.xlabel('Size')
plt.ylabel('Price')
plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max())


plt.show()
AimiHat
  • (1) Try plt.colorbar() and if it's not what you want, plot the output. (2) Use the object-oriented usage of [StandardScaler](http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html) for scaling (as opposed to your functional approach). This object supports ```inverse_transform```. – sascha Jul 23 '16 at 18:04
  • The colorbar worked great! But I don't know what to do with the StandardScaler inverse_transform. I replaced my preprocessing.scale(X) with the StandardScaler, but I don't know where to invert the data to make it work. – AimiHat Jul 23 '16 at 18:35
  • I'm pretty sure there are good docs within the user-guide! – sascha Jul 23 '16 at 19:07
  • There's actually nothing on the inverse in the docs :( – AimiHat Jul 23 '16 at 19:16
  • What's unclear about [this](http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html#sklearn.preprocessing.StandardScaler.inverse_transform)? scaler.transform(x) = z. scaler.inverse_transform(z) = x (transform and inverse_transform may be called with different data; only the feature-size needs to be the same) – sascha Jul 23 '16 at 19:19
  • When I inverse_transform X before scattering, the data gets back to its original proportions, but the plotted SVM (the contourf) is still at the scaled proportions, and so are the min, max, xx and yy. If I inverse_transform before that, there's no scaling at all in the end, and I get a MemoryError from the meshgrid. – AimiHat Jul 23 '16 at 19:33
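
One way to reconcile the scaled model with unscaled axes, following the StandardScaler suggestion in the comments, is to build the mesh and predict in scaled space, then inverse_transform only the mesh coordinates used for plotting. The sketch below assumes synthetic data in place of Paros.csv (the column values, the random seed and the coarser step h = 0.05 are all made up for illustration). The MemoryError is probably caused by running a 0.01 step over the raw ranges, which would produce an enormous grid, so the mesh stays in scaled units here.

```python
import numpy as np
from sklearn import svm
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the size/value/bedrooms columns (assumption: not the real data).
rng = np.random.RandomState(0)
size = rng.uniform(300, 3000, 200)
value = size * 150 + rng.normal(0, 20000, 200)
bedrooms = np.clip((size // 700).astype(int) + 1, 1, 4)
X_raw = np.column_stack([size, value])

# Fit the scaler once so the same transform can be inverted later.
scaler = StandardScaler().fit(X_raw)
X = scaler.transform(X_raw)
clf = svm.SVC(kernel='linear', C=0.01).fit(X, bedrooms)

# Build the mesh in scaled space, where the ranges are small.
h = 0.05
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))
grid_scaled = np.c_[xx.ravel(), yy.ravel()]

# Predict on the scaled grid, since the classifier was trained on scaled features.
Z = clf.predict(grid_scaled).reshape(xx.shape)

# Map the mesh coordinates back to original units, for plotting only.
grid_raw = scaler.inverse_transform(grid_scaled)
xx_raw = grid_raw[:, 0].reshape(xx.shape)
yy_raw = grid_raw[:, 1].reshape(yy.shape)
```

With these arrays, plt.contourf(xx_raw, yy_raw, Z, ...) and plt.scatter(X_raw[:, 0], X_raw[:, 1], c=bedrooms, ...) should both land on axes in the original units, so the ticks need no further relabeling.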

1 Answer

plt.colorbar() did what I was looking for.
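
For reference, a minimal sketch of the colorbar, alongside a per-class legend as an alternative. The points and labels are made up (stand-ins for the scaled features and bedroom counts), and legend_elements() assumes matplotlib 3.1 or newer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for this sketch; remove to get a window
import matplotlib.pyplot as plt
import numpy as np

# Made-up points and class labels (assumption: not the real dataset).
rng = np.random.RandomState(0)
X = rng.normal(size=(60, 2))
y = rng.randint(1, 4, size=60)  # classes 1, 2, 3

fig, ax = plt.subplots()
sc = ax.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.Paired)

# Option 1: a colorbar with one tick per class value.
cbar = fig.colorbar(sc, ax=ax, ticks=np.unique(y))
cbar.set_label('bedrooms')

# Option 2: a legend with one entry per class, built from the scatter itself.
handles, labels = sc.legend_elements()
legend = ax.legend(handles, labels, title='bedrooms')
```

Call plt.show() afterwards when using an interactive backend. A contourf drawn with the same cmap should roughly follow the same colors, although contourf picks its own levels, so the match between region color and class may not be exact.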

AimiHat