
I am trying to visualise SVM classification results using Matplotlib and scikit-learn. How can I handle the MemoryError?

For my example I have a small dataset: a table X of 100 examples and 10 integer features. I ran the classification with scikit-learn's SVM and now want to visualise the results. Since I have 10 features I can't plot them directly, so I used PCA after classification to reduce my data to two dimensions. This worked on the Iris dataset, but with my data it crashes with a "MemoryError".


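import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
# SVM classification (X: 100 samples x 10 features, Y: labels, both loaded earlier)
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.20)
svclassifier = SVC(kernel='linear', gamma='auto', max_iter=1000, decision_function_shape='ovo')
models = svclassifier.fit(X_train, y_train)
y_pred = svclassifier.predict(X_test)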


# Plot functions
def make_meshgrid(x, y, h=.02):
    # Build a 2-D grid covering the data range, padded by 1 on each side, with step h
    x_min, x_max = x.min() - 1, x.max() + 1
    y_min, y_max = y.min() - 1, y.max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                         np.arange(y_min, y_max, h))
    return xx, yy

def plot_contours(ax, clf, xx, yy, **params):
    # Predict a class for every grid point and draw the result as filled contours
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    out = ax.contourf(xx, yy, Z, **params)
    return out


# PCA for dimensionality reduction
pca = PCA(n_components=2)
pca.fit(X)
X_pca = pca.transform(X)

print("original shape:   ", X.shape)
print("transformed shape:", X_pca.shape)
X=X_pca

# Plotting results
fig, sub = plt.subplots()
plt.subplots_adjust(wspace=0.4, hspace=0.4)
X0, X1 = X[:, 0].flatten(), X[:, 1].flatten()
xx, yy = make_meshgrid(X0, X1)
plot_contours(sub, models, xx, yy, cmap=plt.cm.coolwarm, alpha=0.8)
sub.scatter(X0, X1, c=Y, cmap=plt.cm.coolwarm, s=20, edgecolors='k')
sub.set_xlim(xx.min(), xx.max())
sub.set_ylim(yy.min(), yy.max())
sub.set_xlabel('First principal component')
sub.set_ylabel('Second principal component')
sub.set_xticks(())
sub.set_yticks(())
sub.set_title("TITLE")
plt.show()

Output:

original shape:    (100, 10)
transformed shape: (100, 2)
MySQL connection is closed
Traceback (most recent call last):
  File "new_data.py", line 123, in <module>
    xx, yy = make_meshgrid(X0, X1)
  File "new_data.py", line 81, in make_meshgrid
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),np.arange(y_min, y_max, h))
  File "/home/.local/lib/python3.5/site-packages/numpy/lib/function_base.py", line 4211, in meshgrid
    output = [x.copy() for x in output]
  File "/home/.local/lib/python3.5/site-packages/numpy/lib/function_base.py", line 4211, in <listcomp>
    output = [x.copy() for x in output]
MemoryError
Abdessamad139
  • How long are `len(np.arange(x_min, x_max, h))` and `len(np.arange(y_min, y_max, h))`? – Asmus Apr 27 '19 at 17:25
  • Why do you need a spacing of `h=.02`? Try a different spacing, such as `h=.2` or `h=.5`. – Sheldore Apr 27 '19 at 17:32
  • Thanks for your answer @Asmus. In this case `len(np.arange(x_min, x_max, h))` is 100542 and `len(np.arange(y_min, y_max, h))` is 12752. – Abdessamad139 Apr 27 '19 at 18:03
  • Thanks for your answer @Sheldore. Whenever I change the spacing to a higher value, I get the error `ValueError: X.shape[1] = 2 should be equal to 10, the number of features at training time`. – Abdessamad139 Apr 27 '19 at 18:04
  • Then you're trying to hold 10,256,892,672 bytes in memory. You really *really* should try calling `make_meshgrid(X0, X1, h=.5)`, as @Sheldore suggested. Also, have a [look at this](https://stackoverflow.com/questions/22167095). **Edit:** isn't your ValueError due to the output of `PCA(n_components=2)` having shape `(100, 2)`? – Asmus Apr 27 '19 at 18:13
  • Also look [here](https://stackoverflow.com/questions/50975701/), [here](https://stackoverflow.com/questions/22581838/), and [here](https://stackoverflow.com/questions/22167095/valueerror-while-using-linear-svm-of-scikit-learn-python) – Asmus Apr 27 '19 at 18:21

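
Both errors discussed in the comments can be avoided together, at the cost of changing what is being plotted. The sketch below is one possible approach, not the asker's method: it applies PCA before fitting the SVM, so the classifier and the plotting grid share the same two dimensions, and it builds the grid from a fixed number of points per axis rather than a fixed step. The synthetic data, the helper name `make_meshgrid_bounded`, and its `n_points` and `pad` parameters are assumptions made for illustration. Note that this trains a different model (on two components) from the 10-feature SVM in the question.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the asker's data (assumption): 100 samples,
# 10 integer features, 2 classes.
rng = np.random.RandomState(0)
X = rng.randint(0, 100, size=(100, 10)).astype(float)
Y = (X[:, 0] + X[:, 1] > 100).astype(int)

# 1) Reduce to 2 components BEFORE fitting, so the classifier expects
#    exactly the 2 features that the plotting grid provides.
X_pca = PCA(n_components=2).fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(
    X_pca, Y, test_size=0.20, random_state=0)
clf = SVC(kernel='linear', max_iter=1000).fit(X_train, y_train)


def make_meshgrid_bounded(x, y, n_points=300, pad=0.05):
    """Hypothetical helper: grid with a fixed number of points per axis.

    The grid always has n_points * n_points cells, whatever the data range.
    """
    x_pad = pad * (x.max() - x.min())
    y_pad = pad * (y.max() - y.min())
    xs = np.linspace(x.min() - x_pad, x.max() + x_pad, n_points)
    ys = np.linspace(y.min() - y_pad, y.max() + y_pad, n_points)
    return np.meshgrid(xs, ys)


# 2) Predict on the bounded grid; Z can then be drawn with
#    ax.contourf(xx, yy, Z, ...) as in the question's plot_contours.
xx, yy = make_meshgrid_bounded(X_pca[:, 0], X_pca[:, 1])
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
print(xx.shape, Z.shape)
```

Using `np.linspace` with a point count instead of `np.arange` with a step keeps memory use constant when the principal components span thousands of units, which is what made the `h=.02` grid so large here.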