I am trying to plot the decision boundary of my SVM model. Some background on the dataset and task: it is a binary classification task in which I classify news articles as either fake or real, and I would like to see the decision boundary visually. The plotting code is adapted from this article: https://medium.com/geekculture/svm-classification-with-sklearn-svm-svc-how-to-plot-a-decision-boundary-with-margins-in-2d-space-7232cb3962c0
I tried passing 'test_x_vectorize' as the x and y values, but I got this error: TypeError: unhashable type: 'csr_matrix'
I then tried flattening it as suggested in this thread, but that gave a similar error: TypeError: unhashable type: 'matrix'
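For context, here is a minimal sketch of the type difference that may be behind the two error messages. It assumes the flattening was done with `.todense()`, which the 'matrix' in the second message suggests but the code here does not show. The toy matrix is made up and only stands in for 'test_x_vectorize'.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical stand-in for test_x_vectorize: 3 documents x 2 features.
sparse_features = csr_matrix(np.array([[0.0, 0.7], [0.5, 0.0], [0.3, 0.4]]))

as_matrix = sparse_features.todense()  # numpy.matrix: slices stay 2-D
as_array = sparse_features.toarray()   # numpy.ndarray: slices can be 1-D

first_col_matrix = as_matrix[:, 0]     # shape (3, 1), still a matrix
first_col_array = as_array[:, 0]       # shape (3,), a plain 1-D vector

print(type(as_matrix).__name__, first_col_matrix.shape)
print(type(as_array).__name__, first_col_array.shape)
```

Only the `.toarray()` result yields the plain 1-D columns that plotting functions generally expect for x and y.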
Here is my code:
# test/train split for X and Y
X_train, X_test, Y_train, Y_test = train_test_split(data['News'], data['Label'], test_size=0.2, random_state=21, shuffle=True)
# Creating the vectorizer using TfidfVectorizer
vectorize = TfidfVectorizer(max_features=5)
vectorize.fit(data['News'])
train_x_vectorize = vectorize.transform(X_train)
test_x_vectorize = vectorize.transform(X_test)
# Creating the SVM model
SVM = svm.SVC(C=1, kernel='linear', degree=3, gamma='auto')
SVM.fit(train_x_vectorize, Y_train)
# Predicting labels for the testing data
pred = SVM.predict(test_x_vectorize)
plt.figure(figsize=(10, 8))
# Plotting our two-features-space
sns.scatterplot(x=X_train[:, 0],
                y=X_train[:, 1],
                hue=Y_train,
                s=8);
# Constructing a hyperplane using a formula.
w = SVM.coef_[0] # w holds one weight per feature (the formula below assumes 2)
b = SVM.intercept_[0] # b consists of 1 element
x_points = np.linspace(-1, 1) # generating x-points from -1 to 1
y_points = -(w[0] / w[1]) * x_points - b / w[1] # getting corresponding y-points
# Plotting a red hyperplane
plt.plot(x_points, y_points, c='r');
Here is the traceback:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-8-0535339c273a> in <module>()
30 plt.figure(figsize=(10, 8))
31 # Plotting our two-features-space
---> 32 sns.scatterplot(x=X_train[:, 0],
33 y=X_train[:, 1],
34 hue=Y_train,
2 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/series.py in _get_values_tuple(self, key)
954
955 if not isinstance(self.index, MultiIndex):
--> 956 raise KeyError("key of type tuple not found and not a MultiIndex")
957
958 # If key is contained, would have returned by now
KeyError: 'key of type tuple not found and not a MultiIndex'
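For comparison, here is a minimal sketch of the setup the plotting formula appears to expect: a dense array with exactly two features, rather than the pandas Series of raw text that 'X_train' holds in the code above. The toy corpus, the labels, and the names `texts`, `labels`, `features`, and `model` are made up for illustration. Restricting the vectorizer to two features is an assumption for the sake of a 2-D plot, not a recommendation for the real model.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC

# Made-up toy corpus and labels, standing in for data['News'] and data['Label'].
texts = [
    "hoax hoax hoax report",
    "hoax hoax report",
    "hoax report report",
    "hoax report report report",
]
labels = ["fake", "fake", "real", "real"]

# Keep exactly two features so the boundary is a line in a 2-D plot.
vectorizer = TfidfVectorizer(max_features=2)
features = vectorizer.fit_transform(texts).toarray()  # dense ndarray, shape (4, 2)

model = SVC(C=1, kernel="linear")
model.fit(features, labels)

w = model.coef_[0]       # two weights, one per feature
b = model.intercept_[0]  # scalar bias

# Points on the line w[0] * x + w[1] * y + b = 0 (assumes w[1] != 0).
x_points = np.linspace(0, 1)  # TF-IDF values lie in [0, 1]
y_points = -(w[0] / w[1]) * x_points - b / w[1]
```

From here, `features[:, 0]`, `features[:, 1]`, and `labels` could be passed to `sns.scatterplot` as x, y, and hue, and `x_points` and `y_points` to `plt.plot`, in the same way as the plotting code above.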