I'm trying to classify mobile phones by their features, but when I run Gaussian naive Bayes from sklearn, scoring the model fails with the error below.
The code:
clf = GaussianNB()
clf.fit(X_train,y_train)
GaussianNB()
accuracy = clf.score(X_test,y_test)
print(accuracy)
The error:
ValueError Traceback (most recent call last)
<ipython-input-18-e9515ccc2439> in <module>()
2 clf.fit(X_train,y_train)
3 GaussianNB()
----> 4 accuracy = clf.score(X_test,y_test)
5 print(accuracy)
/Users/kiran/anaconda/lib/python3.6/site-packages/sklearn/base.py in score(self, X, y, sample_weight)
347 """
348 from .metrics import accuracy_score
--> 349 return accuracy_score(y, self.predict(X), sample_weight=sample_weight)
350
351
/Users/kiran/anaconda/lib/python3.6/site-packages/sklearn/naive_bayes.py in predict(self, X)
63 Predicted target values for X
64 """
---> 65 jll = self._joint_log_likelihood(X)
66 return self.classes_[np.argmax(jll, axis=1)]
67
/Users/kiran/anaconda/lib/python3.6/site-packages/sklearn/naive_bayes.py in _joint_log_likelihood(self, X)
422 check_is_fitted(self, "classes_")
423
--> 424 X = check_array(X)
425 joint_log_likelihood = []
426 for i in range(np.size(self.classes_)):
/Users/kiran/anaconda/lib/python3.6/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
380 force_all_finite)
381 else:
--> 382 array = np.array(array, dtype=dtype, order=order, copy=copy)
383
384 if ensure_2d:
ValueError: could not convert string to float:
My dataset was scraped, so it contains string values as well as floats. Could someone suggest how I can clean the data so that this error goes away?
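For reference, here is a minimal sketch of the kind of cleaning that might resolve this, assuming the features live in a pandas DataFrame. The column names, sample values, and the `clean_features` helper are all hypothetical and not taken from my actual data. The idea is to coerce numeric-looking columns to floats (unparseable strings become NaN), drop the rows that fail, and one-hot encode the columns that are genuinely categorical, so that everything passed to `GaussianNB` is numeric.

```python
import pandas as pd
from sklearn.naive_bayes import GaussianNB

# Hypothetical scraped data: column names and values are made up for illustration.
raw = pd.DataFrame({
    "ram_gb": ["4", "8", "", "6", "3", "12"],
    "price": ["199.0", "n/a", "349.5", "299.0", "149.0", "799.0"],
    "brand": ["A", "B", "A", "B", "A", "B"],
    "label": ["budget", "premium", "budget", "premium", "budget", "premium"],
})


def clean_features(df, numeric_cols, categorical_cols):
    """Return a copy of df whose feature columns are all numeric (hypothetical helper)."""
    out = df.copy()
    for col in numeric_cols:
        # Strings that cannot be parsed as numbers (e.g. "" or "n/a") become NaN.
        out[col] = pd.to_numeric(out[col], errors="coerce")
    # Drop rows where any numeric feature failed to parse.
    out = out.dropna(subset=numeric_cols)
    # Replace each categorical column with one 0/1 float column per category.
    out = pd.get_dummies(out, columns=categorical_cols, dtype=float)
    return out


cleaned = clean_features(raw, numeric_cols=["ram_gb", "price"], categorical_cols=["brand"])
X = cleaned.drop(columns="label")
y = cleaned["label"]

# Every feature is now a float, so fitting and scoring no longer hit the string conversion.
clf = GaussianNB().fit(X, y)
print(clf.score(X, y))
```

Dropping rows is only one option; imputing the missing values (for example with the column median) may be preferable if many rows fail to parse. In a real workflow the same cleaning would have to be applied to both `X_train` and `X_test` before calling `fit` and `score`.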