I get the following error when attempting to run KNN on my dataset.

Error: setting an array element with a sequence

This is how my data appears:

[image: my pandas df]

These are the dtypes of my pandas df:

f_vector     object
label         int64
mi_vector    object
thermo_op     int64
dtype: object

I then run KNN as follows:

train_x = training[['mi_vector','f_vector','thermo_op']]
train_y = training['label']
from sklearn.neighbors import KNeighborsClassifier
neigh = KNeighborsClassifier(n_neighbors=3)
neigh.fit(train_x, train_y)

This is the ValueError I get when running the above code:

 /home/vikaasa/anaconda2/lib/python2.7/site-packages/sklearn/utils/validation.pyc in check_array(array, accept_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    391         # make sure we acually converted to numeric:
    392         if dtype_numeric and array.dtype.kind == "O":
--> 393             array = array.astype(np.float64)
    394         if not allow_nd and array.ndim >= 3:
    395             raise ValueError("Found array with dim %d. %s expected <= 2."

ValueError: setting an array element with a sequence.

This seems like a straightforward problem, but I have not been able to find a fix. Any help would be greatly appreciated! Thank you.

Vikaasa Ramdas
  • Your `f_vector` and `mi_vector` columns each hold a `list` of values per row, which `fit` cannot handle because it expects every cell to be a scalar. You need to unpack them first; splitting each list into individual Series (one column per element) is one possibility. – Nickil Maveli Dec 01 '16 at 17:38
  • Thank you, that worked. I also tried making it into a numpy array, and that worked too. – Vikaasa Ramdas Dec 01 '16 at 19:58
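
For anyone landing here with the same error, here is a minimal sketch of the numpy approach mentioned in the comments. It is not the asker's actual code: the toy values and the helper name `flatten_features` are made up for illustration, and it assumes every list within a column has the same length and that a reasonably recent pandas (with `to_numpy`) is installed.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical toy data shaped like the frame in the question: two columns
# hold fixed-length lists (dtype object), the others hold plain integers.
training = pd.DataFrame({
    "mi_vector": [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1], [0.8, 0.9]],
    "f_vector": [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0],
                 [1.0, 0.1, 0.0], [0.0, 0.9, 1.0]],
    "thermo_op": [0, 1, 0, 1],
    "label": [0, 1, 0, 1],
})


def flatten_features(df, vector_cols, scalar_cols):
    """Expand list-valued columns and join them with scalar columns
    into a single 2-D float array (one row per sample)."""
    # Each list column becomes an (n_samples, list_length) block.
    parts = [np.array(df[col].tolist(), dtype=np.float64) for col in vector_cols]
    # Scalar columns are already one value per cell.
    parts.append(df[scalar_cols].to_numpy(dtype=np.float64))
    return np.hstack(parts)


train_x = flatten_features(training, ["mi_vector", "f_vector"], ["thermo_op"])
train_y = training["label"].to_numpy()

neigh = KNeighborsClassifier(n_neighbors=3)
neigh.fit(train_x, train_y)  # every cell is now a scalar, so fit accepts it
prediction = neigh.predict(train_x[:1])
```

Here `train_x` ends up with shape `(4, 6)`: two values from `mi_vector`, three from `f_vector`, and one from `thermo_op` per row. If the lists in a column differ in length, `np.array(..., dtype=np.float64)` will raise, so you would need to pad or truncate them first.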