
I am trying to import sklearn.neighbors in Python, and from there import KNeighborsClassifier. When I try to execute it in Python, I get a ValueError:

ValueError(u"Invalid mode, expected 'c' or 'fortran', got f\x00o\x00r\x00t",) in <module 'threading' from '/home/sjain55/anaconda/lib/python2.7/threading.pyc'> ignored' .

A short excerpt of the code:

from sklearn.neighbors import KNeighborsClassifier

neigh = KNeighborsClassifier(n_neighbors=10)

selected_features = X[:, idx[0:num_fea]]

neigh.fit(selected_features[train], y[train])  # this is the line that raises the error above

I've searched but couldn't find what causes this error. Does anyone know why I'm getting it?

Output printed in verbose mode:

s1 : 18863

s : 11062

check2

I was called with 2464 arguments:

(1440, 1)

check

ck1

{'fisher_score': True, 'y': array([ 1, 1, 1, ..., 20, 20, 20], dtype=uint8), 'neighbor_mode': 'supervised'}

/home/sjain55/Desktop/FS_Package_DMML-master/FS_package/function/similarity_based/fisher_score.py:47: RuntimeWarning: divide by zero encountered in divide
  score = 1.0/lap_score - 1

Exception ValueError: ValueError(u"Invalid mode, expected 'c' or 'fortran', got f\x00o\x00r\x00t",) in ignored

Sajal Jain
  • what is the form of your `selected_features[train]` and `y[train]`? – farhawa May 22 '15 at 18:37
  • selected_features = X[:, idx[0:num_fea]] //here X is a 2D array – Sajal Jain May 22 '15 at 18:45
  • Do you have the latest versions of your packages? I was running scikit-learn v0.11 and had this problem. After updating scipy and scikit-learn, it disappeared. – Romain Jouin May 25 '15 at 10:15
  • I am using the latest versions of scipy (0.15.1) and sklearn (0.16.1). Also, when I run the code through Python alone, it executes. But when I embed Python in C++, it fails with this error. – Sajal Jain May 26 '15 at 16:27

2 Answers


I think I figured out your problem. Some sklearn models require the data to be in a specific memory layout, either Fortran order or C order. To fix the issue, pass order='F' or order='C' when building your training arrays, like this:

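import numpy
X_train = numpy.asarray(selected_features[train], order='F')  # new array in Fortran order
y_train = numpy.asarray(y[train])  # 1-D, so the memory order makes no difference
neigh.fit(X_train, y_train)  # assigning back into a slice would keep the old layout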

Note:

The order argument specifies the memory layout of the array. If order is ‘C’, then the array will be in C-contiguous order (last index varies the fastest). If order is ‘F’, then the returned array will be in Fortran-contiguous order (first index varies the fastest). If order is ‘A’, then the returned array may be in any order (either C-contiguous, Fortran-contiguous, or even discontiguous).
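To make the difference between the two layouts concrete, here is a minimal sketch. The array values and the names c_arr, f_arr and target are made up for illustration and are not taken from the question. It also shows that assigning into an existing array copies the values only, so the target keeps whatever layout it already had.

```python
import numpy as np

# Illustrative 2x3 array; float64 is chosen explicitly so the strides are predictable.
c_arr = np.arange(6, dtype=np.float64).reshape(2, 3)  # C order is the default
f_arr = np.asfortranarray(c_arr)                      # same values, Fortran order

print(c_arr.flags['C_CONTIGUOUS'], c_arr.flags['F_CONTIGUOUS'])
print(f_arr.flags['C_CONTIGUOUS'], f_arr.flags['F_CONTIGUOUS'])

# Strides are the number of bytes to step along each axis.
# The axis with the smallest stride is the index that varies fastest in memory.
print(c_arr.strides)  # last axis has the 8-byte step
print(f_arr.strides)  # first axis has the 8-byte step

# Assigning into an existing array copies values; the target's layout is unchanged.
target = np.zeros((2, 3))   # C order
target[:] = f_arr
print(target.flags['C_CONTIGUOUS'])
```

If a particular layout is needed, the converted array therefore has to be bound to a name and passed on, rather than written back into a slice of the original.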

farhawa

Your problem is that a wide (Unicode) string is being passed to something that expects a plain ASCII byte string. The bytes f\x00o\x00r\x00t are the first 7 bytes of u"fortran" when each character is stored in 2 bytes (UTF-16/UCS-2, little-endian); in UTF-8, "fortran" would be byte-for-byte identical to ASCII. Unicode literals are the default in Python 3, and in Python 2 when from __future__ import unicode_literals is in effect.

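Change "fortran" to a byte string so that an ASCII string is passed instead of a Unicode one. Note that the r prefix (r"fortran") only disables backslash escapes and does not change the string type, so on its own it will not remove the wide characters. In Python 2 you can force an old (ASCII) string with b"fortran" or str("fortran").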
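As a quick check of where that byte pattern comes from, the following sketch encodes "fortran" in two ways. It only illustrates the bytes seen in the error message; it does not show how the string ended up in that form inside the embedded interpreter. The variable names are illustrative.

```python
text = u"fortran"

utf8_bytes = text.encode("utf-8")       # one byte per ASCII character
utf16_bytes = text.encode("utf-16-le")  # two bytes per character, low byte first

print(repr(utf8_bytes))
print(repr(utf16_bytes[:7]))  # compare with the 7 bytes shown in the error message
```

The UTF-8 form has no NUL bytes, while the first 7 bytes of the 2-byte form match the f\x00o\x00r\x00t pattern from the ValueError.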

casey