
I want to use sklearn to make some predictions, and I stored my data in a DataFrame.

from pandas import DataFrame
Data = DataFrame(columns = columns,index = range(1,501))

The data itself looks fine.

from sklearn.cross_validation import train_test_split
Xtrain,Xtest,Ytrain,Ytest = train_test_split(Data[columns[0:5]],Data[columns[5:6]],test_size = 0.25,random_state = 33)

I also tried:

import numpy as np
Xtrain,Xtest,Ytrain,Ytest = train_test_split(np.array(Data[columns[0:5]]),np.array(Data[columns[5:6]]),test_size = 0.25,random_state = 33)

from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
ss = StandardScaler()
Xtrain = ss.fit_transform(Xtrain)
Xtest = ss.transform(Xtest)
lr = LogisticRegression()
lr.fit(Xtrain,Ytrain)

The error message is:

Traceback (most recent call last):
  File "/Volumes/sogou_baidu.py", line 148, in <module>
    lr.fit(Xtrain,Ytrain)
  File "/Users/liumengyang/anaconda/lib/python3.5/site-packages/sklearn/linear_model/logistic.py", line 1143, in fit
    check_classification_targets(y)
  File "/Users/liumengyang/anaconda/lib/python3.5/site-packages/sklearn/utils/multiclass.py", line 173, in check_classification_targets
    raise ValueError("Unknown label type: %r" % y)
ValueError: Unknown label type: array([-1, -1, 1, -1, 1, -1, 0, -1, 1, -2, 0, -1, 1, 1, 0, -1, 1, 0, 1, -1, 0, 0, 1, -1, -1, 0, -1, -1, -1, 0, -1, -1, 0, 1, 0, -1, 1], dtype=object)

Normally the arguments to lr.fit() should be two arrays, but since I am passing a DataFrame there seems to be an extra parameter "dtype=object". How can I solve this problem?

  • Print the whole stack trace of the error. Also, `dtype=object` is not a parameter; it is just information about the array to its left. – Vivek Kumar Mar 01 '17 at 07:26
  • /Users/liumengyang/anaconda/lib/python3.5/site-packages/sklearn/utils/validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel(). ... ValueError: Unknown label type: array([-1, -1, 1, -1, 1, -1, 0, -1, 1, -2, 0, -1, 1, 1, 0, -1, 1, 0, 1, -1, 0, 0, 1, -1, -1, 0, -1, -1, -1, 0, -1, -1, 0, 1, 0, -1, 1], dtype=object) – Mengyang LIU Mar 01 '17 at 08:02
  • This is just the last line of the stack trace. You should post the complete one, which starts from the line in your code and goes all the way to the error. – Vivek Kumar Mar 01 '17 at 08:05
  • Traceback (most recent call last): File "", line 1, in runfile('/Volumes/sogou_baidu.py', wdir='/Volumes/predict') File "/Users/liumengyang/anaconda/lib/python3.5/site-packages/spyder/utils/site/sitecustomize.py", line 866, in runfile execfile(filename, namespace) File "/Users/liumengyang/anaconda/lib/python3.5/site-packages/spyder/utils/site/sitecustomize.py", line 102, in execfile exec(compile(f.read(), filename, 'exec'), namespace) – Mengyang LIU Mar 01 '17 at 08:09
  • File "/Volumes/sogou_baidu.py", line 148, in <module> lr.fit(Xtrain,Ytrain) File "/Users/liumengyang/anaconda/lib/python3.5/site-packages/sklearn/linear_model/logistic.py", line 1143, in fit check_classification_targets(y) File "/Users/liumengyang/anaconda/lib/python3.5/site-packages/sklearn/utils/multiclass.py", line 173, in check_classification_targets raise ValueError("Unknown label type: %r" % y) – Mengyang LIU Mar 01 '17 at 08:10
  • ValueError: Unknown label type: array([-1, -1, 1, -1, 1, -1, 0, -1, 1, -2, 0, -1, 1, 1, 0, -1, 1, 0, 1, -1, 0, 0, 1, -1, -1, 0, -1, -1, -1, 0, -1, -1, 0, 1, 0, -1, 1], dtype=object) – Mengyang LIU Mar 01 '17 at 08:10
  • Show some samples of y_train. Also, which version of scikit-learn are you using? – Vivek Kumar Mar 01 '17 at 08:24
  • Samples (columns R1 R2 R3 R4 R5 Y): -3 -1 0 0 0 -1; 0 -3 0 2 0 -1; 0 1 1 2 -2 1 ... – Mengyang LIU Mar 01 '17 at 08:29
  • The version is 0.17.1. – Mengyang LIU Mar 01 '17 at 08:32
  • What are R1, R2, ..., and why is Y a 2D array? – Vivek Kumar Mar 01 '17 at 09:21
  • Try using `sklearn.model_selection.train_test_split`. – Levi Mar 02 '17 at 07:35
  • Please show us `columns`. – Tonechas Mar 02 '17 at 12:07
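
Following up on the comments above, here is a minimal sketch of one likely fix, not a confirmed answer. It assumes the label column holds plain integers that ended up stored with `dtype=object` (as the error message shows), which can happen when a DataFrame is created empty and filled in afterwards. The column names R1..R5 and Y are taken from the comment thread; the random values are a made-up stand-in for the real data. The idea is to cast the columns to numeric dtypes and to pass a 1-D label array to `fit`, which also addresses the `DataConversionWarning`.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

try:
    from sklearn.model_selection import train_test_split  # scikit-learn >= 0.18
except ImportError:
    from sklearn.cross_validation import train_test_split  # older releases, e.g. 0.17.1

# Hypothetical stand-in for the asker's data (names from the comments, values random).
columns = ["R1", "R2", "R3", "R4", "R5", "Y"]
rng = np.random.RandomState(0)
values = np.column_stack([
    rng.randint(-3, 4, size=(500, 5)),  # five integer features
    rng.randint(-1, 2, size=500),       # labels drawn from {-1, 0, 1}
])

# Assumption: the columns are stored as dtype=object, which is what the
# "dtype=object" at the end of the error message reports.
Data = pd.DataFrame(values, columns=columns, index=range(1, 501), dtype=object)
print(Data.dtypes)  # every column reports "object"

# Cast to real numeric dtypes before handing anything to scikit-learn.
X = Data[columns[0:5]].astype(float)
y = Data[columns[5]].astype(int)  # a single label selects a 1-D Series

Xtrain, Xtest, Ytrain, Ytest = train_test_split(
    X, y, test_size=0.25, random_state=33
)

ss = StandardScaler()
Xtrain = ss.fit_transform(Xtrain)
Xtest = ss.transform(Xtest)

lr = LogisticRegression()
lr.fit(Xtrain, np.ravel(Ytrain))  # ravel guarantees shape (n_samples,)
print(lr.classes_)
```

Selecting the label with `Data[columns[5]]` rather than `Data[columns[5:6]]` yields a 1-D Series instead of a one-column DataFrame, so the `np.ravel` call is only a safeguard. If the real column contains non-numeric entries, `astype(int)` will raise, which at least points to the offending values.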

0 Answers