
When I pass X and y to fit, I get the following error:

Traceback (most recent call last):
  File "C:/Classify/classifier.py", line 95, in <module>
    train_avg, test_avg, cms = train_model(X, y, "ceps", plot=True)
  File "C:/Classify/classifier.py", line 47, in train_model
    clf.fit(X_train, y_train)
  File "C:\Python27\lib\site-packages\sklearn\svm\base.py", line 676, in fit
    raise ValueError("The number of classes has to be greater than"
ValueError: The number of classes has to be greater than one.

Below is my code:

def train_model(X, Y, name, plot=False):
    """
    train_model(vector, vector, name[, plot=False])

    Trains and saves model to disk.
    """
    labels = np.unique(Y)

    cv = ShuffleSplit(n=len(X), n_iter=1, test_size=0.3, indices=True, random_state=0)

    train_errors = []
    test_errors = []

    scores = []
    pr_scores = defaultdict(list)
    precisions, recalls, thresholds = defaultdict(list), defaultdict(list), defaultdict(list)

    roc_scores = defaultdict(list)
    tprs = defaultdict(list)
    fprs = defaultdict(list)

    clfs = []  # for the median

    cms = []

    for train, test in cv:
        X_train, y_train = X[train], Y[train]
        X_test, y_test = X[test], Y[test]

        clf = LogisticRegression()
        clf.fit(X_train, y_train)
        clfs.append(clf)
VKS

2 Answers


You probably have only one unique class label present in the training set. As the error message says, you need at least two unique classes in the dataset. For example, you can run np.unique(y) to see which class labels your dataset contains.
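
As a minimal sketch of that check (not part of the original code): the helper name `check_labels` and the toy label arrays below are made up for illustration. The idea is to inspect the labels right before calling `clf.fit`, so that a missing or single class shows up with a readable message instead of the error from inside scikit-learn.

```python
import numpy as np


def check_labels(y):
    """Return the unique class labels; fail early if there are fewer than two.

    Hypothetical helper, meant to be called just before clf.fit(X_train, y_train).
    """
    labels = np.unique(y)
    if labels.size < 2:
        raise ValueError(
            "Need at least two classes to train a classifier, got: %r"
            % (labels.tolist(),)
        )
    return labels


# Toy label vectors (assumed data, not the asker's).
y_single = np.array([1, 1, 1, 1])   # only one class
y_empty = np.array([])              # no labels at all
y_ok = np.array([0, 1, 0, 1])       # two classes

rejected = []
for name, y in [("single", y_single), ("empty", y_empty)]:
    try:
        check_labels(y)
    except ValueError as exc:
        rejected.append(name)
        print(name, "->", exc)

labels_ok = check_labels(y_ok)
print(labels_ok)
```

If the check is run on `y_train` inside the cross-validation loop rather than on the full `Y`, it also catches the case where the whole dataset has two classes but one particular split does not.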

RAJ
  • I just saw that you already have `labels = np.unique(Y)` in your code example. Just add a `print`, e.g., `labels = np.unique(Y); print(labels)` –  Nov 25 '16 at 02:53
  • It is printing: File "C:/Users/vshah/PycharmProjects/Classify/classifier.py", line 101, in Starting classification train_avg, test_avg, cms = train_model(X, y, "ceps", plot=True) File "C:/Users/vshah/PycharmProjects/Classify/classifier.py", line 54, in train_model Classification running ... clf.fit(X_train, y_train) File "C:\Python27\lib\site-packages\sklearn\svm\base.py", line 665, in fit [] raise ValueError("The number of classes has to be greater than one" – VKS Nov 25 '16 at 12:18
  • What you pasted above is not the print result but the error message that you posted earlier. You need to `print` it on a line before the one that raises the error; otherwise `print(labels)` will not be executed and you won't know what the variable `labels` contains. –  Nov 25 '16 at 19:00

Exactly. Your last column (the label) contains only one type (Classification). You need at least two. For example, if the label decides whether to offload or not, the label column should contain both offload and not-offload (or 0 and 1).
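
A rough way to verify this, assuming the data sits in a NumPy array whose last column is the label (the array `data` below is invented for illustration), is to count how often each label value occurs:

```python
import numpy as np

# Hypothetical dataset: two feature columns, then the label in the last column.
data = np.array([
    [0.1, 0.7, 1],
    [0.4, 0.2, 0],
    [0.9, 0.3, 1],
    [0.5, 0.8, 0],
])

# Split features from labels; the labels are cast to int for readability.
X, y = data[:, :-1], data[:, -1].astype(int)

# Count the occurrences of every distinct label value.
values, counts = np.unique(y, return_counts=True)
label_counts = dict(zip(values.tolist(), counts.tolist()))
print(label_counts)  # {0: 2, 1: 2}
```

If `label_counts` has a single key, or is empty, the classifier has nothing to separate and `fit` will fail as in the question.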

MOBT