I'm new to scikit-learn and classification. My task is a multi-label classification problem. As I understand it, predict returns an array of n tuples, where n equals the number of features in a sample. What does that mean? How can I get a fixed order and a fixed number of predicted values? With x_test = X_train[0] the output is Result [('a', 'c'), (), ()], and with x_test = X_train[0] the output is Result [('a',), (), ()].

import numpy as np
from sklearn.svm import SVC
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

input_data = [
  [0, 2, 0, 'a', 'c'],
  [0, 2, 0, 'a', 'c'],
  [0, 2, 0, 'a', 'c'],
  [0, 1, 0, 'a', 'c'],
  [1, 2, 1, 'b', 'e'],
  [1, 2, 0, 'b', 'd'],
  [1, 2, 0, 'a', 'e'],
  [1, 2, 0, 'a', 'd'],
  [1, 1, 0, 'a', 'c']
]
X = [x[0:3] for x in input_data]
y = [x[-2:] for x in input_data]


X_train = np.array(X)
y_train = np.array(y)
mlb = MultiLabelBinarizer()
y_train = mlb.fit_transform(y_train)

classifier = OneVsRestClassifier(SVC())

classifier.fit(X_train, y_train)

x_test = X_train[0]
result = classifier.predict(x_test)
labels = mlb.inverse_transform(result)
print("Result %s" % labels)
polianych

1 Answer

The classifier makes a separate positive-or-negative decision for each of your five classes ('a' through 'e'), so you get 'a' and 'c' for test[0] and only 'a' for the second one. That is the purpose of multi-label classification: each sample can be assigned anywhere from 0 to 5 of your five classes.
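To see why the number of labels can vary, it may help to look at what MultiLabelBinarizer produces. The snippet below is only a small sketch with three made-up label pairs taken from the question's data; the variable names are illustrative. Each class becomes one 0/1 column, and inverse_transform simply lists the classes whose column is 1, so a row of all zeros comes back as an empty tuple.

```python
import numpy as np
from sklearn.preprocessing import MultiLabelBinarizer

# Three label pairs borrowed from the question's data (illustrative subset).
y = [['a', 'c'], ['b', 'e'], ['b', 'd']]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(y)

# One column per class, in sorted order.
print(mlb.classes_)
print(Y)

# A row with a single positive column maps back to a single label ...
one_label = mlb.inverse_transform(np.array([[1, 0, 0, 0, 0]]))
# ... and a row with no positive columns maps back to an empty tuple.
no_label = mlb.inverse_transform(np.zeros((1, 5), dtype=int))
print(one_label, no_label)
```

Because every column is predicted independently, nothing forces exactly two of them to be 1 for a given sample.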

If you want exactly two labels per sample, you can train two single-label classifiers instead; that may be sufficient.
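A minimal sketch of the two-classifier idea, assuming the fourth column ('a'/'b') and the fifth column ('c'/'d'/'e') of the question's data are two independent single-label targets. The names clf_first and clf_second are made up for this example. Note that recent scikit-learn versions expect a 2D array in predict, so the test sample is taken as X[:1] rather than X[0].

```python
import numpy as np
from sklearn.svm import SVC

input_data = [
    [0, 2, 0, 'a', 'c'],
    [0, 2, 0, 'a', 'c'],
    [0, 2, 0, 'a', 'c'],
    [0, 1, 0, 'a', 'c'],
    [1, 2, 1, 'b', 'e'],
    [1, 2, 0, 'b', 'd'],
    [1, 2, 0, 'a', 'e'],
    [1, 2, 0, 'a', 'd'],
    [1, 1, 0, 'a', 'c'],
]

X = np.array([row[:3] for row in input_data])
y_first = [row[3] for row in input_data]   # assumed target 1: 'a' or 'b'
y_second = [row[4] for row in input_data]  # assumed target 2: 'c', 'd' or 'e'

# One ordinary single-label classifier per target.
clf_first = SVC().fit(X, y_first)
clf_second = SVC().fit(X, y_second)

# Keep the sample 2D: shape (1, 3) instead of (3,).
x_test = X[:1]
pred = (clf_first.predict(x_test)[0], clf_second.predict(x_test)[0])
print("Result %s" % (pred,))
```

Each classifier always returns exactly one class, so the result is always a pair in a fixed order: first target, then second target.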

Alexis Benichoux