I am converting a binary classification problem into a multi-label classification problem. The code is written in Python.

Below is the existing code:

positive_labels = [[0, 1] for _ in positive_examples]
negative_labels = [[1, 0] for _ in negative_examples]

Now I would like to convert this into a multi-label setup with 3 classes (0, 1, 2):

positive_labels = [[1,0,0] for _ in positive_examples]
neutral_labels = [[0,1,0] for _ in neutral_examples]
negative_labels = [[0,0,1] for _ in negative_examples]

Is this correct? If not, could you please let me know how to do this?

Please help.

Doubt Dhanabalu
  • What do you do with positive_labels? Currently they are just lists of lists (losing any other information they had). How you use them will determine whether your solution is correct. – 0xc0de Feb 23 '18 at 05:20

1 Answer

You could use MultiLabelBinarizer from scikit-learn for this:

from sklearn.preprocessing import MultiLabelBinarizer
mlb = MultiLabelBinarizer()
# fit_transform takes one collection of labels per sample
mlb.fit_transform([(0,), (1,), (1, 2)])

You get output like the following:

array([[1, 0, 0],
       [0, 1, 0],
       [0, 1, 1]])
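
To see which label each column of that array stands for, one option is to inspect the fitted binarizer. The following is a minimal sketch that reuses the same toy input; the variable names `encoded` and `decoded` are only illustrative.

```python
from sklearn.preprocessing import MultiLabelBinarizer

mlb = MultiLabelBinarizer()
encoded = mlb.fit_transform([(0,), (1,), (1, 2)])

# classes_ holds the labels in the same order as the output columns
print(mlb.classes_)

# inverse_transform maps each indicator row back to a tuple of labels
decoded = mlb.inverse_transform(encoded)
print(decoded)
```

Checking `classes_` is a cheap way to confirm the column order before feeding the array to a model.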

The fit_transform method comes from TransformerMixin (http://scikit-learn.org/stable/modules/generated/sklearn.base.TransformerMixin.html). It fits the binarizer to the labels and then transforms them. Once you have called fit_transform, there is no need to call fit again; you just call transform, as shown below:

mlb.transform([(1,2),(0,1)]) 

array([[0, 1, 1],
       [1, 1, 0]])
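
Applied to the lists from the question, this might look like the sketch below. The example strings are hypothetical placeholders, and the mapping positive -> 0, neutral -> 1, negative -> 2 is an assumption chosen to match the one-hot rows in the question. Passing `classes=[0, 1, 2]` fixes the column order explicitly.

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical placeholder data standing in for the question's lists
positive_examples = ["great", "loved it"]
neutral_examples = ["it was fine"]
negative_examples = ["awful"]

# Assumed mapping: positive -> 0, neutral -> 1, negative -> 2
label_rows = (
    [(0,) for _ in positive_examples]
    + [(1,) for _ in neutral_examples]
    + [(2,) for _ in negative_examples]
)

# Fixing classes keeps the column order stable across datasets
mlb = MultiLabelBinarizer(classes=[0, 1, 2])
labels = mlb.fit_transform(label_rows)
print(labels)
```

Because every example here carries exactly one label, each row has a single 1, which matches the hand-written lists in the question. A row would only get several 1s if an example were given more than one label, such as (1, 2).
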
vumaasha