I have a training dataset and a test dataset, shown below. This is only a small sample; my real data has thousands of records. E is the target variable that I need to predict. It has only four categories (1, 2, 3, 4) and can take only one of these values.

Training Dataset:

A    B    C    D    E
1    20   30   1    1
2    22   12   33   2
3    45   65   77   3
12   43   55   65   4
11   25   30   1    1
22   23   19   31   2
31   41   11   70   3
1    48   23   60   4

Test Dataset:

A    B    C    D    E
11   21   12   11
1    2    3    4
5    6    7    8 
99   87   65   34 
11   21   24   12

Since E has only four categories, I thought of predicting it with multinomial logistic regression (one-vs-rest logic). I am trying to implement this in Python.

I understand that the possible target values need to be stored in a variable and that an algorithm is then used to predict one of them:

output = [1,2,3,4]

But I am stuck on how to do this in Python with sklearn: how do I loop through these values, and which algorithm should I use to predict the output? Any help would be greatly appreciated.

asongtoruin
Sriram Chandramouli
    this tutorial should be a good place to start http://scikit-learn.org/stable/auto_examples/exercises/digits_classification_exercise.html – maxymoo Apr 21 '16 at 06:34
    It was also asked on datascience https://datascience.stackexchange.com/questions/11334/python-how-to-use-multinomial-logistic-regression-using-sklearn – amirouche Mar 20 '18 at 22:24
    @amirouche, that appears to be the same OP asking the same Q. – Ejaz May 14 '20 at 18:17

2 Answers

You could try

from sklearn.linear_model import LogisticRegression

LogisticRegression(multi_class='multinomial', solver='newton-cg').fit(X_train, y_train)
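
Since the question mentions one-vs-rest logic, here is a minimal sketch contrasting the two approaches on the training rows from the question. It is an illustration rather than part of the original answer. It assumes a scikit-learn version (0.22 or later) in which LogisticRegression with the newton-cg solver fits a multinomial model by default when there are more than two classes, so multi_class is left out. The names softmax_clf and ovr_clf are made up for this example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Training rows from the question: columns A, B, C, D and target E.
X_train = np.array([
    [1, 20, 30, 1],
    [2, 22, 12, 33],
    [3, 45, 65, 77],
    [12, 43, 55, 65],
    [11, 25, 30, 1],
    [22, 23, 19, 31],
    [31, 41, 11, 70],
    [1, 48, 23, 60],
])
y_train = np.array([1, 2, 3, 4, 1, 2, 3, 4])

# Multinomial (softmax) model: one joint model over all four classes.
# Assumption: the installed scikit-learn picks multinomial by default here.
softmax_clf = LogisticRegression(solver="newton-cg").fit(X_train, y_train)

# One-vs-rest: one binary logistic regression per class, wrapped explicitly.
ovr_clf = OneVsRestClassifier(LogisticRegression(solver="newton-cg"))
ovr_clf.fit(X_train, y_train)

print(softmax_clf.coef_.shape)   # one coefficient row per class
print(len(ovr_clf.estimators_))  # one fitted binary model per class
```

Either way there is no need to loop over the values 1 to 4 by hand: the class labels are read from y_train during fit.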
Daisy QL

LogisticRegression can handle multiple classes out-of-the-box.

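from sklearn.linear_model import LogisticRegression

# df is the training DataFrame with columns A, B, C, D, E
X = df[['A', 'B', 'C', 'D']]
y = df['E']
lr = LogisticRegression()
lr.fit(X, y)
# Returns an array of predicted class labels (values of E).
# X here is the training data; pass the test features to predict unseen rows.
preds = lr.predict(X)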
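
As an end-to-end illustration, the sketch below applies the same idea to the sample data from the question: fit on the training rows, then fill in E for the test rows. It is a sketch under assumptions, not the answer author's code. The DataFrame names train and test and the max_iter=1000 setting are choices made for this example; the larger iteration limit is only there to give the default solver more room on unscaled features.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Training rows copied from the question; E is the target.
train = pd.DataFrame(
    [[1, 20, 30, 1, 1],
     [2, 22, 12, 33, 2],
     [3, 45, 65, 77, 3],
     [12, 43, 55, 65, 4],
     [11, 25, 30, 1, 1],
     [22, 23, 19, 31, 2],
     [31, 41, 11, 70, 3],
     [1, 48, 23, 60, 4]],
    columns=["A", "B", "C", "D", "E"],
)

# Test rows copied from the question; E is unknown.
test = pd.DataFrame(
    [[11, 21, 12, 11],
     [1, 2, 3, 4],
     [5, 6, 7, 8],
     [99, 87, 65, 34],
     [11, 21, 24, 12]],
    columns=["A", "B", "C", "D"],
)

features = ["A", "B", "C", "D"]

# max_iter is raised as a precaution (an assumption, not a tuned value).
model = LogisticRegression(max_iter=1000)
model.fit(train[features], train["E"])

# Predicted class label for each test row, stored as the missing E column.
test["E"] = model.predict(test[features])

# Per-class probabilities; columns follow the order of model.classes_.
proba = model.predict_proba(test[features])

print(test)
print(model.classes_)
```

With only eight training rows the predictions themselves mean little; the point is the fit, predict, predict_proba workflow, which stays the same with thousands of records.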
dukebody