
I'm running an ordinal (i.e. ordered multinomial) ridge regression using the mord library, which follows the scikit-learn API.

y is a single column containing integer values from 1 to 19.

X is made of 7 numerical variables, each binned into 4 buckets and then dummy-encoded, giving a final set of 28 binary variables.
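For illustration, the preprocessing looks roughly like the sketch below (a minimal sketch only; the column names, values and bin edges are placeholders, not the actual data):

import pandas as pd
import numpy as np

# hypothetical raw data: 7 numerical variables (names and values are placeholders)
raw = pd.DataFrame(np.random.rand(1390, 7), columns=[f"var{i}" for i in range(1, 8)])

# bin each variable into 4 buckets...
binned = raw.apply(lambda col: pd.cut(col, bins=4, labels=False))

# ...and dummy-encode the bucket labels: 7 variables x 4 buckets = 28 binary columns
X = pd.get_dummies(binned.astype("category"))
print(X.shape)  # (1390, 28)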

import pandas as pd
import numpy as np    
from sklearn import metrics
from sklearn.model_selection import train_test_split
import mord

in_X, out_X, in_y, out_y = train_test_split(X, y,
                                            stratify=y,
                                            test_size=0.3,
                                            random_state=42)

mul_lr = mord.OrdinalRidge(alpha=1.0,
                           fit_intercept=True,
                           normalize=False,
                           copy_X=True,
                           max_iter=None,
                           tol=0.001,
                           solver='auto').fit(in_X, in_y)

mul_lr.coef_ returns a [28 x 1] array but mul_lr.intercept_ returns a single value (instead of 19).
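A quick way to confirm what the fitted attributes look like:

print(np.shape(mul_lr.coef_))       # 28 coefficients, one per dummy column
print(np.shape(mul_lr.intercept_))  # () -- a single scalar intercept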

Any idea what I am missing?

Adav
  • How many samples? 19? – desertnaut Feb 28 '19 at 16:28
  • 1390 observations, 19 distinct possible values for y – Adav Feb 28 '19 at 16:45
  • Not familiar with `mord`, just trying to understand why exactly you expect your intercept to be of dimension 18. – desertnaut Feb 28 '19 at 16:52
  • Me neither, and the documentation is scarce. From what I gather it uses the same API as sklearn, only replacing a few parameters here and there to circumvent some limitations. I'm expecting 19 intercepts (one per y feature), thanks for pointing that out. – Adav Feb 28 '19 at 17:12
  • "`y` taking values from 1 to 19" and "y has 19 features" are two completely different things. I cannot see how your `y` here can have 19 features (you say it's a single column), and hence why the regression should come with 19 different intercept coefficients... – desertnaut Feb 28 '19 at 17:14
  • y contains only integers that represent a category; it could just as well be A to S instead of 1 to 19 – Adav Feb 28 '19 at 17:16
  • OK, it does not change the argument - these are not *features*, just different possible values – desertnaut Feb 28 '19 at 17:17
  • I meant y has 19 classes/categories (had to look up the translation), hence I expect 19 intercepts – Adav Feb 28 '19 at 17:22
  • You shouldn't! Only one intercept is expected here, exactly as returned. – desertnaut Feb 28 '19 at 17:24
  • Well, everything points towards having at least k-1 intercepts, see here: https://stats.idre.ucla.edu/sas/output/ordered-logistic-regression/ (3 possible y values => 2 intercepts) – Adav Mar 03 '19 at 11:15
  • I agree with Adav, there should be k-1 intercepts, as shown here: http://r-statistics.co/Ordinal-Logistic-Regression-With-R.html – user3889486 Aug 17 '20 at 22:32
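Regarding the k-1 thresholds discussed in the comments above, here is a minimal sketch of how mord exposes them, assuming one of its threshold-based models such as mord.LogisticAT (which, as far as I can tell, stores the k-1 cut points in a theta_ attribute, whereas OrdinalRidge wraps sklearn's Ridge and therefore carries a single intercept). The data below is a toy example:

import numpy as np
import mord

# toy data: 3 ordered classes, so we expect k-1 = 2 thresholds
X_toy = np.random.rand(100, 5)
y_toy = np.random.randint(1, 4, size=100)  # classes 1, 2, 3

clf = mord.LogisticAT(alpha=1.0).fit(X_toy, y_toy)
print(clf.coef_.shape)   # (5,) -- one coefficient per feature
print(clf.theta_.shape)  # (2,) -- the k-1 ordered thresholds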

1 Answer


If you would like your model to predict for all 19 categories, you first need to convert your label y to a one-hot encoding before training the model.

from sklearn.preprocessing import OneHotEncoder

y -= 1  # shift the range from 1..19 to 0..18
enc = OneHotEncoder(categories='auto')  # on older scikit-learn versions use n_values=19 instead
y = enc.fit_transform(np.asarray(y).reshape(-1, 1)).toarray()  # OneHotEncoder expects a 2-D array
# train the model on the one-hot encoded y (re-run the split and fit from the question)

Now mul_lr.intercept_.shape should be (19,).
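As a rough check (a sketch only, assuming the same X and settings as in the question; stratify is dropped here because it was set on the original 1-D y, and OrdinalRidge appears to pass a 2-D target straight through to the underlying Ridge):

in_X, out_X, in_y, out_y = train_test_split(X, y, test_size=0.3, random_state=42)
mul_lr = mord.OrdinalRidge(alpha=1.0).fit(in_X, in_y)
print(mul_lr.coef_.shape)       # expected (19, 28) -- one row of coefficients per class
print(mul_lr.intercept_.shape)  # expected (19,)    -- one intercept per class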

keineahnung2345