12

Can one create a scikit-learn `LogisticRegression` instance from existing coefficients that were calculated elsewhere, say in a different implementation (e.g. Java)?

I tried creating an instance, then setting `coef_` and `intercept_` directly, and it seems to work, but I'm not sure whether there's a downside or whether I might be breaking something.

jonathans
  • 3
    As long as the predict function for your regression only uses those variables that you set, you should be fine without fitting. – Philip Massey Jun 26 '14 at 19:59
  • 4
    To test this, you can run a small logistic regression in sklearn, then create a new logistic regression object and set `coef_` and `intercept_` as you did, and then compare the two in prediction. If it runs (this is not a given, very difficult with e.g. SVM), then I don't see why it shouldn't work. – eickenberg Jun 26 '14 at 20:07

1 Answer

7

Yes, it works okay:

import json
import numpy as np
from sklearn.linear_model import LogisticRegression
x = np.arange(10)[:, np.newaxis]
y = np.array([0,0,0,1,0,0,1,1,1,1])
# train one logistic regression (the L1 penalty needs a solver that supports it, e.g. liblinear)
model1 = LogisticRegression(C=10, penalty='l1', solver='liblinear').fit(x, y)
# serialize coefficients (imitate loading from storage)
encoded = json.dumps((model1.coef_.tolist(), model1.intercept_.tolist(), model1.penalty, model1.C))
print(encoded)
decoded = json.loads(encoded)
# use the coefficients in another regression
model2 = LogisticRegression()
model2.coef_ = np.array(decoded[0])
model2.intercept_ = np.array(decoded[1])
model2.penalty = decoded[2]
model2.C = decoded[3]
# newer scikit-learn versions also look up classes_ when predicting
model2.classes_ = np.array([0, 1])
# resulting predictions are identical
print(model1.predict_proba(x) == model2.predict_proba(x))

Output:

[[[0.7558780101653273]], [-3.322083150375962], "l1", 10]
[[ True  True]
 [ True  True]
 [ True  True]
 [ True  True]
 [ True  True]
 [ True  True]
 [ True  True]
 [ True  True]
 [ True  True]
 [ True  True]]

So predictions of original and re-created models are indeed identical.
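As a complementary check, the probabilities can be recomputed by hand from nothing but the coefficients and the intercept, which is roughly what an external implementation (e.g. in Java) would have to reproduce. The following is a minimal sketch, assuming a binary problem and the default `LogisticRegression` settings; the variable names `scores`, `manual_p1`, `manual_proba` and `matches` are illustrative only.

```python
import numpy as np
from scipy.special import expit  # numerically stable sigmoid
from sklearn.linear_model import LogisticRegression

x = np.arange(10)[:, np.newaxis]
y = np.array([0, 0, 0, 1, 0, 0, 1, 1, 1, 1])

# Fit a plain binary logistic regression with default settings.
model = LogisticRegression(C=10).fit(x, y)

# For a binary model: P(y = classes_[1] | x) = sigmoid(x @ coef_.T + intercept_)
scores = x @ model.coef_.T + model.intercept_
manual_p1 = expit(scores).ravel()

# predict_proba returns one column per class, in the order of classes_.
manual_proba = np.column_stack([1 - manual_p1, manual_p1])

matches = np.allclose(manual_proba, model.predict_proba(x))
print(matches)
```

If `matches` is true, prediction depends only on `coef_`, `intercept_` and the class order, which supports the idea that setting those attributes by hand is enough for prediction.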

David Dale
  • 10,958
  • 44
  • 73
  • 2
    If I may add, this solution may not work on some versions of sklearn. I just tried it on `scikit-learn 0.24.2`, and your solution raises an `AttributeError` stating that the `LogisticRegression` object has no attribute `classes_`. The fix is to set it to the classes you need, for example: `model2.classes_ = np.array([0, 1])`. – eduardokapp Feb 10 '22 at 18:49
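
Putting the comment above together with the answer, the attribute assignments can be wrapped in a small helper. This is only a sketch: `logistic_from_coefficients` is a hypothetical name, not part of scikit-learn, and assigning fitted attributes by hand is not a documented API, so it is an assumption that these three attributes remain sufficient in whichever version is installed. The numeric values are the ones printed in the answer's output.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def logistic_from_coefficients(coef, intercept, classes=(0, 1)):
    """Hypothetical helper: build a predict-only binary model from stored values.

    Assumes coef has shape (1, n_features) and that it describes the
    log-odds of classes[1], the convention scikit-learn uses for binary models.
    """
    model = LogisticRegression()
    model.coef_ = np.atleast_2d(np.asarray(coef, dtype=float))
    model.intercept_ = np.atleast_1d(np.asarray(intercept, dtype=float))
    model.classes_ = np.asarray(classes)  # needed by predict / predict_proba
    return model


# Coefficients as they might arrive from another implementation.
restored = logistic_from_coefficients([[0.7558780101653273]], [-3.322083150375962])

x = np.arange(10)[:, np.newaxis]
labels = restored.predict(x)
proba = restored.predict_proba(x)
print(labels)
```

A model rebuilt this way is only meant for prediction; calling `fit` on it would overwrite the values that were set by hand.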