
I trained a model using scikit-learn's LogisticRegression classifier (multinomial/multiclass). I then saved the model's coefficients to a file. Next, I loaded the coefficients into my own implementation of softmax, which is what scikit-learn's documentation says the LogisticRegression classifier uses in the multinomial case. However, the predictions do not align.

  1. Training the mlogit model with scikit-learn
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
import json

# Split data into train-test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

# Train model
mlr = LogisticRegression(random_state=21, multi_class='multinomial', solver='newton-cg')
mlr.fit(X_train, y_train)
y_pred = mlr.predict(X_test)

# Save test data and coefficients
json.dump(X_test.tolist(), open('X_test.json', 'w'), indent=4)
json.dump(y_pred.tolist(), open('y_pred.json', 'w'), indent=4)
json.dump(mlr.classes_.tolist(), open('classes.json', 'w'), indent=4)
json.dump(mlr.coef_.tolist(), open('weights.json', 'w'), indent=4)
  2. Self-implementation of softmax via SciPy
from scipy.special import softmax
import numpy as np
import json

def predict(x, w, classes):
    z = np.dot(x, np.transpose(w))  # decision scores, shape (n_samples, n_classes)
    sm = softmax(z, axis=1)         # row-wise softmax over the classes
    return [classes[i] for i in sm.argmax(axis=1)]

x = json.load(open('X_test.json'))
w = json.load(open('weights.json'))
classes = json.load(open('classes.json'))

y_pred_self = predict(x, w, classes)
  3. Results do not match

Essentially, when I compare y_pred_self with y_pred, they are not the same (only about 85% of the predictions agree).
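
For reference, a quick sketch of how that agreement rate could be measured (this check is illustrative and not part of the scripts above; it reuses the variables y_pred_self and the saved y_pred.json):

import json
import numpy as np

y_pred = json.load(open('y_pred.json'))  # predictions saved from scikit-learn
# Fraction of positions where both implementations predict the same class
agreement = np.mean(np.array(y_pred_self) == np.array(y_pred))
print(agreement)  # roughly 0.85 in my case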

So my question is: does the scikit-learn softmax or predict implementation have some non-standard/hidden tweaks?

Side-note: I have also tried a self-implementation in Ruby and it also gives predictions that are off.

Hamman Samuel

1 Answer


There are some differences that I noticed at first glance. Please have a look at the following points:

1. Regularization
According to the docs, scikit-learn applies a regularization term by default:

This class implements regularized logistic regression [...]. Note that regularization is applied by default.

So you could either deactivate regularization in the scikit-learn model or add a regularization term to your own implementation.
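
A minimal sketch of the first option, turning regularization off on the scikit-learn side (penalty='none' assumes scikit-learn >= 0.22; newer releases spell it penalty=None):

from sklearn.linear_model import LogisticRegression

# No penalty, so the fitted coefficients correspond to plain
# (unregularized) multinomial/softmax regression
mlr = LogisticRegression(penalty='none', multi_class='multinomial',
                         solver='newton-cg', random_state=21)

Alternatively, setting a very large C (e.g. C=1e12) approximates no regularization on versions where penalty='none' is not available.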

2. Bias
In the docs you can read that a bias term is used:

fit_intercept : bool, default=True
Specifies if a constant (a.k.a. bias or intercept) should be added to the decision function.

So you could deactivate the bias in the scikit-learn model (fit_intercept=False) or add the bias term to your implementation.
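
A minimal sketch of the second option, continuing your scripts above (mlr.intercept_ holds one bias value per class, parallel to mlr.coef_; the file name bias.json is just my choice here):

# Scikit-learn side: save the per-class intercepts alongside the weights
json.dump(mlr.intercept_.tolist(), open('bias.json', 'w'), indent=4)

# Self-implementation side: add the bias before applying softmax
b = np.array(json.load(open('bias.json')))  # shape (n_classes,)
z = np.dot(x, np.transpose(w)) + b          # bias broadcasts across all rows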

Maybe use a well-known dataset from the scikit-learn library, or provide your dataset, so the problem is easier to reproduce. Let me know how it worked.
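
For instance, the iris dataset would make the setup reproducible (purely an illustration, standing in for your own X and y):

from sklearn.datasets import load_iris

# Hypothetical stand-in data with three classes, for reproducibility
X, y = load_iris(return_X_y=True)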

meph
  • Great answer! The solution was to incorporate the bias into the computation. So I saved the bias values via `json.dump(mlr.intercept_.tolist(), open('bias.json', 'w'), indent=4)` and also modified the dot-product portion of my `predict(x, w, classes, bias)` function to `z = np.dot(x, np.transpose(w)) + bias`. Now I have 100% alignment between both solutions' predictions! – Hamman Samuel Feb 28 '20 at 02:45