Questions tagged [logistic-regression]

Logistic regression is a statistical classification model used for making categorical predictions.

Logistic regression is a statistical analysis method used for predicting and understanding categorical dependent variables (e.g., true/false, or multinomial outcomes) based on one or more independent variables (also called predictors, features, or attributes). The probabilities describing the possible outcomes of a single trial are modeled as a function of the predictors using a logistic function, as follows:

$$\sigma(t) = \frac{1}{1 + e^{-t}}$$

A logistic regression model can be represented by:

$$P(y = 1 \mid x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)}}$$
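As a rough illustration of the logistic function and the model form described above, here is a minimal sketch in plain Python. The names `sigmoid` and `predict_proba` and the coefficient values are hypothetical, chosen only for this example; they do not come from any particular library.

```python
import math

def sigmoid(t):
    """Logistic function: maps a real number t into the open interval (0, 1).

    Not hardened against overflow for very large negative t.
    """
    return 1.0 / (1.0 + math.exp(-t))

def predict_proba(x, intercept, coefs):
    """P(y = 1 | x) for one observation under a binary logistic model."""
    linear = intercept + sum(b * xi for b, xi in zip(coefs, x))
    return sigmoid(linear)

# Hypothetical coefficients, for illustration only.
p = predict_proba([2.0, -1.0], intercept=0.5, coefs=[1.0, 2.0])
```

Here the linear predictor is 0.5 + 1.0*2.0 + 2.0*(-1.0) = 0.5, so `p` is the logistic function evaluated at 0.5.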

The logistic regression model has the useful property that each exponentiated regression coefficient can be interpreted as the odds ratio associated with a one-unit increase in the corresponding predictor, holding the other predictors fixed.
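The odds-ratio interpretation can be checked numerically. The sketch below assumes a single-predictor model with made-up coefficients `b0` and `b1` (hypothetical values) and compares the ratio of odds at two inputs one unit apart with `exp(b1)`.

```python
import math

def prob(x, b0, b1):
    """P(y = 1 | x) for a one-predictor logistic model."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def odds(p):
    """Convert a probability into odds p / (1 - p)."""
    return p / (1.0 - p)

# Hypothetical coefficients, for illustration only.
b0, b1 = -1.0, 0.7

# Ratio of the odds at x = 3 to the odds at x = 2 (a one-unit increase).
odds_ratio = odds(prob(3.0, b0, b1)) / odds(prob(2.0, b0, b1))

# Up to floating-point error, odds_ratio matches exp(b1).
expected = math.exp(b1)
```

The same ratio results for any pair of inputs one unit apart, because the odds under this model are exp(b0 + b1*x).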

Multinomial logistic regression (i.e., with three or more possible outcomes) is also sometimes called a Maximum Entropy (MaxEnt) classifier in the machine learning literature.


Tag usage

Questions with this tag should be about implementation and programming problems, not about the statistical or theoretical properties of the technique. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning and data analysis.

3746 questions
9
votes
1 answer

scikit-learn refit/partial fit option in Classifiers

I am wondering whether there is any option in sklearn classifiers to fit using some hyperparameters and, after changing a few hyperparameters, refit the model while saving computation (fit) cost. Let us say Logistic Regression is fit using C=1e5…
9
votes
6 answers

python divide by zero encountered in log - logistic regression

I'm trying to implement a multiclass logistic regression classifier that distinguishes between k different classes. This is my code. import numpy as np from scipy.special import expit def cost(X,y,theta,regTerm): (m,n) = X.shape J =…
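
A "divide by zero encountered in log" warning typically appears when a predicted probability is exactly 0 or 1, so the cost takes the log of 0. One common workaround, shown here as a plain-Python sketch and not as a fix for the truncated code above, is to clip probabilities away from 0 and 1 before taking the log. The function name `log_loss` and the `eps` default are assumptions made for this example.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy with probabilities clipped to [eps, 1 - eps]."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clipping keeps both log arguments strictly positive.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# The first two predictions are exactly 1.0 and 0.0, which would
# otherwise lead to log(0) in one of the two terms.
loss = log_loss([1, 0, 1], [1.0, 0.0, 0.8])
```

Without the clipping step, `math.log(0.0)` raises a `ValueError` in plain Python, while NumPy's `log` emits the warning quoted in the title and returns `-inf`.
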
9
votes
1 answer

LogisticRegressionModel prediction manually

I was trying to predict a label for every row in a DataFrame without using the LinearRegressionModel's transform method, due to ulterior motives. Instead, I was trying to compute it manually by using the classic formula 1 / (1 + e^(-hθ(x))),…
Alberto Bonsanto
  • 17,556
  • 10
  • 64
  • 93
9
votes
1 answer

ggplot2: How to combine histogram, rug plot, and logistic regression prediction in a single graph

I am trying to plot combined graphs for logistic regressions, as the function logi.hist.plot does, but I would like to do it using ggplot2 (for aesthetic reasons). The problem is that only one of the histograms should have the scale_y_reverse(). Is there any…
ChJulian
  • 115
  • 1
  • 5
9
votes
1 answer

Unable to run logistic regression due to "perfect separation error"

I'm a beginner to data analysis in Python and have been having trouble with this particular assignment. I've searched quite widely, but have not been able to identify what's wrong. I imported a file and set it up as a dataframe. Cleaned the data…
Ajay Gopalan
  • 103
  • 1
  • 1
  • 5
9
votes
2 answers

Scikit F-score metric error

I am trying to predict a set of labels using Logistic Regression from SciKit. My data is really imbalanced (there are many more '0' than '1' labels) so I have to use the F1 score metric during the cross-validation step to "balance" the…
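
To make the F1 metric mentioned above concrete, here is a small plain-Python sketch that computes it for a binary problem. The function name `f1_score_binary` and the toy labels are made up for illustration; this is not scikit-learn's implementation.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """F1 score (harmonic mean of precision and recall) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and/or recall is 0 or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Imbalanced toy data: eight 0 labels and two 1 labels.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

score = f1_score_binary(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

On this toy data the accuracy is 0.8 while the F1 score for the minority class is 0.5, which is why F1 is often preferred when classes are imbalanced.
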
9
votes
1 answer

Different versions of sklearn give quite different training results

We upgraded our sklearn from the old 0.13-git to 0.14.1, and found that the performance of our logistic regression classifier changed quite a bit. The two classifiers trained with the same data have different coefficients, and thus often give different…
ymeng
  • 123
  • 7
9
votes
1 answer

Multivariate logistic regression in R?

How does one perform a multivariate (multiple dependent variables) logistic regression in R? I know you can do this for linear regression, and this works: form <-cbind(A,B,C,D)~shopping_pt+price mlm.model.1 <- lm(form, data = train) But when I try the…
blast00
  • 559
  • 2
  • 8
  • 18
9
votes
1 answer

Simple binary logistic regression using MATLAB

I'm working on doing a logistic regression using MATLAB for a simple classification problem. My covariate is one continuous variable ranging between 0 and 1, while my categorical response is a binary variable of 0 (incorrect) or 1 (correct). I'm…
9
votes
3 answers

Is my implementation of stochastic gradient descent correct?

I am trying to develop stochastic gradient descent, but I don't know if it is 100% correct. The cost generated by my stochastic gradient descent algorithm is sometimes very far from the one generated by fminunc or batch gradient descent. while batch…
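
For reference, a minimal stochastic gradient descent loop for binary logistic regression might look like the sketch below. It is written in plain Python rather than MATLAB, uses a fixed learning rate and a fixed random seed, and all names (`sgd_logistic`, `lr`, `epochs`) are assumptions for this example, not the asker's code.

```python
import math
import random

def sigmoid(t):
    """Logistic function, branching on sign to avoid overflow in exp."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def sgd_logistic(X, y, lr=0.1, epochs=500, seed=0):
    """Fit weights and intercept by updating on one sample at a time."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit samples in a new random order each epoch
        for i in order:
            p = sigmoid(b + sum(wj * xj for wj, xj in zip(w, X[i])))
            err = p - y[i]  # gradient of the log loss w.r.t. the linear predictor
            w = [wj - lr * err * xj for wj, xj in zip(w, X[i])]
            b -= lr * err
    return w, b

# Tiny toy data set with one feature.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]
w, b = sgd_logistic(X, y)
preds = [1 if sigmoid(b + w[0] * row[0]) >= 0.5 else 0 for row in X]
```

Because each update uses a single sample, the cost along the way is noisier than with batch gradient descent, so some gap between the two at any given iteration is expected.
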
8
votes
1 answer

logistic regression and GridSearchCV using python sklearn

I am trying code from this page. I ran up to the part LR (tf-idf) and got similar results. After that I decided to try GridSearchCV. My questions below: 1) #lets try…
user2543622
  • 5,760
  • 25
  • 91
  • 159
8
votes
1 answer

How to use weights in a logistic regression

I want to calculate (weighted) logistic regression in Python. The weights were calculated to adjust the distribution of the sample regarding the population. However, the results don't change if I use weights. import numpy as np import pandas as pd …
Banjo
  • 1,191
  • 1
  • 11
  • 28
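
As background on what sample weights do in a logistic regression fit, the sketch below computes a weighted log loss in plain Python and shows that an integer weight of 3 on one row gives the same loss as repeating that row three times. The function name `weighted_log_loss` and the toy numbers are hypothetical, and the sketch does not diagnose why the results in the question above are unchanged.

```python
import math

def weighted_log_loss(y_true, y_prob, weights):
    """Weighted mean binary cross-entropy; each row counts in proportion to its weight."""
    total = 0.0
    for y, p, w in zip(y_true, y_prob, weights):
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / sum(weights)

# Two rows, with the first weighted three times as heavily as the second.
weighted = weighted_log_loss([1, 0], [0.9, 0.2], [3.0, 1.0])

# The same data with the first row physically repeated three times.
duplicated = weighted_log_loss([1, 1, 1, 0], [0.9, 0.9, 0.9, 0.2], [1.0] * 4)
```

If the weights are never passed to the fitting routine, or are all equal, the objective being minimized is the same as in the unweighted case.
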
8
votes
3 answers

Different coefficients: scikit-learn vs statsmodels (logistic regression)

When running a logistic regression, the coefficients I get using statsmodels are correct (verified them with some course material). However, I am unable to get the same coefficients with sklearn. I've tried preprocessing the data to no avail. This…
lfo
  • 196
  • 2
  • 10
8
votes
1 answer

Load and predict new data sklearn

I trained a logistic model, cross-validated it, and saved it to file using the joblib module. Now I want to load this model and predict new data with it. Is this the correct way to do this? Especially the standardization. Should I use scaler.fit() on my…
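
On the standardization point, the usual practice is to compute the scaling statistics once on the training data and then reuse them unchanged on new data, instead of fitting the scaler again. The plain-Python sketch below illustrates the idea; `fit_scaler` and `transform` are hypothetical stand-ins written for this example, not the scikit-learn API.

```python
import math

def fit_scaler(rows):
    """Compute per-column mean and (population) standard deviation."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n)
            for c, m in zip(cols, means)]
    return means, stds

def transform(rows, means, stds):
    """Standardize rows using previously fitted statistics."""
    return [[(v - m) / s if s > 0 else 0.0
             for v, m, s in zip(row, means, stds)]
            for row in rows]

# Statistics are learned from the training data only.
train = [[1.0, 10.0], [3.0, 30.0]]
means, stds = fit_scaler(train)

# New data is transformed with the stored training statistics, not refitted.
new_scaled = transform([[2.0, 40.0]], means, stds)
```

In a saved-model workflow this means persisting the fitted scaler (or its statistics) alongside the model, so that new inputs land on the same scale the model was trained on.
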
8
votes
2 answers

Getting weights of features using scikit-learn Logistic Regression

I am a little new to this. I am using a simple Logistic Regression Classifier in python scikit-learn. I have 4 features. My code is X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size = 0.2, random_state = 42) classifier =…