Questions tagged [logistic-regression]

Logistic regression is a statistical classification model used for making categorical predictions.

Logistic regression is a statistical method for predicting and understanding a categorical dependent variable (e.g., a true/false or multinomial outcome) from one or more independent variables (also called predictors, features, or attributes). The probabilities of the possible outcomes of a single trial are modeled as a function of the predictors using the logistic function:

$$f(z) = \frac{1}{1 + e^{-z}}$$

A logistic regression model can be represented by:

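$$\log\frac{p}{1 - p} = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$$

where $p$ is the probability of the outcome of interest given the predictors $x_1, \ldots, x_k$.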

The logistic regression model has the nice property that each exponentiated regression coefficient can be interpreted as the odds ratio associated with a one-unit increase in the corresponding predictor, holding the other predictors fixed.
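The odds-ratio interpretation can be checked numerically. The following is a minimal sketch using only the Python standard library; the coefficient values `beta0` and `beta1` and the helper names are hypothetical, chosen for illustration rather than taken from any fitted model.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Convert a probability into odds, p / (1 - p)."""
    return p / (1.0 - p)

# Hypothetical coefficients for a model with a single predictor x.
beta0, beta1 = -1.5, 0.8

def predicted_probability(x):
    """Probability of the positive class for predictor value x."""
    return sigmoid(beta0 + beta1 * x)

def odds_ratio_for_unit_increase(x):
    """Ratio of the odds at x + 1 to the odds at x."""
    return odds(predicted_probability(x + 1)) / odds(predicted_probability(x))

# The ratio is the same at every starting point and matches exp(beta1).
for x in (0.0, 2.0, 5.0):
    print(x, odds_ratio_for_unit_increase(x), math.exp(beta1))
```

The odds ratio is constant in `x`, while the change in probability for a one-unit increase is not. That is why coefficients are usually reported on the odds scale.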

Multinomial logistic regression (i.e., with three or more possible outcomes) is also sometimes called a Maximum Entropy (MaxEnt) classifier in the machine learning literature.
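For the multinomial case, a common parameterization gives each class its own linear score and converts the scores to probabilities with the softmax function. The sketch below assumes that parameterization; the `weights` and `intercepts` values and the function names are hypothetical and only illustrate the prediction step, not how the parameters are fitted.

```python
import math

def softmax(scores):
    """Turn a list of real-valued class scores into probabilities that sum to 1."""
    # Subtracting the maximum score avoids overflow in exp() without changing the result.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(x, weights, intercepts):
    """Linear score for each class k: intercepts[k] + sum_j weights[k][j] * x[j]."""
    return [b + sum(w_j * x_j for w_j, x_j in zip(w, x))
            for w, b in zip(weights, intercepts)]

# Hypothetical parameters for 3 classes and 2 features (illustrative values only).
weights = [[0.5, -1.0], [0.0, 0.3], [-0.7, 0.9]]
intercepts = [0.1, 0.0, -0.2]

x = [1.0, 2.0]
probs = softmax(class_scores(x, weights, intercepts))

# The predicted class is the one with the highest probability.
predicted_class = max(range(len(probs)), key=probs.__getitem__)
print(probs, predicted_class)
```

With two classes and the second score fixed at zero, the softmax reduces to the ordinary logistic function, so binary logistic regression is a special case of this form.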


Tag usage

Questions with this tag should be about implementation and programming problems, not about the statistical or theoretical properties of the technique. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning, and data analysis.

3746 questions
18
votes
5 answers

Speeding up sklearn logistic regression

I have a model I'm trying to build using LogisticRegression in sklearn that has a couple thousand features and approximately 60,000 samples. I'm trying to fit the model and it's been running for about 10 mins now. The machine I'm running it on has…
sedavidw
  • 11,116
  • 13
  • 61
  • 95
17
votes
6 answers

ValueError: This solver needs samples of at least 2 classes in the data, but the data contains only one class: 0.0

I applied logistic regression to the training set after splitting the data set into test and training sets, but I got the above error. I tried to work it out, and when I tried to print my response vector y_train in the console it prints integer values…
17
votes
2 answers

Python : How to use Multinomial Logistic Regression using SKlearn

I have a test dataset and a train dataset as below. I have provided sample data with only a few records, but my data has thousands of records. Here E is my target variable, which I need to predict using an algorithm. It has only four categories like…
17
votes
1 answer

scikit learn: how to check coefficients significance

I tried to do a LR with SKLearn for a rather large dataset with ~600 dummy variables and only a few interval variables (and 300K lines in my dataset), and the resulting confusion matrix looks suspicious. I wanted to check the significance of the returned…
dadam
  • 181
  • 1
  • 1
  • 6
17
votes
5 answers

stratified splitting the data

I have a large data set and would like to fit a different logistic regression for each City, one of the columns in my data. The following 70/30 split works without considering the City group. indexes <- sample(1:nrow(data), size = 0.7*nrow(data)) train <-…
user35577
  • 191
  • 1
  • 1
  • 7
16
votes
1 answer

Mixed effects logistic regression

I'm attempting to implement mixed effects logistic regression in python. As a point of comparison, I'm using the glmer function from the lme4 package in R. I've found that the statsmodels module has a BinomialBayesMixedGLM that should be able to fit…
YTD
  • 457
  • 3
  • 11
16
votes
1 answer

What does sklearn "RidgeClassifier" do?

I'm trying to understand the difference between RidgeClassifier and LogisticRegression in sklearn.linear_model. I couldn't find it in the documentation. I think I understand quite well what LogisticRegression does. It computes the coefficients…
16
votes
1 answer

'DataFrame' object has no attribute 'ravel' when transforming target variable?

I was fitting a logistic regression on a subset of a dataset. After splitting the dataset and fitting the model, I got the following error message: /Users/Eddie/anaconda/lib/python3.4/site-packages/sklearn/utils/validation.py:526:…
Edward Lin
  • 609
  • 1
  • 9
  • 16
16
votes
2 answers

Confidence interval of probability prediction from logistic regression statsmodels

I'm trying to recreate a plot from An Introduction to Statistical Learning and I'm having trouble figuring out how to calculate the confidence interval for a probability prediction. Specifically, I'm trying to recreate the right-hand panel of this…
Taylor
  • 378
  • 2
  • 4
  • 14
16
votes
3 answers

one vs all regression

I have been reviewing an example from the course of Andrew Ng in Machine Learning which I found in https://github.com/jcgillespie/Coursera-Machine-Learning/tree/master/ex3. The example deals with logistic regression and one-vs-all classification. I…
Little
  • 3,363
  • 10
  • 45
  • 74
16
votes
3 answers

Spark Java Error: Size exceeds Integer.MAX_VALUE

I am trying to use Spark for a simple machine learning task. I used pyspark and Spark 1.2.0 to do a simple logistic regression problem. I have 1.2 million records for training, and I hashed the features of the records. When I set the number of…
16
votes
1 answer

Different Robust Standard Errors of Logit Regression in Stata and R

I am trying to replicate a logit regression from Stata in R. In Stata I use the option "robust" to get robust standard errors (heteroscedasticity-consistent standard errors). I am able to replicate exactly the same coefficients from Stata, but I…
chl111
  • 468
  • 3
  • 14
16
votes
1 answer

Correctness of logistic regression in Vowpal Wabbit?

I have started using Vowpal Wabbit for logistic regression, however I am unable to reproduce the results it gives. Perhaps there is some undocumented "magic" it does, but has anyone been able to replicate / verify / check the calculations for…
sling
  • 163
  • 1
  • 4
15
votes
5 answers

Error : PerfectSeparationError: Perfect separation detected, results not available

This is the head of a train data set. Head of the X_Train Running the below code: logit = sm.GLM(Y_train, X_train, family=sm.families.Binomial()) result = logit.fit() Can you please help? Getting the below error : Error Screen Shot
Dipannita Banerjee
  • 151
  • 1
  • 1
  • 4
15
votes
3 answers

How to perform logistic lasso in python?

The scikit-learn package provides the functions Lasso() and LassoCV() but no option to fit a logistic function instead of a linear one... How to perform logistic lasso in python?
Fringant
  • 525
  • 1
  • 5
  • 17