Questions tagged [logistic-regression]

Logistic regression is a statistical classification model used for making categorical predictions.

Logistic regression is a statistical analysis method used for predicting and understanding categorical dependent variables (e.g., true/false, or multinomial outcomes) based on one or more independent variables (also called predictors, features, or attributes). The probabilities describing the possible outcomes of a single trial are modeled as a function of the predictors using a logistic function, as follows:

$$\sigma(t) = \frac{1}{1 + e^{-t}}$$

A logistic regression model can be represented by:

$$P(y = 1 \mid x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}}$$
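The logistic function and the binary model built on it can be sketched in a few lines of plain Python. This is a minimal illustration, not any library's implementation; the function names `logistic` and `predict_proba` and the coefficient values are assumptions chosen for the example.

```python
import math


def logistic(t: float) -> float:
    """Logistic (sigmoid) function: maps any real number into the range 0..1."""
    # Branch on the sign of t so that math.exp is never called on a large
    # positive argument, which would raise OverflowError.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    z = math.exp(t)
    return z / (1.0 + z)


def predict_proba(intercept: float, coefs: list, x: list) -> float:
    """P(y = 1 | x) for a binary logistic regression model."""
    linear = intercept + sum(b * xi for b, xi in zip(coefs, x))
    return logistic(linear)


# Hypothetical coefficients and one observation, for illustration only.
# The linear predictor is -1.0 + 0.5 * 2.0 + 2.0 * 0.0 = 0.0.
p = predict_proba(-1.0, [0.5, 2.0], [2.0, 0.0])
print(p)
```

A linear predictor of zero corresponds to a predicted probability of one half, which is why a 0.5 threshold on the probability is equivalent to checking the sign of the linear predictor.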

The logistic regression model has the useful property that each exponentiated regression coefficient can be interpreted as the odds ratio associated with a one-unit increase in the corresponding predictor, holding the other predictors fixed.
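The odds-ratio interpretation can be checked numerically. The sketch below assumes a single predictor with made-up coefficients `b0` and `b1` (both hypothetical) and compares the ratio of the odds at `x + 1` and at `x` with `exp(b1)`.

```python
import math


def logistic(t: float) -> float:
    """Logistic function (adequate here because t stays moderate)."""
    return 1.0 / (1.0 + math.exp(-t))


def odds(p: float) -> float:
    """Convert a probability into odds, p / (1 - p)."""
    return p / (1.0 - p)


# Hypothetical intercept and slope, chosen only for illustration.
b0, b1 = -0.3, 0.7
x = 1.5

# Odds of the outcome at x + 1 divided by the odds at x.
odds_ratio = odds(logistic(b0 + b1 * (x + 1))) / odds(logistic(b0 + b1 * x))

# The ratio matches exp(b1) regardless of the starting value x.
print(odds_ratio, math.exp(b1))
```

The starting value `x` cancels out because the odds equal `exp(b0 + b1 * x)`, so the ratio depends only on `b1`.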

Multinomial logistic regression (i.e., with three or more possible outcomes) is also sometimes called a Maximum Entropy (MaxEnt) classifier in the machine learning literature.
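In the multinomial case, one common formulation gives each class its own linear score and turns the scores into probabilities with the softmax function. The following is a minimal sketch under that assumption; the names `softmax` and `multinomial_proba` and all coefficient values are hypothetical.

```python
import math


def softmax(scores: list) -> list:
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps math.exp from overflowing and does not
    # change the result.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multinomial_proba(intercepts: list, coef_rows: list, x: list) -> list:
    """Class probabilities: one intercept and one coefficient row per class."""
    scores = [
        b0 + sum(b * xi for b, xi in zip(row, x))
        for b0, row in zip(intercepts, coef_rows)
    ]
    return softmax(scores)


# Three classes, two features; all values are made up for illustration.
# The per-class scores for this observation are 0.0, 1.5 and -1.25.
probs = multinomial_proba(
    [0.0, 0.5, -0.5],
    [[0.0, 0.0], [1.0, -1.0], [-0.5, 0.25]],
    [2.0, 1.0],
)
print(probs)
```

With two classes and the first class's score fixed at zero, `softmax([0.0, t])[1]` equals the logistic function of `t`, so the binary model is a special case.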


Tag usage

Questions with this tag should be about implementation and programming problems, not about the statistical or theoretical properties of the technique. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning, and data analysis.

3746 questions
8 votes, 1 answer

Which coefficients go to which class in multiclass logistic regression in scikit-learn?

I'm using scikit-learn's Logistic Regression for a multiclass problem. logit = LogisticRegression(penalty='l1') logit = logit.fit(X, y) I'm interested in which features are driving this decision. logit.coef_ The above gives me a beautiful…
Alex Lenail
8 votes, 1 answer

Spark2 - LogisticRegression training finished but the result is not converged because: line search failed

While training a Logistic Regression classifier I get the following error: 2016-08-16 20:50:23,833 ERROR [main] optimize.LBFGS (Logger.scala:error(27)) - Failure! Resetting history: breeze.optimize.FirstOrderException: Line search zoom…
8 votes, 2 answers

Logistic Regression in scikit-learn: getting the coefficients of the classification

I am doing multiclass classification and applying logistic regression to it. When I fit the data by calling logistic.fit(InputDATA,OutputDATA), the estimator "logistic" fits the data. Now when I call logistic.coef_ it prints a 2D array with 4…
8 votes, 3 answers

Why is logistic regression called regression?

According to what I have understood, linear regression predicts an outcome that can take continuous values, whereas logistic regression predicts an outcome that is discrete. It seems to me that logistic regression is similar to a classification…
8 votes, 2 answers

glmnet error for logistic regression/binomial

I get this error when trying to fit glmnet() with family="binomial", for Logistic Regression fit: > data <- read.csv("DAFMM_HE16_matrix.csv", header=F) > x <- as.data.frame(data[,1:3]) > x <- model.matrix(~.,data=x) > y <- data[,4] >…
user3889389
8 votes, 1 answer

How to get the coefficients from an MLE logit regression?

I have a statsmodels.discrete.discrete_model.BinaryResultsWrapper that was the output of running statsmodels.api.Logit(...).fit(). I can call the .summary() method which prints a table of results with the coefficients embedded in text, but what I…
cas5nq
8 votes, 2 answers

MATLAB's glmfit vs fitglm

I'm trying to perform logistic regression to do classification using MATLAB. There seem to be two different methods in MATLAB's statistics toolbox to build a generalized linear model: 'glmfit' and 'fitglm'. I can't figure out what the difference is…
8 votes, 2 answers

ValueError: data type must provide an itemsize?

My code is as follows. Every time I run it, I get the error "ValueError: data type must provide an itemsize". I can't find the reason why it doesn't work. from sklearn.linear_model import LogisticRegression trainX = [('2',…
fhlkm
8 votes, 3 answers

Model runs with glm but not bigglm

I was trying to run a logistic regression on 320,000 rows of data (6 variables). Stepwise model selection on a sample of the data (10000) gives a rather complex model with 5 interaction terms: Y~X1+ X2*X3+ X2*X4+ X2*X5+ X3*X6+ X4*X5. The glm()…
ybeybe
8 votes, 1 answer

Detecting multicollinearity, or columns that have linear combinations, while modelling in Python: LinAlgError

I am modelling data for a logit model with 34 dependent variables, and it keeps throwing the singular matrix error, as below: Traceback (most recent call last): File "", line 1, in test_scores =…
ekta
7 votes, 1 answer

Weighted logistic regression in R

Given sample data of proportions of successes plus sample sizes and independent variable(s), I am attempting logistic regression in R. The following code does what I want and seems to give sensible results, but does not look like a sensible…
Henry
7 votes, 2 answers

Simulate data for logistic regression with fixed r2

I would like to simulate data for a logistic regression where I can specify its explained variance beforehand. Have a look at the code below. I simulate four independent variables and specify that each logit coefficient should be of size…
Lion Behrens
7 votes, 2 answers

Imbalanced classification: order of oversampling vs. scaling features?

When performing classification (for example, logistic regression) with an imbalanced dataset (e.g., fraud detection), is it best to scale/zscore/standardize the features before over-sampling the minority class, or to balance the classes before…
Steph
7 votes, 1 answer

Logistic Regression Tuning Parameter Grid in R Caret Package?

I am trying to fit a logistic regression model in R using the caret package. I have done the following: model <- train(dec_var ~., data=vars, method="glm", family="binomial", trControl = ctrl, tuneGrid=expand.grid(C=c(0.001, 0.01,…
Jane Sully
7 votes, 1 answer

Java Spark MLlib: There is an error "ERROR OWLQN: Failure! Resetting history: breeze.optimize.NaNHistory:" for Logistic Regression in ml library

I just tried to use Apache Spark ml library for Logistic Regression, but whenever I tried it, there was an error message, such as "ERROR OWLQN: Failure! Resetting history: breeze.optimize.NaNHistory: " The example of data set for logistic…