Questions tagged [logistic-regression]

Logistic regression is a statistical classification model used for making categorical predictions.

Logistic regression is a statistical method for predicting and understanding a categorical dependent variable (e.g., true/false or multinomial outcomes) from one or more independent variables (also called predictors, features, or attributes). The probabilities of the possible outcomes of a single trial are modeled as a function of the predictors using the logistic function:

$$\sigma(t) = \frac{1}{1 + e^{-t}}$$

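A logistic regression model with predictors $x_1, \dots, x_k$ and coefficients $\beta_0, \dots, \beta_k$ can be represented by:

$$P(y = 1 \mid x_1, \dots, x_k) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}}$$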

A useful property of the logistic regression model is that each exponentiated regression coefficient can be interpreted as the odds ratio associated with a one-unit increase in the corresponding predictor, holding the other predictors fixed.
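The odds-ratio interpretation can be checked numerically. The sketch below uses made-up coefficients (`beta0` and `beta1` are assumptions for illustration, not values from any fitted model) and compares the ratio of odds before and after a one-unit increase in `x` with `exp(beta1)`.

```python
import math

def sigmoid(t: float) -> float:
    """Logistic function: maps a real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))

def odds(p: float) -> float:
    """Convert a probability p to odds p / (1 - p)."""
    return p / (1.0 - p)

# Hypothetical coefficients, chosen only for illustration.
beta0, beta1 = -1.5, 0.8

x = 2.0
p_at_x = sigmoid(beta0 + beta1 * x)
p_at_x_plus_1 = sigmoid(beta0 + beta1 * (x + 1.0))

# Ratio of the odds after a one-unit increase in x to the odds before it.
odds_ratio = odds(p_at_x_plus_1) / odds(p_at_x)

# The two printed numbers agree up to floating-point error.
print(odds_ratio, math.exp(beta1))
```

The same ratio comes out for any starting value of `x`, which is why the coefficient alone carries the interpretation.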

Multinomial logistic regression (i.e., with three or more possible outcomes) is also sometimes called a Maximum Entropy (MaxEnt) classifier in the machine learning literature.


Tag usage

Questions with this tag should be about implementation and programming problems, not about the statistical or theoretical properties of the technique. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning, and data analysis.

3746 questions
15
votes
3 answers

Using categorical data as features in sklearn LogisticRegression

I'm trying to understand how to use categorical data as features in sklearn.linear_model's LogisticRegression. I understand of course I need to encode it. What I don't understand is how to pass the encoded feature to the Logistic regression so it's…
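One common way to approach the question above is to put the encoder and the classifier in a single pipeline, so the encoded columns are passed to the model automatically. This is a minimal sketch under assumptions: the toy data and the single `color`-like feature are invented for illustration, and the question's actual data and encoding are not shown in the excerpt.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: one categorical feature and a binary label.
X = [["red"], ["red"], ["blue"], ["blue"], ["green"], ["green"]]
y = [1, 1, 0, 0, 1, 0]

# The encoder turns each category into its own 0/1 column;
# handle_unknown="ignore" encodes unseen categories as all zeros.
model = make_pipeline(
    OneHotEncoder(handle_unknown="ignore"),
    LogisticRegression(),
)
model.fit(X, y)

# Each row holds [P(class 0), P(class 1)] for one input.
proba = model.predict_proba([["red"], ["blue"]])
print(proba)
```

Keeping the encoder inside the pipeline means the same encoding is reused at prediction time, so raw category values can be passed directly to `predict_proba`.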
14
votes
1 answer

Error in summary.connection(connection) : invalid connection

Having an issue while running a logistic regression model using caret::train(). LR = caret::train(Satisfaction ~., data= log_train, method = "glm", preProcess = c("scale"), family="binomial") keep getting the following line of error: Error in…
Manasi bhargav
14
votes
1 answer

What are X_train and y_train?

I want to start developing an application using Machine Learning. I want to classify text as spam or not spam. I have 2 files, spam.txt and ham.txt, that each contain thousands of sentences. If I want to use a classifier, let's say…
user9886692
14
votes
5 answers

Why do weight parameters of logistic regression get initialized to zeros?

I have seen the weights of neural networks initialized to random numbers so I am curious why the weights of logistic regression get initialized to zeros?
Alaa Awad
14
votes
3 answers

Can we use the Normal Equation for Logistic Regression?

Just as we use the Normal Equation to find the optimum theta value in Linear Regression, can we use a similar formula for Logistic Regression? If not, why? I'd be grateful if someone could explain the reasoning behind it. Thank…
user2125722
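On the normal-equation question above: the maximum-likelihood equations for logistic regression have no general closed-form solution, so the weights are usually found iteratively. The sketch below uses Newton-Raphson, which solves a linear system at every step instead of once. The function name `fit_logistic_newton` and the toy data are assumptions for illustration, and no regularization is applied.

```python
import numpy as np

def fit_logistic_newton(X, y, n_iter=25):
    """Fit unregularized logistic regression weights by Newton-Raphson."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))               # predicted probabilities
        gradient = X.T @ (p - y)                       # gradient of the negative log-likelihood
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])  # X^T diag(p(1-p)) X
        w -= np.linalg.solve(hessian, gradient)        # Newton update
    return w

# Hypothetical, non-separable toy data; the first column is the intercept term.
X = np.array([[1, 0.5], [1, 1.0], [1, 1.5], [1, 2.0], [1, 2.5], [1, 3.0]])
y = np.array([0, 0, 1, 0, 1, 1])

w = fit_logistic_newton(X, y)
print(w)
```

Each Newton step resembles a weighted least-squares solve, which is the closest analogue to the normal equation; the weights `p * (1 - p)` depend on the current `w`, so the solve has to be repeated.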
14
votes
1 answer

How to get comparable and reproducible results from LogisticRegressionCV and GridSearchCV

I want to score different classifiers with different parameters. To speed up LogisticRegression I use LogisticRegressionCV (which is at least 2x faster), and I plan to use GridSearchCV for the others. The problem is that it gives me equal C parameters, but not…
13
votes
1 answer

Keras and Sklearn logreg returning different results

I'm comparing the results of a logistic regressor written in Keras to the default Sklearn Logreg. My input is one-dimensional. My output has two classes and I'm interested in the probability that the output belongs to the class 1. I'm expecting the…
Darina
13
votes
1 answer

Using R for multi-class logistic regression

Short format: How to implement multi-class logistic regression classification algorithms via gradient descent in R? Can optim() be used when there are more than two labels? The MATLAB code is: function [J, grad] = cost(theta, X, y, lambda) m =…
Antoni Parellada
13
votes
2 answers

Regression (logistic) in R: Finding x value (predictor) for a particular y value (outcome)

I've fitted a logistic regression model that predicts a binary outcome, vs, from mpg (mtcars dataset). The plot is shown below. How can I determine the mpg value for any particular vs value? For example, I'm interested in finding out what the mpg…
hsl
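For the question above, one reading is that the target is a predicted probability of the outcome rather than the 0/1 outcome itself. Under that assumption, the fitted curve can be inverted algebraically: the log-odds of the target probability equal `intercept + slope * x`. The coefficients below are hypothetical and are not taken from any model fitted to mtcars.

```python
import math

def predictor_for_probability(p: float, intercept: float, slope: float) -> float:
    """Solve p = 1 / (1 + exp(-(intercept + slope * x))) for x."""
    log_odds = math.log(p / (1.0 - p))
    return (log_odds - intercept) / slope

# Hypothetical coefficients, for illustration only.
intercept, slope = -8.0, 0.4

# Predictor value at which the modeled probability is 0.5.
x_half = predictor_for_probability(0.5, intercept, slope)
print(x_half)

# Round trip: plugging x_half back into the model recovers the probability.
p_back = 1.0 / (1.0 + math.exp(-(intercept + slope * x_half)))
print(p_back)
```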
13
votes
5 answers

What are alternatives to Gradient Descent?

Gradient Descent has a problem with local minima. We need to run gradient descent an exponential number of times to find the global minimum. Can anybody tell me about any alternatives to gradient descent, with their pros and cons? Thanks.
13
votes
2 answers

Vectorization of logistic regression cost

I have this code for the cost in logistic regression, in MATLAB: function [J, grad] = costFunction(theta, X, y) m = length(y); % number of training examples thetas = size(theta,1); features = size(X,2); steps = 100; alpha = 0.1; J = 0; grad =…
Pedro.Alonso
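A vectorized version of the cost from the question above can be written without loops over examples or features. This is a NumPy sketch rather than a translation of the truncated MATLAB code; the function name `cost_and_gradient` and the toy data are assumptions, and the cost is the unregularized mean cross-entropy.

```python
import numpy as np

def cost_and_gradient(theta, X, y):
    """Vectorized mean cross-entropy cost and its gradient for logistic regression."""
    m = y.shape[0]                                    # number of training examples
    h = 1.0 / (1.0 + np.exp(-X @ theta))              # predictions, shape (m,)
    cost = -(y @ np.log(h) + (1.0 - y) @ np.log(1.0 - h)) / m
    grad = X.T @ (h - y) / m                          # shape (n_features,)
    return cost, grad

# Hypothetical toy data; the first column is the intercept term.
X = np.array([[1.0, 2.0], [1.0, -1.0], [1.0, 0.5]])
y = np.array([1.0, 0.0, 1.0])

# With all-zero weights every prediction is 0.5, so the cost equals ln(2).
cost, grad = cost_and_gradient(np.zeros(2), X, y)
print(cost, grad)
```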
12
votes
1 answer

Default starting values fitting logistic regression with glm

I'm wondering how default starting values are specified in glm. This post suggests that default values are set to zeros. This one says that there is an algorithm behind it; however, the relevant link is broken. I tried to fit a simple logistic regression…
Adela
12
votes
9 answers

Finding coefficients for logistic regression

I'm working on a classification problem and need the coefficients of the logistic regression equation. I can find the coefficients in R, but I need to submit the project in Python. How do I get the coefficient values in scikit-learn?
MonkeyDLuffy
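For the coefficient question above, a fitted scikit-learn `LogisticRegression` exposes its parameters through the `coef_` and `intercept_` attributes. The sketch below uses synthetic data from `make_classification` as a stand-in, since the question's data is not shown.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data: 200 samples, 4 features, binary target.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

clf = LogisticRegression().fit(X, y)

# For a binary problem, coef_ has shape (1, n_features) and intercept_ has shape (1,).
print(clf.coef_)
print(clf.intercept_)
```

scikit-learn applies L2 regularization by default (`C=1.0`), so these values will generally not match the unpenalized coefficients from R's `glm`; a very large `C` weakens the penalty and brings the two closer.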
12
votes
2 answers

PyTorch inputs for nn.CrossEntropyLoss()

I am trying to perform a Logistic Regression in PyTorch on a simple 0,1 labelled dataset. The criterion or loss is defined as: criterion = nn.CrossEntropyLoss(). The model is: model = LogisticRegression(1,2) I have a data point which is a pair: dat…
12
votes
2 answers

How to get the weight vector in Logistic Regression?

I have a feature matrix X and a label matrix y, and I am using binary logistic regression. How can I get the weight vector w given X and y? I am a bit confused as to how to achieve this within sklearn. How do I solve the…