Questions tagged [logistic-regression]

Logistic regression is a statistical classification model used for making categorical predictions.

Logistic regression is a statistical method for predicting and explaining a categorical dependent variable (e.g., true/false, or a multinomial outcome) from one or more independent variables (also called predictors, features, or attributes). The probabilities of the possible outcomes of a single trial are modeled as a function of the predictors using the logistic function:

$$F(t) = \frac{1}{1 + e^{-t}}$$

A logistic regression model with a binary outcome can be represented by:

$$\Pr(Y = 1 \mid x_1, \dots, x_k) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}}$$

The logistic regression model has the useful property that each exponentiated regression coefficient can be interpreted as the odds ratio associated with a one-unit increase in the corresponding predictor, holding the other predictors fixed.
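The odds-ratio interpretation can be checked numerically. The following is a minimal sketch for a model with a single predictor; the intercept, the coefficient, and the starting value of x are hypothetical values chosen only for illustration.

```python
import math


def sigmoid(t: float) -> float:
    """Logistic function: maps a real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))


def odds(p: float) -> float:
    """Convert a probability p to odds, p / (1 - p)."""
    return p / (1.0 - p)


# Hypothetical coefficients, not taken from any fitted model.
intercept = -1.5
beta = 0.8

# Predicted probabilities at x and at x + 1 for an arbitrary starting value.
x = 2.0
p_at_x = sigmoid(intercept + beta * x)
p_at_x_plus_1 = sigmoid(intercept + beta * (x + 1.0))

# The ratio of the two odds should match exp(beta), regardless of x.
odds_ratio = odds(p_at_x_plus_1) / odds(p_at_x)

print(f"odds ratio from predictions: {odds_ratio:.6f}")
print(f"exp(beta):                   {math.exp(beta):.6f}")
```

Changing x leaves the ratio unchanged, which is why a single exponentiated coefficient summarizes the effect of the predictor on the odds scale, though not on the probability scale.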

Multinomial logistic regression (i.e., with three or more possible outcomes) is also sometimes called a Maximum Entropy (MaxEnt) classifier in the machine learning literature.


Tag usage

Questions on [logistic-regression] should be about implementation and programming problems, not about the statistical or theoretical properties of the technique. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning, and data analysis.

3746 questions
1 vote, 1 answer

Comparing percent change of model coefficients

I am working through step 3 of purposeful model-building from Hosmer-Lemeshow and it suggests to compare the percent change in coefficients between a full model [Iris.mod1] and a reduced model [Iris.mod2]. I would like to automate this step if…
Aaron England
1 vote, 0 answers

Pyspark Logistic Regression has zero coefficients after fitting

Good afternoon. I am solving a multi-label classification problem with the help of LogisticRegression in pyspark. However, after I fit a model to the data, all elements of the CoefficientMatrix of the model are zeroes. I noticed, that if I decrease…
1 vote, 1 answer

binomial mixed effects model BIC - R vs SPSS

I'm trying to calculate Bayes Factor from my data and I'm getting very different results in R and SPSS for my mixed effects model. It's fine for a linear one, but not binomial. Here is the R code: ``memory.model = glmer(correct ~ (1|ps) + (1|item),…
Agata
1 vote, 0 answers

IllegalArgumentException While Obtaining Roc for Logistic Regression at Spark

I've created a Dataset: trainingData.show(10); it has that value: +------------------+-----+ | features|label| +------------------+-----+ |[2.0,2.0,1.0,24.0]| 1| |[2.0,2.0,2.0,26.0]| 1| |[2.0,2.0,2.0,34.0]| 0| |[2.0,2.0,1.0,37.0]|…
1 vote, 0 answers

How to subset a data frame in R by two conditions and then apply those variables to a regression?

I have a data frame of different variables in R that represent indicators such as race, SAT score, and high school GPA, dropout rate, and gender. I am trying to regress dropout rate using these as right hand side inputs. However, I am only trying to…
Ghost Koi
1 vote, 0 answers

Calculate marginal effect for GLM (logistic) models in R

I came across two packages for calculating marginal effects for a logistic regression model in R with some interaction terms: the margins package https://cran.r-project.org/web/packages/margins/vignettes/Introduction.html and the mfx package…
NAN
1 vote, 0 answers

binary logistic regression - model selection basics

I have binary outcome variable and 4 predictors: 2 binary one and 2 continuous (truncated to whole numbers). I have some 1158 observations and the objective of the analysis is to predict the probability of the binary outcome (infection), check…
user1607
1 vote, 0 answers

Cost value doesn't converge

I'm trying to code a logistic regression, but I'm having trouble getting the cost to converge. Can anyone help me? Below is my code. Thank you! #input: m = 3, n = 4 # we have 3 training examples and each of them has 4 features (Sorry, I know it looks weird…
D. Wei
1 vote, 0 answers

Python how to plot each KFold confusion matrix

how to write a For Loop to show each KFold's confusion matrix, so that I can analyze why some of the recall scores are 0. the code below for KFold from sklearn import model_selection from sklearn.model_selection import cross_val_score kfold =…
BigData
1 vote, 1 answer

How to find the value of a covariate specific to .5 probability on Logistical Regression

So, I have a binomial glm function, with two predictors, the second (as factor) being with two levels (50, 250). model <- glm(GetResp.RESP ~ speed + PadLen, family = binomial(link = "logit"), data = myData) The plot for it looks like this: My…
1 vote, 2 answers

how do I find the actual logistic regression model in python?

I used logistic regression with python and got an accuracy score of 95%, how do I get this equation so that I can actually implement it? I wrote: model =…
Christian
1 vote, 1 answer

Logistic Regression: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples

Goal: Determine if rfq_num_of_dealers is a significant predictor of a Done trade (Done =1). My Data: df_Train_Test.info() Int64Index: 139025 entries, 0 to 139024 Data columns (total 2…
Peter Lucas
1 vote, 1 answer

How to process feature vectors with different dimension in machine learning?

I'm a beginner in machine learning, and I'm trying to use a data set to train a log linear classifier. The data set contains five features, and each feature is a vector, but the dimension of the features are different. The dimensions are 3, 1, 6, 2,…
1 vote, 2 answers

Logistic Regression is sensitive to outliers? Using on synthetic 2D dataset

I am currently using sklearn's Logistic Regression function to work on a synthetic 2D problem. The dataset is shown below: I'm basically plugging the data into sklearn's model, and this is what I'm getting (the light green; disregard the dark…
1 vote, 1 answer

How do i fix the missing output (NA) in my summary/coefficients table R

I was building a logistic regression model in r but when I checked the coefficients using summary(model) the output displayed NA's in the four columns (estimate, standard error, z value and z) for one of my independent variables. My other three…
Poly