Questions tagged [glm]

For questions relating to generalized linear models. For the GLM math library, see the [glm-math] tag.

Generalized linear models (GLMs) are a class of models that encompasses a variety of standard statistical models, including ordinary least squares (OLS) regression (a.k.a. linear models), probit regression, logistic regression, Poisson regression, and other methods that can be expressed in the standard GLM form.
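As a rough sketch of what "the standard GLM form" usually means, the textbook formulation is given below. The symbols (Y_i for the response, x_i for the predictor vector, beta for the coefficients, g for the link function) are notation assumed here for illustration and do not come from the tag description itself.

```latex
\begin{aligned}
Y_i \mid x_i &\sim \text{an exponential-family distribution with mean } \mu_i, \\
\eta_i &= x_i^{\top}\beta, \\
g(\mu_i) &= \eta_i .
\end{aligned}
```

Under this notation, a Gaussian response with the identity link g(\mu) = \mu gives OLS regression, a binomial response with the logit link g(\mu) = \log(\mu / (1 - \mu)) gives logistic regression, a binomial response with g = \Phi^{-1} (the inverse standard normal CDF) gives probit regression, and a Poisson response with the log link g(\mu) = \log \mu gives Poisson regression.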

Consider whether your question is better suited to Cross Validated, the Stack Exchange site for statistics and machine learning. Questions on Stack Overflow should be about programming issues arising from fitting models to data.

In R, a software environment for statistical computing and graphics, a GLM can be fit with the function glm().
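As a minimal sketch of such a call in R, the example below fits a logistic regression with glm() from the base stats package. The data are simulated, and the names dat, x, y and fit are hypothetical, chosen only for this illustration.

```r
# Simulated data (hypothetical): a binary outcome driven by one predictor.
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x))
dat <- data.frame(x = x, y = y)

# Logistic regression: binomial family with the logit link.
fit <- glm(y ~ x, family = binomial(link = "logit"), data = dat)

coef(fit)                              # estimated intercept and slope
head(predict(fit, type = "response"))  # fitted probabilities on the 0-1 scale
```

Changing the family argument (for example poisson() or Gamma(link = "log")) selects a different member of the GLM class with the same formula interface.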

2019 questions
5
votes
1 answer

Logistic Regression in R: glm() vs rxGlm()

I fit a lot of GLMs in R. Usually I used revoScaleR::rxGlm() for this because I work with large data sets and use quite complex model formulae - and glm() just won't cope. In the past these have all been based on Poisson or gamma error structures…
Alan
  • 619
  • 6
  • 19
5
votes
1 answer

revoScaleR::rxGlm() Question in R - GLM Residuals

I might not find an answer here because I don't think the revoScaleR package is widely used. If I create a GLM using rxGlm() it works fine. However the model residuals available via rxPredict() seem to just be the "raw" residuals, ie observed value…
Alan
  • 619
  • 6
  • 19
5
votes
1 answer

Syntax for binomial formula in geom_smooth

I have computed a binomial regression in R: Call: glm(formula = cbind(success, failure) ~ x * f, family = "binomial", data = tb1) Deviance Residuals: Min 1Q Median 3Q Max -3.6195 -0.9399 -0.0493 0.5698 2.0677 …
Igor F.
  • 2,649
  • 2
  • 31
  • 39
5
votes
1 answer

Standardizing qualitative variables in R to perform glm's, glm.nb's and lm's

I want to standardize the variables of a biological dataset. I need to run glm's, glm.nb's and lm's using different response variables. The dataset contains counts of a given tree species by plots (all the plots have the same size) and a series…
Darius
  • 489
  • 2
  • 6
  • 22
5
votes
1 answer

Fast Wald confidence intervals for a glm with broom in R

I would like to calculate Wald confidence intervals of the coefficients of a glm on a somewhat large data set, and use broom for a tidy output. mydata <- data.frame(y = rbinom(1e5,1,0.8), x1 = rnorm(1e5), x2 =…
bebru
  • 151
  • 9
5
votes
2 answers

R predict glm fit on each column in data frame using column index number

Trying to fit BLR model to each column in data frame, and then predict on new data pts. Have a lot of columns, so cannot identify the columns by name, only column number. Having reviewed the several examples of similar nature on this site, cannot…
bici-sancta
  • 172
  • 1
  • 11
5
votes
1 answer

Different result from the same regression

Why do I get different results from summary(lm(mpg~horsepower + I(horsepower^2),data = Auto))$coef and summary(lm(mpg~poly(horsepower,2) ,data=Auto))$coef PS: I'm practicing the labs of ISLR
A_plus
  • 93
  • 6
5
votes
1 answer

Avoid failing when a factor has new levels in test set

I have a dataset, which I am splitting into train and test subsets in the following way: train_ind <- sample(seq_len(nrow(dataset)), size=(2/3)*nrow(dataset)) train <- dataset[train_ind] test <- dataset[-train_ind] Then, I use it to train a…
Setzer22
  • 1,589
  • 2
  • 13
  • 29
5
votes
1 answer

Running a GLM with a Gamma distribution, but data includes zeros

I'm trying to run a GLM in R for biomass data (reproductive biomass and ratio of reproductive biomass to vegetative biomass) as a function of habitat type ("hab"), year data was collected ("year"), and site of data collection ("site"). My data looks…
Laura
  • 63
  • 1
  • 2
  • 7
5
votes
1 answer

GLM gamma regression in Python statsmodels

Consider the GLM gamma function fitting in the Python package statsmodels. Here is the code: import numpy import statsmodels.api as sm model = sm.GLM(ytrain, xtrain, family=sm.families.Gamma(link = sm.genmod.families.links.identity)).fit() print…
rajatsen91
  • 222
  • 2
  • 9
5
votes
1 answer

How can one debug/get logs for failures of the SparkR Java backend?

I'm bedeviled by a No status is returned. Java SparkR backend might have failed. error when fitting a glm using Spark. The job actually appears to run to completion based on the Spark web ui, but at some point during model fit (it doesn't appear to…
russellpierce
  • 4,583
  • 2
  • 32
  • 44
5
votes
0 answers

parallelizing loop for GLM in R

I am trying to program a parallelized for loop where inside I am trying to optimally find the best GLM to model only the variables that have the lowest p-value to see whether or not I am going to play tennis (yes/no in binary). For example, I have…
lurodrig
  • 99
  • 3
  • 8
5
votes
1 answer

GLMER: Error: (maxstephalfit) PIRLS step-halvings failed to reduce deviance in pwrssUpdate

I am studying the impact of various characteristics on court decisions on specific offences. The dataset is pretty large (28928 observations with 86 level-2 units). I am looking at the decision whether to incarcerate someone or not (=binary outcome…
Jakub Drapal
  • 233
  • 2
  • 3
  • 8
5
votes
1 answer

LC50 / LD50 confidence intervals from multiple regression glm with interaction

I have a quasibinomial glm with two continuous explanatory variables (let's say "LogPesticide" and "LogFood") and an interaction. I would like to calculate the LC50 of the pesticide with confidence intervals at different amounts of food (e. g. the…
Jeremias
  • 73
  • 1
  • 4
5
votes
1 answer

Set G in prior using MCMCglmm, with categorical response and phylogeny

I am new to the MCMCglmm package in R, and rather new to glm models in general. I have a dataset of species traits and whether or not they have been introduced outside of their native range. I would like to test whether being introduced (as a binary…
Mila
  • 51
  • 5