7

Why is it not possible to pass only one explanatory variable to the glmnet function from the glmnet package, when it is possible with the glm function from base R? The code and the error are below:

> modelX<-glm( ifelse(train$cliks <1,0,1)~(sparseYY[,40]), family="binomial")
> summary(modelX)

Call:
glm(formula = ifelse(train$cliks < 1, 0, 1) ~ (sparseYY[, 40]), 
    family = "binomial")

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-0.2076  -0.2076  -0.2076  -0.2076   2.8641  

Coefficients:
               Estimate Std. Error  z value Pr(>|z|)    
(Intercept)    -3.82627    0.00823 -464.896   <2e-16 ***
sparseYY[, 40] -0.25844    0.15962   -1.619    0.105    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 146326  on 709677  degrees of freedom
Residual deviance: 146323  on 709676  degrees of freedom
AIC: 146327

Number of Fisher Scoring iterations: 6

> modelY<-glmnet( y =ifelse(train$cliks <1,0,1), x =(sparseYY[,40]), family="binomial"  )
Error in if (is.null(np) | (np[2] <= 1)) stop("x should be a matrix with 2 or more columns")
Marcin
  • Note that you can bind an all-zero column to a one-column x, and glmnet will return the appropriate first coefficient and a coefficient of zero for the all-zero column: `x = cbind(sparseYY[, 40], 0)` – Wart Sep 27 '17 at 20:28
  • The `glmnet` package implements regularization methods. What would be the purpose of applying LASSO or ridge to fit a model with only one explanatory variable? Why would you want to shrink your one coefficient (ridge) or set it equal to zero (LASSO)? These methods only start to make sense at `k >= 2`. – suckrates Jan 02 '18 at 11:01
  • @AlvaroFuentes fair enough. My mind had to be limited that day.. – Marcin Jan 11 '18 at 11:51

3 Answers

11

Here is an answer I got to this question from the maintainer of the package (Trevor Hastie):

glmnet is designed to select variables from a (large) collection. Allowing for 1 variable would have created a lot of edge case programming, and I was not interested in doing that. Sorry!
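Given that design decision, one workaround (suggested in a comment on the question) is to pad the single predictor with an all-zero column. Below is a minimal sketch of that idea. The simulated `x1` and `y` are assumptions standing in for `sparseYY[, 40]` and the thresholded `train$cliks`, and the name `x_padded` is made up for illustration.

```r
# Sketch only: simulated data stand in for the question's real data.
library(glmnet)

set.seed(1)
n  <- 500
x1 <- rnorm(n)                              # the single real predictor
y  <- rbinom(n, 1, plogis(-1 + 0.8 * x1))   # binary response

# Pad with an all-zero column so that x has two columns.
x_padded <- cbind(x1, zero = 0)

fit <- glmnet(x = x_padded, y = y, family = "binomial")

# Coefficients at the smallest lambda on the fitted path (least shrinkage).
coef(fit, s = min(fit$lambda))

# Unpenalized reference fit for comparison.
coef(glm(y ~ x1, family = "binomial"))
```

The coefficient for `zero` should come back as exactly zero, while the `x1` coefficient at the smallest lambda should be close to, but not necessarily identical to, the `glm` estimate, because some penalty is still applied.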

tobsecret
1

I don't know why, but it's some kind of internal limitation. It has nothing to do with the family, as Roman claimed in another answer.

glmnet(x = as.matrix(iris[2:4]), y = as.matrix(iris[1]))
## long output
glmnet(x = as.matrix(iris[2]), y = as.matrix(iris[1]))
Error in glmnet(x = as.matrix(iris[2]), y = as.matrix(iris[1])) : 
  x should be a matrix with 2 or more columns

It's a simple check in the code https://github.com/cran/glmnet/blob/master/R/glmnet.R#L20
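To see how that check behaves, here is a small base R sketch. `check_x` is a hypothetical helper that copies the condition shown in the question's error message; it is not glmnet's actual code, which may differ between versions. It also suggests why the question's call fails differently from the one above: indexing a single column with `[, j]` typically drops the result to a plain vector, so `dim()` returns `NULL`.

```r
# Hypothetical helper mimicking the dimension check quoted in the error message.
check_x <- function(x) {
  np <- dim(x)
  if (is.null(np) | (np[2] <= 1)) stop("x should be a matrix with 2 or more columns")
  invisible(TRUE)
}

m <- matrix(rnorm(20), ncol = 4)

dim(m[, 1])                 # NULL: single-column indexing drops to a vector
dim(m[, 1, drop = FALSE])   # still a matrix, but with only one column
dim(m[, 1:2])               # a two-column matrix

# Capture the error message (or TRUE on success) for each input.
res_vector <- tryCatch(check_x(m[, 1]), error = conditionMessage)
res_onecol <- tryCatch(check_x(m[, 1, drop = FALSE]), error = conditionMessage)
res_twocol <- tryCatch(check_x(m[, 1:2]), error = conditionMessage)
```

With a plain vector, the `if` condition itself has length zero, so R raises an error from `if` before `stop()` is ever reached. With a one-column matrix, the `stop()` message is what you see. Only the two-column input passes.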

CoderGuy123
-2

Because the documentation says so.

For family="binomial" should be either a factor with two levels, or a two-column matrix of counts or proportions (the second column is treated as the target class; for a factor, the last level in alphabetical order is the target class).

You have two options. Either construct a matrix where two columns represent counts, or, convert x into a factor with two levels.

Roman Luštrik
  • Still doesn't work: `modelY <- glmnet(y = as.factor(ifelse(train$cliks < 1, 0, 1)), x = as.factor(sparseYY[, 40]), family = "binomial")` gives `Error in if (is.null(np) | (np[2] <= 1)) stop("x should be a matrix with 2 or more columns") :` – Marcin Mar 24 '15 at 11:38
  • @MarcinKosinski without a reproducible example, I'm afraid I can't help out any further. Perhaps you could try constructing a full dataset prior to passing it to glmnet? – Roman Luštrik Mar 25 '15 at 08:01