Working example from https://glmnet.stanford.edu/articles/glmnet.html#multinomial-regression-family-multinomial-:

library(glmnet)
data(MultinomialExample)
x <- MultinomialExample$x
y <- MultinomialExample$y
cvfit <- cv.glmnet(x, y, family = "multinomial")

cvfit then returns:

Call:  cv.glmnet(x = x, y = y, family = "multinomial") 

Measure: Multinomial Deviance 

      Lambda Index Measure      SE Nonzero
min 0.009791    35   1.409 0.05611      14
1se 0.022619    26   1.455 0.04132       9

Now I would like to extract the 9 Nonzero coefficients. However:

sum(matrix(coef(cvfit)$'1') != 0)
sum(matrix(coef(cvfit)$'2') != 0)
sum(matrix(coef(cvfit)$'3') != 0)

returns 10, 12, 10. Moreover:

length(Reduce(intersect, 
              list(rownames(coef(cvfit)$'1')[matrix(coef(cvfit)$'1') != 0], 
                   rownames(coef(cvfit)$'2')[matrix(coef(cvfit)$'2') != 0], 
                   rownames(coef(cvfit)$'3')[matrix(coef(cvfit)$'3') != 0])))

returns 1.

What does Nonzero = 9 represent at lambda.1se, and how can I recover these variable names and their corresponding coefficients when type.multinomial = "ungrouped" (the default cv.glmnet() setting when family = "multinomial")?

Thanks

user328349

2 Answers

1
set.seed(1839)
X <- replicate(10, rnorm(100))
y <- X[, 1] + rnorm(100)

library(glmnet)
mod <- cv.glmnet(X, y)
as.data.frame(as.matrix(coef(mod$glmnet.fit, s = mod$lambda.1se)))

Returns:

                   s1
(Intercept) 0.1072691
V1          0.6947232
V2          0.0000000
V3          0.0000000
V4          0.0000000
V5          0.0000000
V6          0.0000000
V7          0.0000000
V8          0.0000000
V9          0.0000000
V10         0.0000000
Mark White
  • Hi Mark, thanks for your reply. I was not specific enough in the original question. This case is unique to multinomial classification when the type.multinomial argument is set to 'ungrouped' (the default setting). – user328349 Jul 19 '21 at 17:52
0

From what I can tell the calculation is probably the following:

  • Calculate the number of non-zero coefficients per group (not including the intercept)
  • Report the minimum result across groups (excluding any groups where the result is zero)

In your case the counts of 10, 12 and 10 include the intercept, so two of the groups have 9 non-zero coefficients excluding the intercept and the other has 11, and 9 is reported.

One way to be sure would be to check the code - but I haven't done that.
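
A minimal sketch of how the per-class non-zero coefficients could be pulled out and counted. nonzero_by_class() is a hypothetical helper, not a glmnet function, and the toy list only stands in for what coef(cvfit, s = "lambda.1se") returns (one single-column coefficient matrix per class); its values are invented for illustration. The "minimum across groups" line encodes the guess above, which I have not verified against the glmnet source.

```r
# Hypothetical helper (not part of glmnet): for each class, return the
# non-zero coefficients as a named vector, optionally dropping the intercept.
# Assumes every coefficient matrix has row names, as coef() output does.
nonzero_by_class <- function(coef_list, drop_intercept = TRUE) {
  lapply(coef_list, function(m) {
    m <- as.matrix(m)                          # sparse -> dense one-column matrix
    v <- setNames(as.vector(m[, 1]), rownames(m))
    keep <- v != 0
    if (drop_intercept) keep <- keep & names(v) != "(Intercept)"
    v[keep]
  })
}

# Toy stand-in for coef(cvfit, s = "lambda.1se"); values are made up.
vars <- c("(Intercept)", "V1", "V2", "V3")
toy <- list(
  `1` = matrix(c(0.5, 1.2, 0, -0.3), ncol = 1, dimnames = list(vars, "1")),
  `2` = matrix(c(-0.1, 0, 0.7, 0),   ncol = 1, dimnames = list(vars, "1"))
)

nz <- nonzero_by_class(toy)
counts <- lengths(nz)                           # per-class counts, intercept excluded
any_class <- Reduce(union, lapply(nz, names))   # selected in at least one class
guess <- min(counts[counts > 0])                # the unverified rule guessed above

# With the fit from the question (requires glmnet and the cvfit object):
# nz <- nonzero_by_class(coef(cvfit, s = "lambda.1se"))
# lengths(nz)
# The glmnet documentation also describes a per-class count matrix, dfmat,
# on multinomial fits; assuming its columns line up with the lambda sequence:
# fit <- cvfit$glmnet.fit
# fit$dfmat[, fit$lambda == cvfit$lambda.1se]
```

Each element of nz is a named numeric vector, so names(nz[["1"]]) gives the variables kept for class 1 and nz[["1"]] their coefficients. Comparing lengths(nz) with the Nonzero column is a quick way to test the guess on your own fit.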