
I am working with a large matrix with N = 40 samples and P = 7130 features. I am trying to fit cv.glmnet() for ridge regression, but I get an error when I do.
The dimensions of the dataset are (40, 7130).
The call to cv.glmnet() is as follows:

ridge2_cv <- cv.glmnet(x, y,
                   ## type.measure: loss to use for cross-validation.
                   type.measure = "deviance",
                   ## K = 10 is the default.
                   ## nfolds = 10 is the default.
                   nfolds = 10,
                   ## Multinomial regression
                   family = "multinomial",
                   ## ‘alpha = 1’ is the lasso penalty, and ‘alpha = 0’ the ridge penalty.
                   alpha = 0)

Here x is a large matrix with 285160 elements, and y is the multi-class response variable of length 40.
I keep getting this error when I run the call above.

Error in cbind2(1, newx) %*% (nbeta[[i]]) :
  invalid class 'NA' to dup_mMatrix_as_dgeMatrix
In addition: Warning messages:
1: In lognet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs, :
  one multinomial or binomial class has fewer than 8 observations; dangerous ground
2: In lognet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs, :
  one multinomial or binomial class has fewer than 8 observations; dangerous ground

botloggy
  • Can you edit your question to show us `str(x)` and `str(y)` (and maybe `table(y)`)? – Ben Bolker Oct 22 '18 at 23:21
  • @BenBolker I figured out the problem when I checked `typeof(x)` and `typeof(y)`. The data frame had been read in as character; I had to use `read.table` and convert it for my case. This solved the problem. Thanks for your suggestion, it gave me the idea for the fix. – botloggy Oct 23 '18 at 03:01
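
For illustration, here is a minimal sketch of the failure mode described in the comment above, using simulated data (not the original 40 × 7130 dataset): one stray character column turns the whole matrix into character, which is what cv.glmnet() chokes on, and coercing to a numeric matrix fixes it.

library(glmnet)

## Simulated stand-in for the real data: 40 samples, 20 numeric features
set.seed(1)
df <- as.data.frame(matrix(rnorm(40 * 20), nrow = 40))
df$gene_id <- paste0("g", seq_len(nrow(df)))   # stray character column

bad_x <- as.matrix(df)
typeof(bad_x)   # "character" -- the root cause of the error above

## Fix: keep only the numeric columns and coerce to a double matrix
x <- as.matrix(df[sapply(df, is.numeric)])
storage.mode(x) <- "double"
typeof(x)       # "double"

y <- factor(sample(c("A", "B", "C"), 40, replace = TRUE))
table(y)        # a class with fewer than 8 observations triggers the lognet() warning

fit <- cv.glmnet(x, y, family = "multinomial", alpha = 0, nfolds = 5)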

1 Answer


Can you try data.matrix() on the matrix instead of as.matrix()? I remember trying something similar.

ridge2_cv <- cv.glmnet(data.matrix(x), y,
               ## type.measure: loss to use for cross-validation.
               type.measure = "deviance",
               ## K = 10 is the default.
               ## nfolds = 10 is the default.
               nfolds = 10,
               ## Multinomial regression
               family = "multinomial",
               ## ‘alpha = 1’ is the lasso penalty, and ‘alpha = 0’ the ridge penalty.
               alpha = 0)
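
For context, a small sketch with made-up data (not the asker's) of how data.matrix() differs from as.matrix() here: as.matrix() on a data frame with any non-numeric column returns a character matrix, while data.matrix() always returns a numeric one.

df <- data.frame(f1 = c(1.2, 3.4, 5.6),
                 f2 = c("7", "8", "9"),   # numbers stored as text
                 stringsAsFactors = FALSE)

typeof(as.matrix(df))    # "character" -- glmnet cannot use this
typeof(data.matrix(df))  # "double"    -- numeric, so cv.glmnet() accepts it

## Caveat: data.matrix() maps character/factor columns to their factor codes
## rather than their face values, so truly numeric columns are safer converted
## explicitly, e.g. df$f2 <- as.numeric(df$f2)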
Ashok KS
  • Yes, your answer is right in some cases, but my issue was with compatibility: the data was being read as character when I transposed the matrix. I changed it to be read as double and it worked. Thanks for your suggestion. – botloggy Oct 23 '18 at 03:04