I'm trying to fit a Lasso regression in R, but I can't get my X and Y defined correctly.
#load data
>test.data<-read.spss("C:\\Users\\Inhib\\OneDrive\\documents\\dummy.sav",use.value.labels=TRUE, to.data.frame=TRUE)
>test.data #testing my data, it's all there so I won't add it here
#take columns 2 to 6 for X
>X<-as.matrix(test.data[,2:6])
# Column 1 is the predicted variable Y
>Y<-as.matrix(test.data[,1])
#Ok, let's fit it
fit<- glmnet(x, y, family="gaussian", alpha=0, lambda=0.001)
Then I get this error message:
"Error in glmnet(x, y, family = "gaussian", alpha = 0, lambda = 0.001) :
number of observations in y (100) not equal to the number of rows of x (222)"
All columns have the same length (222), but the error says that there are only 100 observations in Y and 222 rows in X.
#So I checked for Y here
>length(Y)
[1] 222
#Then checked for X
> length(X)
[1] 1110
Clearly I'm missing something. The sizes obviously differ, since X is 222 rows by 5 columns, but how can I make this work so that the error goes away? I've tried many approaches and have been working on this for hours. It is really stopping me from making progress, and Google hasn't been much help. I would be grateful for a solution.
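
A possible explanation, sketched under assumptions rather than confirmed: R is case-sensitive, so the glmnet call above uses x and y, which are different objects from the X and Y built earlier. If an older y with 100 values is still in the workspace, that would produce this exact message. Also, length() on a matrix counts every cell (222 * 5 = 1110), so dim() or nrow() is the better check. The data frame demo.data below is a hypothetical stand-in filled with random numbers, not the real dummy.sav. Note too that in glmnet alpha = 1 gives the lasso penalty, while alpha = 0 gives ridge.

```r
# Hypothetical stand-in for test.data: 222 rows, response in column 1,
# five predictors in columns 2 to 6 (random numbers, for illustration only).
set.seed(1)
demo.data <- as.data.frame(matrix(rnorm(222 * 6), nrow = 222))

X <- as.matrix(demo.data[, 2:6])
Y <- as.matrix(demo.data[, 1])

# length() on a matrix counts every cell: 222 rows * 5 columns.
length(X)

# dim() and nrow() report the shape, which is what the error message compares.
dim(X)
nrow(X) == nrow(Y)

# R is case-sensitive: X and x are different objects.
# exists() shows whether some other object with the lower-case name is in scope.
exists("x")
exists("y")

# Fit with the upper-case objects defined above.
# alpha = 1 is the lasso penalty in glmnet; alpha = 0 is ridge.
if (requireNamespace("glmnet", quietly = TRUE)) {
  fit <- glmnet::glmnet(X, Y, family = "gaussian", alpha = 1, lambda = 0.001)
  print(coef(fit))
}
```

If exists("x") or exists("y") returns TRUE in the real session, running dim(x) and length(y) should show whether those leftover objects are the ones with 222 rows and 100 values.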