I want to run a simple multivariate logistic regression. I put together a small example with binary data below to walk through the problem.
By multivariate regression I mean trying to predict 2+ outcome variables.
> y = matrix(c(0,0,0,1,1,1,1,1,1,0,0,0), nrow=6,ncol=2)
> x = matrix(c(1,0,0,0,0,0,1,1,0,0,0,0,1,1,1,0,0,0,1,1,1,1,0,0,1,1,1,1,1,0,1,1,1,1,1,1), nrow=6,ncol=6)
> x
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    1    1    1    1    1
[2,]    0    1    1    1    1    1
[3,]    0    0    1    1    1    1
[4,]    0    0    0    1    1    1
[5,]    0    0    0    0    1    1
[6,]    0    0    0    0    0    1
> y
     [,1] [,2]
[1,]    0    1
[2,]    0    1
[3,]    0    1
[4,]    1    0
[5,]    1    0
[6,]    1    0
So, variable "x" has 6 samples and each sample has 6 attributes. Variable "y" has 2 binary outcomes for each of the 6 samples. I specifically want to work with binary data.
> fit = glm(y~x-1, family = binomial(logit))
I use "-1" to drop the intercept coefficient. Everything else is, as far as I can tell, standard logistic regression in a multivariate situation.
> fit
Call: glm(formula = y ~ x - 1, family = binomial(logit))
Coefficients:
 data1   data2   data3   data4   data5   data6
  0.00    0.00  -49.13    0.00    0.00   24.57
Degrees of Freedom: 6 Total (i.e. Null); 0 Residual
Null Deviance: 8.318
Residual Deviance: 2.572e-10 AIC: 12
At this point things are starting to look off. I am not sure why the coefficients for data3 and data6 are what they are.
> val <- predict(fit,data.frame(c(1,1,1,1,1,1)), type = "response")
> val
           1            2            3            4            5            6
2.143345e-11 2.143345e-11 2.143345e-11 1.000000e+00 1.000000e+00 1.000000e+00
Logically I am doing something wrong. I am expecting a 1x2 matrix, not 1x6. I want a matrix that tells me the probability of the data frame vector being a "1" (true) in y1 and y2.
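To make the target concrete, here is a minimal sketch of the shape of result I am after. It rests on two things I believe are true but have not fully confirmed for my case: with family = binomial, glm treats a two-column response as (successes, failures) counts rather than as two separate outcomes, and predict only uses newdata when its column names match the predictors in the formula. The sketch sidesteps both by fitting one separate logistic regression per column of y, which is an assumption on my part and not necessarily the "right" multivariate model. The names train, new_obs, probs, and out are just illustrative.

```r
# Toy data from the question
y <- matrix(c(0,0,0,1,1,1, 1,1,1,0,0,0), nrow = 6, ncol = 2)
x <- matrix(c(1,0,0,0,0,0, 1,1,0,0,0,0, 1,1,1,0,0,0,
              1,1,1,1,0,0, 1,1,1,1,1,0, 1,1,1,1,1,1), nrow = 6, ncol = 6)

# Keep predictors in a data frame so newdata can be matched by column name
train   <- as.data.frame(x)                               # columns V1..V6
new_obs <- as.data.frame(matrix(1, nrow = 1, ncol = 6))   # same names V1..V6

# Assumption: one independent logistic fit per outcome column, no intercept
probs <- sapply(seq_len(ncol(y)), function(j) {
  d <- data.frame(out = y[, j], train)
  # The toy data are perfectly separable, so glm may emit warnings;
  # they are silenced here only to keep the example readable
  fit_j <- suppressWarnings(glm(out ~ . - 1, data = d, family = binomial(logit)))
  predict(fit_j, newdata = new_obs, type = "response")
})

# Reshape into a 1x2 matrix: P(y1 = 1) and P(y2 = 1) for the new sample
probs <- matrix(probs, nrow = 1, dimnames = list(NULL, c("y1", "y2")))
probs
```

This gives one row with two probabilities, which is the 1x2 layout I expected from the original call. Because the example data are perfectly separable, the individual coefficients should not be trusted; only the shape of the output matters here.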
Any help would be appreciated.
Note: I updated the ending of my question based on the reply from Mario.