I am trying to run univariate logistic regressions. The input is a data frame with one response variable, some demographics (age, gender and ethnicity) and more than 100 predictor variables. To analyse it I have been using:
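# Function: fit one logistic regression for a single predictor and return
# the coefficient table, odds ratios and confidence intervals
proc_glm <- function(predictors) {
  univariate <- glm(Data$Outcome ~ predictors, family = binomial)
  return(cbind(coef(summary(univariate)), OR = exp(coef(univariate)), exp(confint(univariate))))
}
# Call the function on each predictor column
glm_output <- lapply(Data[5:150], proc_glm)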
This works fine on the full data set. I then subset the data by ethnicity using:
Data1 <- subset(Data,Ethnicity==0)
No obvious issue: Data1 has fewer rows than Data but the same number of variables. There is no missing data.
I then tried to run the same analysis as before, substituting Data1 for Data in both places, but I get the following error:
Error in cbind(coef(summary(univariate)), OR = exp(coef(univariate)), : number of rows of matrices must match (see arg 3)
I'm not sure what I changed to cause the error. I'm working in RStudio, version 1.2.1335.
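One thing that may be worth checking (an assumption on my part, not something I have confirmed on my data): if a predictor takes only one value within the subset, its coefficient cannot be estimated and comes back as NA. In that case coef() keeps the NA entry while coef(summary()) drops it, so the pieces passed to cbind() could end up with different numbers of rows. A minimal sketch with a made-up data frame `toy` (hypothetical, not my real data):

```r
# Toy data (hypothetical): x is constant, so its coefficient cannot be estimated
toy <- data.frame(y = c(0, 1, 0, 1, 1, 0), x = rep(1, 6))
fit <- glm(y ~ x, data = toy, family = binomial)

# coef() returns one entry per model term, with NA for the constant predictor
coef(fit)

# coef(summary()) only has rows for the coefficients that were estimated
nrow(coef(summary(fit)))
```

Here coef(fit) has two entries but the summary table has one row, which is the kind of mismatch the error message describes.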
Data looks like this:
Data <- cbind(
  data.frame(
    Age = sample(20:80, 50),
    Gender = sample(0:1, size = 50, replace = TRUE),
    Ethnicity = sample(0:2, size = 50, replace = TRUE),
    Outcome = sample(0:1, size = 50, replace = TRUE)
  ),
  data.frame(replicate(100, sample(0:2, 50, replace = TRUE)))
)
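To repeat the subset step on this example, something like the sketch below could be used. It assumes the predictors are every column after the fourth (the example has 104 columns, so the range 5:150 from my real data would not apply here). The seed and the name `n_distinct_values` are additions for illustration only.

```r
set.seed(1)  # assumption: seed added only so the example is repeatable
Data <- cbind(
  data.frame(
    Age = sample(20:80, 50),
    Gender = sample(0:1, size = 50, replace = TRUE),
    Ethnicity = sample(0:2, size = 50, replace = TRUE),
    Outcome = sample(0:1, size = 50, replace = TRUE)
  ),
  data.frame(replicate(100, sample(0:2, 50, replace = TRUE)))
)
Data1 <- subset(Data, Ethnicity == 0)

# Same number of columns, fewer rows in the subset
dim(Data)
dim(Data1)

# Confirm there are no missing values in the subset
any(is.na(Data1))

# Count distinct values per predictor within the subset; a count of 1
# would mean that predictor is constant in Data1
n_distinct_values <- sapply(Data1[5:ncol(Data1)], function(x) length(unique(x)))
names(n_distinct_values)[n_distinct_values == 1]
```

With random example data a constant predictor in the subset is unlikely, so the last line will probably return nothing here; the same check on the real Data1 may behave differently.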