I fitted a logistic regression model and I want to run an influence analysis for each factor.
That is, I take the squared Wald statistic (z-value) of a specific factor and divide it by the sum of the squared Wald statistics of all factors. That ratio is my estimate of the influence of that factor.
My problem is that each factor is split into its levels in the model output.
Here is an example:
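For concreteness, the ratio described above can be sketched in R as follows. This is only a minimal sketch on simulated data: the names `x1`, `x2`, `fit` and `influence` are made up for illustration, and leaving the intercept out of the sum is an assumption, not something the definition above settles.

```r
# Simulated data with two numeric predictors (illustrative only)
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, size = 1, prob = plogis(0.5 * x1 - x2))

fit <- glm(y ~ x1 + x2, family = binomial("logit"))

# Wald z-values from the coefficient table
z <- summary(fit)$coefficients[, "z value"]

# Assumption: the intercept is not a factor of interest, so drop it
z <- z[names(z) != "(Intercept)"]

# Influence = squared z-value divided by the sum of squared z-values
influence <- z^2 / sum(z^2)
print(round(influence, 3))
```

With numeric predictors there is one z-value per variable, so the shares sum to 1 across variables. With factor predictors the same code would return one share per level, which is exactly the problem described next.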
model <- glm(formula = form,
             family = binomial("logit"),
             data = Train)
View(summary(model)$coefficients)
In the table we can see that the factor dom_time_Colnames is divided into 4 levels. The same thing happens with first_byte_downdload_Colnames.
I want z-values for the factors as a whole, not for their individual levels.
How can I do that? anova() seems like a good idea, but it never finishes running for me. I am looking for a creative solution that gives output like the z-value in the glm summary, or the deviance in anova, for each whole factor rather than for its levels.
Here is a reproducible example:
Data <- data.frame(Species = iris$Species)
# Bin every numeric column of iris into deciles, turning it into a factor
for (i in 1:ncol(iris)) {
    if (is.numeric(iris[, i])) {
        result <- quantile(x = iris[, i], probs = seq(0, 1, 0.1))
        out <- cut(iris[, i], breaks = unique(result), include.lowest = TRUE)
        Data <- data.frame(Data, out)
        colnames(Data)[length(Data)] <- colnames(iris)[i]
    }
}
# Random binary response, unrelated to the predictors
Data$y <- rbinom(n = nrow(Data), size = 1, prob = 0.1)
form <- formula(y ~ .)
model <- glm(formula = form,
             family = binomial("logit"),
             data = Data)
View(summary(model)$coefficients)
We can see that each factor (for example the Sepal and Petal variables) is split into its levels.
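One possible direction, offered as a hedged sketch rather than a definitive answer: a joint Wald chi-square per factor can be computed from the fitted coefficients and their covariance matrix, grouping the level coefficients by the `assign` attribute of the model matrix. No refitting is needed, so it may avoid the long run time seen with anova(). The data below are simulated with only a few levels per factor so that nothing is aliased. The names `f1`, `f2`, `d`, `fit`, `wald` and `influence` are hypothetical, and using this statistic in place of the squared z-value is an assumption.

```r
# Simulated data: two factors with 3 and 4 levels (illustrative only)
set.seed(1)
n <- 400
d <- data.frame(
  f1 = factor(sample(c("a", "b", "c"), n, replace = TRUE)),
  f2 = factor(sample(c("p", "q", "r", "s"), n, replace = TRUE))
)
eta <- c(a = -0.5, b = 0.3, c = 0.8)[as.character(d$f1)]
d$y <- rbinom(n, size = 1, prob = plogis(eta))

fit <- glm(y ~ f1 + f2, family = binomial("logit"), data = d)

b <- coef(fit)
V <- vcov(fit)
# assign maps each coefficient to its term (0 = intercept, 1 = f1, 2 = f2)
assign_idx  <- attr(model.matrix(fit), "assign")
term_labels <- attr(terms(fit), "term.labels")

# Joint Wald statistic per term: b_k' V_kk^{-1} b_k
wald <- sapply(seq_along(term_labels), function(k) {
  idx <- which(assign_idx == k)
  bk  <- b[idx]
  drop(t(bk) %*% solve(V[idx, idx, drop = FALSE]) %*% bk)
})
names(wald) <- term_labels

# Degrees of freedom = number of coefficients belonging to each term
df <- sapply(seq_along(term_labels), function(k) sum(assign_idx == k))

result <- data.frame(
  term      = term_labels,
  wald      = wald,
  df        = df,
  p_value   = pchisq(wald, df = df, lower.tail = FALSE),
  influence = wald / sum(wald)
)
print(result, row.names = FALSE)
```

For a term with a single coefficient this statistic equals the squared z-value, so it reduces to the ratio described at the top. On the decile-binned iris example, some coefficients may be aliased (NA) or nearly separated, in which case the `solve()` step would need extra care. A likelihood-ratio alternative in base R is `drop1(fit, test = "Chisq")`, which also reports one row per factor but refits the model once per term.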