I want to run a Wald test to evaluate the statistical significance of each coefficient in the model using the regTermTest function of the survey package (as described here).

The syntax of regTermTest calls for the model followed by the test.terms, but if you list multiple test terms it seems to evaluate them all together rather than separately.

library(caret) # for the GermanCredit sample dataset
data(GermanCredit)
mod1 <- glm(Class ~ Age + as.factor(ForeignWorker) + Property.RealEstate + Housing.Own + CreditHistory.Critical, data = GermanCredit, family = binomial(link='logit'))
library(survey)
regTermTest(mod1, c("Age", "ForeignWorker", "Property.RealEstate", "Housing.Own", "CreditHistory.Critical"))

Of course, I could test each term in its own call. The following code produces the desired result, but it's clunky and repetitive when dealing with lots of variables:

regTermTest(mod1, "Age")
regTermTest(mod1, "ForeignWorker")
regTermTest(mod1, "Property.RealEstate")
regTermTest(mod1, "Housing.Own")
regTermTest(mod1, "CreditHistory.Critical")

I've tried extracting the coefficient names into a vector and inserting it into a for loop, but it didn't work (it combines all the terms into one evaluation rather than separately estimating their importance):

vars <- names(mod1$coefficients)
vars <- vars[-1]
for (i in 1:length(vars)) {
     iv = vars[i]
     rtest <- regTermTest(mod1, iv)
}

How can I efficiently code this?

coip
  • *it didn't work*...we often get this unhelpful remark. What was the error or undesired result? – Parfait Feb 20 '18 at 21:24
  • The undesired result is that it combines all the terms into one evaluation rather than separately estimating their importance. I've edited the question to add this information. – coip Feb 20 '18 at 21:35
  • Yes, saving the results in a list or other object would be desirable. – coip Feb 20 '18 at 21:42
  • I have edited the question to clarify this. The line-by-line code was the desired result--it's easy to do, but poor programming as it's repetitive. I'm looking for a better solution than copying and pasting code and changing variable names manually. – coip Feb 20 '18 at 21:58

1 Answer

(Updated)

The *apply family can help, depending on how you want things to look.

lapply(names(mod1$model)[-1], function(x) regTermTest(mod1, x))

sapply(names(mod1$model)[-1], function(x) regTermTest(mod1, x))

You'll have a bit of work to do if you want to display the results in a nice way.
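As a rough sketch of that extra work, the tests can be collected into a small data frame. This assumes the object returned by `regTermTest` has a `p` component holding the p-value (inspect one result with `str()` to confirm for your version of survey). The built-in `mtcars` data is only a stand-in for the question's data, and `mod`, `terms_to_test`, `tests`, and `results` are illustrative names.

```r
library(survey)

# Stand-in model on built-in data (an assumption for illustration only)
mod <- glm(vs ~ mpg + as.factor(gear), data = mtcars, family = binomial)

# Model-frame column names, minus the response, as in the answer above
terms_to_test <- names(mod$model)[-1]

# One separate Wald test per term, kept in a list
tests <- lapply(terms_to_test, function(x) regTermTest(mod, x))

# Assumes each result exposes its p-value as `$p`
results <- data.frame(
  term = terms_to_test,
  p    = sapply(tests, function(rt) as.numeric(rt$p))
)
print(results)
```

Keeping the full `tests` list around means the complete output for any single term is still available, for example `tests[[2]]`.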

(Explanation of update).

The original solution just followed the questioner's idea of using names(mod1$coefficients). But that won't work if there is a factor variable, because mod1$coefficients names each factor coefficient by concatenating the variable name with a non-default level, the way R regression models always deal with categorical variables. That confuses regTermTest: it is handed a name that does not correspond to anything it can look up, so it returns a baffling error message.
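The mismatch is easy to see with base R alone. The sketch below uses the built-in `mtcars` data as a stand-in for the question's data (an assumption for illustration), and `mod`, `coef_names`, `term_labels`, `frame_names`, and `bad` are illustrative names.

```r
# Stand-in model with one numeric term and one factor term
mod <- glm(vs ~ mpg + as.factor(gear), data = mtcars, family = binomial)

coef_names  <- names(coef(mod))                 # one name per coefficient
term_labels <- attr(terms(mod), "term.labels")  # one label per model term
frame_names <- names(mod$model)[-1]             # model-frame columns minus the response

print(coef_names)
# "(Intercept)" "mpg" "as.factor(gear)4" "as.factor(gear)5"
print(term_labels)
# "mpg" "as.factor(gear)"

# Coefficient names that are not term labels: these are the ones that
# cause trouble when passed as test terms
bad <- setdiff(coef_names[-1], term_labels)
print(bad)
# "as.factor(gear)4" "as.factor(gear)5"
```

For a main-effects model like this one, `frame_names` and `term_labels` are identical. If the formula contains interactions, the model frame has no column for them, so `attr(terms(mod), "term.labels")` may be the safer source of names.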

ngm
  • Thanks for this. It worked on the sample data but gave an error on my real data: `Error in solve.default(V) : 'a' is 0-diml`. I realized this was because my actual data contained factors but the sample data did not. I've updated the sample data set above with this dilemma. I apologize, but do you know any tweaks to your code that would accept regression formulas with factor variables (as per the updated sample data above)? – coip Feb 20 '18 at 22:01
  • What's supposed to happen for factor variables? I do not believe Wald stats can run on categorical variables. Consider then a `tryCatch` to return NAs for those problem elements: `sapply(names(mod1$coefficients)[-1], function(x) tryCatch(regTermTest(mod1, x), error = function(e) NA))` – Parfait Feb 20 '18 at 22:55
  • I don't know why but `regTermTest` on a factor variable on my real data works fine: `regTermTest(mod1, "ForeignWorker")` where `ForeignWorker` is a factor. However, I cannot get it to work with the sample data above when I factor `ForeignWorker` there. I see someone else has had the same `Error in solve.default(V) : 'a' is 0-diml` error when using `regTermTest`, but didn't seem to find a solution: https://stackoverflow.com/questions/40216353/r-error-in-solve-defaultv-a-is-0-diml-in-regtermtest-function – coip Feb 20 '18 at 23:58
  • `regTermTest` works fine with factor variables. Where it gets confused is when you give it a variable name that isn't in the dataset. See my edited answer for an explanation of what is going on. – ngm Feb 21 '18 at 15:02