
I have some problems with the svyttest function of the survey package. My goal is to perform a simple t-test of the hypothesis that lower-educated people have greater concerns about immigration. The dependent variable is coded as binary (0|1), and the independent variable is numeric (1 = lowest possible education, 4 = highest possible education).

Columns: 8
$ ID               <dbl> 57385, 157633, 169289, 172583, ~
$ weights          <dbl> 0.274958, 0.110605, 0.090035, 0~
$ state            <chr> "Bayern", "Bayern", "Rheinland-~
$ region           <fct> West, West, West, West, East, W~
$ education        <dbl> 3, 4, 4, 3, 4, 3, 4, 4, 2, 1, 3~
$ opinion_refugee  <dbl> 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0~
$ number_clusters  <dbl> 16, 16, 16, 16, 16, 16, 16, 16,~
$ total_resp_state <int> 192, 192, 45, 29, 36, 189, 45, ~

I created a svydesign object as usual and ran the following code:

test <- svyttest(opinion_refugee ~ education, refugee_data_obj)
print('Estimate')
test$estimate
print('Confidence Interval')
test$conf.int
print('P Value')
test$p.value

But I receive the following error:

Error in svyttest.default(opinion_refugee ~ education, refugee_data_obj) : group must be binary
Bacher
  • looking at `?svyttest` i think you want `education ~ opinion_refugee` ? – Anthony Damico Dec 16 '21 at 10:15
  • This time the function worked, however the new formula doesn't correspond to my stated hypothesis. I'd like to explain in which way the opinion towards refugees is influenced by the respondents education. Your formula explains the causal relationship in the opposite direction. – Bacher Dec 16 '21 at 11:21
  • not sure it's a simple t-test if you have multiple groups on the left hand side of your equation? are you looking for `?svychisq` or another test of association documented in the R survey package's help pages? – Anthony Damico Dec 18 '21 at 11:46

1 Answer


Yes, this is a limitation of the svyttest() function: it accepts only a binary variable as the grouping variable.

There are two possible solutions: either recode the variable as binary, or use a regression-based test instead.
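The recoding option could look roughly like the sketch below. The data are simulated stand-ins for the data set in the question, the cut-off (education levels 1-2 versus 3-4), the column name `low_education`, and the weights-only design are all assumptions made for illustration; the real svydesign call may need cluster or strata arguments.

```r
library(survey)

# Simulated stand-in for the question's data (all values are made up)
set.seed(1)
n <- 500
refugee_data <- data.frame(
  weights   = runif(n, 0.05, 0.5),
  education = sample(1:4, n, replace = TRUE)
)
# Assumed relationship, for illustration only: lower education, more concern
refugee_data$opinion_refugee <-
  rbinom(n, 1, plogis(0.5 - 0.4 * refugee_data$education))

# Hypothetical recode: levels 1-2 become "low" (1), levels 3-4 "high" (0)
refugee_data$low_education <- as.integer(refugee_data$education <= 2)

# Assumed design: weights only, no clustering or stratification
refugee_data_obj <- svydesign(ids = ~1, weights = ~weights, data = refugee_data)

# The grouping variable on the right-hand side is now binary
test <- svyttest(opinion_refugee ~ low_education, refugee_data_obj)
test$estimate   # difference in mean opinion_refugee between the two groups
test$conf.int
test$p.value
```

Because the outcome is coded 0/1, the estimate is a difference in weighted proportions. The choice of cut-off discards information about the four education levels, which is one reason to consider the regression route instead.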

Within the survey package, the svyglm() function can be useful. For example, you can specify the family as gaussian if your dependent variable takes multiple numeric values, or as binomial when it is binary.
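For the binary outcome in the question, a svyglm() fit might look like the sketch below. The data are again simulated and the weights-only design is an assumption. The sketch uses `quasibinomial()` rather than `binomial()`, which the survey documentation suggests for avoiding warnings about non-integer numbers of successes when weights are involved; the point estimates are the same.

```r
library(survey)

# Simulated stand-in for the question's data (all values are made up)
set.seed(1)
n <- 500
refugee_data <- data.frame(
  weights   = runif(n, 0.05, 0.5),
  education = sample(1:4, n, replace = TRUE)
)
# Assumed relationship, for illustration only: lower education, more concern
refugee_data$opinion_refugee <-
  rbinom(n, 1, plogis(0.5 - 0.4 * refugee_data$education))

# Assumed design: weights only, no clustering or stratification
refugee_data_obj <- svydesign(ids = ~1, weights = ~weights, data = refugee_data)

# Survey-weighted logistic regression of opinion on education (treated as numeric)
fit <- svyglm(opinion_refugee ~ education,
              design = refugee_data_obj,
              family = quasibinomial())

summary(fit)$coefficients      # estimate, standard error, t value, p-value
exp(coef(fit)["education"])    # odds ratio per one-step increase in education
```

A negative coefficient on `education` would be consistent with the stated hypothesis. Treating education as numeric assumes a linear effect on the log-odds; `factor(education)` in the formula would relax that assumption.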

Peter Csala
Bacher