
I am learning to use the excellent "expss" R package.

I need to know whether this package can build a contingency table between a multiple-choice variable and a categorical variable while taking a weight variable into account.

The categorical variable is "sex" in this dataframe, and the weight variable is "survey_weight":

library(tibble)  # provides tribble()
demo <- tribble(
~dummy1, ~dummy2, ~dummy3, ~survey_weight, ~sex,
      1,       0,       0,          1.5,  "male",
      1,       1,       0,          1.5,  "female",
      1,       1,       1,           .5,  "female",
      0,       1,       1,          1.5,  "male",
      1,       1,       1,           .5,  "male",
      0,       0,       1,           .5,  "male",
)
demo 

I need to calculate the percentage based on total respondents who answered the question, and not on total responses.

Thanks in advance!

Sebastian
  • Can you show the expected output? Do you need `demo %>% group_by(sex) %>% summarise_at(vars(starts_with('dummy')), ~ weighted.mean(., w = survey_weight))` – akrun Feb 06 '20 at 20:33
  • If you can provide more details about which function to use from `expss`, it would be easier to understand. Without that it is a bit unclear as to what you expect – akrun Feb 06 '20 at 21:06

2 Answers


Maybe we can use `cro_cpct`, which computes column percentages and accepts a `weight` argument:

library(expss)
calculate(demo, cro_cpct(list(dummy1, dummy2, dummy3), weight = survey_weight, sex))
#                                 
# |              |    sex |      |
# |              | female | male |
# | ------------ | ------ | ---- |
# |            0 |        | 50.0 |
# |            1 |    100 | 50.0 |
# | #Total cases |      2 |  4.0 |
# |            0 |        | 50.0 |
# |            1 |    100 | 50.0 |
# | #Total cases |      2 |  4.0 |
# |            0 |     75 | 37.5 |
# |            1 |     25 | 62.5 |
# | #Total cases |      2 |  4.0 |
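
As a cross-check of the `1` rows above, here is a minimal base R sketch that computes the weighted share of 1s per sex for each dummy. It rebuilds the question's data with `data.frame`, and the name `pct_by_sex` is only illustrative, not part of `expss`.

```r
# Same data as in the question, rebuilt with base R only
demo <- data.frame(
  dummy1 = c(1, 1, 1, 0, 1, 0),
  dummy2 = c(0, 1, 1, 1, 1, 0),
  dummy3 = c(0, 0, 1, 1, 1, 1),
  survey_weight = c(1.5, 1.5, 0.5, 1.5, 0.5, 0.5),
  sex = c("male", "female", "female", "male", "male", "male")
)

# For each dummy, the weighted mean of a 0/1 column within each sex
# is the weighted percentage of respondents who selected it
pct_by_sex <- sapply(
  c("dummy1", "dummy2", "dummy3"),
  function(v) {
    sapply(split(demo, demo$sex), function(d) {
      100 * weighted.mean(d[[v]], w = d$survey_weight)
    })
  }
)
pct_by_sex  # rows: female, male; columns: dummy1, dummy2, dummy3
```

The weights are passed through the `w` argument of `weighted.mean`, so each respondent counts in proportion to `survey_weight`.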
akrun
library(expss)
demo = text_to_columns('
 dummy1   dummy2   dummy3  survey_weight  sex
      1        0        0            1.5  male
      1        1        0            1.5  female
      1        1        1             .5  female
      0        1        1            1.5  male
      1        1        1             .5  male
      0        0        1             .5  male
')


demo %>% 
    tab_cells(mdset(dummy1 %to% dummy3)) %>%  # 'mdset' designates that we have a multiple dichotomy set
    tab_cols(sex) %>%  # columns
    tab_weight(survey_weight) %>% # weight
    tab_stat_cpct() %>% # statistic
    tab_pivot() 

# |              |    sex |      |
# |              | female | male |
# | ------------ | ------ | ---- |
# |       dummy1 |    100 | 50.0 |
# |       dummy2 |    100 | 50.0 |
# |       dummy3 |     25 | 62.5 |
# | #Total cases |      2 |  4.0 |

# shorter notation with the same result
calc_cro_cpct(demo, mdset(dummy1 %to% dummy3), sex, weight = survey_weight)
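
To illustrate the difference between a respondent base and a response base, here is a rough base R sketch that does the arithmetic by hand. It assumes that "answered the question" means ticking at least one of the three items (true for every row in this data), and the names `items`, `answered`, `chosen`, `respondents`, and `responses` are hypothetical helpers, not `expss` objects.

```r
# Same data as in the question, rebuilt with base R only
demo <- data.frame(
  dummy1 = c(1, 1, 1, 0, 1, 0),
  dummy2 = c(0, 1, 1, 1, 1, 0),
  dummy3 = c(0, 0, 1, 1, 1, 1),
  survey_weight = c(1.5, 1.5, 0.5, 1.5, 0.5, 0.5),
  sex = c("male", "female", "female", "male", "male", "male")
)
items <- c("dummy1", "dummy2", "dummy3")

# Assumption: a respondent "answered" if at least one item is ticked
answered <- rowSums(demo[items]) > 0
d <- demo[answered, ]

# Weighted number of respondents choosing each item, per sex
chosen <- rowsum(as.matrix(d[items]) * d$survey_weight, group = d$sex)

# Base 1: weighted respondents per sex (female 2, male 4)
respondents <- as.vector(tapply(d$survey_weight, d$sex, sum))
pct_respondents <- 100 * chosen / respondents

# Base 2: weighted responses per sex (female 4.5, male 6.5)
responses <- rowSums(chosen)
pct_responses <- 100 * chosen / responses

pct_respondents  # columns can sum to more than 100 within a sex
pct_responses    # each row sums to 100
```

With the respondent base, the percentages within a sex can add up to more than 100 because one person may tick several items; with the response base, they always add up to 100.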
Gregory Demin