I have a data frame that looks similar to this:
case_id | pol_demo | pol_demo_online | pol_post_online | pol_petition | pol_cntct_polit | pol_party | pol_other | pol_demo_illegal |
---|---|---|---|---|---|---|---|---|
1311 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
97 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
5480 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
2531 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
2291 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
2064 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
The case_id column is the unique id of a survey respondent, so it has as many unique values as there are rows in the data frame. The other columns are forms of behavior that the respective respondent did (1) or did not do (0). The full data frame has more behavior columns than shown here (17 in total), but you get the idea.
I would like to transform this into an adjacency matrix. That is, I would like a matrix with 17 rows and 17 columns (one for each behavior), where each cell is the number of respondents who combined the two forms of behavior (the sum for each pair); the diagonal then holds the number of respondents who did that behavior at all. Using only the columns in the example, the matrix would look like this:
| | pol_demo | pol_demo_online | pol_post_online | pol_petition | pol_cntct_polit | pol_party | pol_other | pol_demo_illegal |
---|---|---|---|---|---|---|---|---|
pol_demo | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 2 |
pol_demo_online | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 2 |
pol_post_online | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 2 |
pol_petition | 3 | 3 | 3 | 4 | 3 | 3 | 3 | 2 |
pol_cntct_polit | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 2 |
pol_party | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 2 |
pol_other | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 2 |
pol_demo_illegal | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
That is, pol_demo is combined with pol_demo_online 3 times, but with pol_demo_illegal only 2 times, and so on.
I have a hunch that I need to work with outer(), but I could not figure it out. I would be happy with a tidy solution, but really, any help is much appreciated!
This is a snippet of the data:
dat <- structure(list(pol_demo = c(1, 0, 0, 1, 0, 1), pol_demo_online = c(1,
0, 0, 1, 0, 1), pol_post_online = c(1, 0, 0, 1, 0, 1), pol_petition = c(1,
0, 0, 1, 1, 1), pol_cntct_polit = c(1, 0, 0, 1, 0, 1), pol_party = c(1,
0, 0, 1, 0, 1), pol_other = c(1, 0, 0, 1, 0, 1), pol_demo_illegal = c(0,
0, 0, 1, 0, 1), help_shopping = c(1, 1, 0, 1, 1, 1), help_childcare = c(0,
1, 0, 1, 0, 1), help_general = c(1, 1, 0, 1, 1, 1), help_emo = c(1,
1, 0, 1, 1, 1), help_symb = c(1, 1, 0, 1, 0, 1), help_fin = c(1,
0, 0, 1, 1, 1), help_donation = c(1, 0, 0, 1, 1, 1), help_volunteer = c(1,
0, 0, 1, 0, 1), help_other = c(1, 1, 0, 1, 0, 1), case_id = structure(c(1311,
97, 548, 2531, 2291, 2064), label = "numerical, unique ID per respondent", format.stata = "%9.0g")), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))
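For reference, the target matrix described above is the cross-product of the 0/1 columns with themselves, so one possible approach is base R's crossprod(). The following is only a minimal sketch, not necessarily the most idiomatic or tidy solution: it rebuilds the eight pol_* columns from the snippet under the hypothetical name `pol`, and it assumes the columns contain only 0 and 1 with no missing values (any NA would carry through into the result).

```r
# Hypothetical toy copy of the eight pol_* columns from the snippet above.
pol <- data.frame(
  pol_demo         = c(1, 0, 0, 1, 0, 1),
  pol_demo_online  = c(1, 0, 0, 1, 0, 1),
  pol_post_online  = c(1, 0, 0, 1, 0, 1),
  pol_petition     = c(1, 0, 0, 1, 1, 1),
  pol_cntct_polit  = c(1, 0, 0, 1, 0, 1),
  pol_party        = c(1, 0, 0, 1, 0, 1),
  pol_other        = c(1, 0, 0, 1, 0, 1),
  pol_demo_illegal = c(0, 0, 0, 1, 0, 1)
)

# Respondents in rows, behaviors in columns.
m <- as.matrix(pol)

# crossprod(m) computes t(m) %*% m. Cell [i, j] sums m[, i] * m[, j]
# over all respondents, which for 0/1 data counts the respondents who
# have a 1 in both column i and column j.
adj <- crossprod(m)

adj
```

With the full data, the same idea would presumably be crossprod(as.matrix(dat[setdiff(names(dat), "case_id")])), which drops case_id before taking the cross-product and gives a 17 x 17 matrix with the behavior names as row and column names.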