I am trying to find the weighted cross tabulations of several columns. My actual data has over 500 columns so I am trying to automate the process as much as possible.
This is a snippet of my data:
data_in <- read_table2("Q50_1 Q50_2 Q38 Q90 pov gender wgt
1 3 Yes 2 High M 1.3
2 4 No 2 Med F 0.4
4 2 Yes 4 Low F 1.2
3 3 No 2 High M 0.5
1 2 No 4 High M 0.7
2 2 Yes 3 Low F 0.56
4 4 Yes 2 Med F 0.9
")
What I have tried:
I could use the library(pollster) package to find the weighted frequencies one at a time like this. I'm using pivot longer here because I need each of these to look similar so that I rbind them into a single dataset.
crosstab(df = data_in, pov, Q50_1, wgt,format = "long") %>% pivot_longer(cols = starts_with("Q"))
crosstab(df = data_in, pov, Q38, wgt, format = "long") %>% pivot_longer(cols = starts_with("Q"))
crosstab(df = data_in, gender, Q50_1, wgt,format = "long")%>% pivot_longer(cols = starts_with("Q"))
crosstab(df = data_in, gender, Q50_2, wgt,format = "long")%>% pivot_longer(cols = starts_with("Q"))
This is my desired output:
data_out <- read_table2("group pct n question response group_level
High 80 2.5 Q50_1 1 pov
High 20 2.5 Q50_1 3 pov
Low 31.8 1.76 Q50_1 2 pov
Low 68.2 1.76 Q50_1 4 pov
Med 30.8 1.3 Q50_1 2 pov
Med 69.2 1.3 Q50_1 4 pov
High 48 2.5 Q38 No pov
High 52 2.5 Q38 Yes pov
Low 100 1.76 Q38 Yes pov
Med 30.8 1.3 Q38 No pov
Med 69.2 1.3 Q38 Yes pov
F 31.4 3.06 Q50_1 2 gender
F 68.6 3.06 Q50_1 4 gender
M 80 2.5 Q50_1 1 gender
M 20 2.5 Q50_1 3 gender
")
More of what I have tried:
I had the idea of creating some kind of function, that loops through two vectors. The first vector contains each of the X values that will go into cross tabs and the second will contain each of the Y values that will feed into cross tabs. This is what I have so far. I have been tweaking this for some time now to no avail.
vect <- c("gender", "pov")
vect2 <- c("Q50_1","Q38")
func2 <- function(.data, vector, vector2) {
.data <- data_in
wgt <- .data$wgt
for (i in 1:length(vector))
for (j in 1:length(vector2)) {
out <- crosstab(df = .data, vector[i], vector2[j], wgt,format = "long")
}
out
}
func2(vect, vect2)
Help with:
- Is this a good approach to achieve my desired output? If so, why do I keep getting an error? :(
- I would like to add the functionality to this so that I create single data set with all my crosstabs (as shown in the desired output).
Any suggestions appreciated! Thank you for your help!!