I'm trying to perform z-tests in R for equality of proportions for data with several groups. Following the bottom-most suggestion on this SO post, I have been attempting to use map2
from purrr
to compare the proportion of cases between males and females separately for each condition and for children and adults. That is, I want to compare the proportion of cases between boys and girls with condition A, between men and women with condition A, between boys and girls with condition B, etc.
In the sample data, cond_total
is the number of cases for each combination of (male x adult x condition), and total
is the total number of people for each combination of (male x adult x condition).
library(tidyverse)
library(broom)
library(magrittr)
# sample data
male <- factor(c(0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1))
adult <- factor(rep(0:1, 10))
condition <- factor(rep(c("A", "B", "C", "D", "E"), each = 4))
cond_total <- c(sample(25:40, 20, replace = T))
total <- c(sample(50:200, 20, replace = T))
df <- data.frame(male, adult, condition, cond_total, total)
When I directly follow the SO example linked above, the function runs, but I'm clearly not correctly incorporating the comparison structure that I want, because I get a single test of proportions for each unique combination of variables:
df %>% mutate(zvalue = map2(cond_total, total, ~ prop.test(.x, .y) %>% tidy())) %>% unnest(zvalue)
The same thing happens if I try to use group_by
to indicate that each (adult x condition) group should be compared between the two gender groups.
df %>% group_by(male) %>% mutate(zvalue = map2(cond_total, total, ~ prop.test(.x, .y) %>% tidy())) %>% unnest(zvalue)
If I try to use nest()
to introduce the grouping structure (which I think I'm doing correctly, but perhaps not), I get two different error messages. On my home computer, I get "Error: column zvalue must be length 10 (the number of rows) or one, not 20." On my work computer, I get "Error: object cond_total not found."
df %>% nest(-c(adult, condition)) %>% mutate(zvalue = map2(cond_total, total, ~ prop.test(.x, .y) %>% tidy())) %>% unnest(zvalue)
I have also looked at this SO post and am trying to figure out if my problem is exactly the same - the complexity of the solution described there was very surprising to me, and I'm not sure if it's what I need.
Session info from my home computer:
R version 3.6.1 (2019-07-05) Platform: x86_64-apple-darwin15.6.0 (64-bit) Running under: macOS 10.16
Session info from my work computer:
R version 4.0.2 (2020-06-22) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 10 x64 (build 17763)