I have binary data recording whether each individual passed or failed a test, along with characteristic information (e.g. gender) and the department they belonged to (e.g. x, y, z), in a data frame called data:
head(data,9)
department gender pass
x Male 1
y Female 1
y Male 0
y Male 1
x Female 1
z Female 0
z Male 1
x Male 0
z Female 0
I can easily run a chi-square test on the relationship between gender and passing across the whole data set with:
chisq.test(data$gender, data$pass)
But is there a way to run this test separately for each value of 'department' (x, y, z) without having to manually subset the data each time?
I can create a new dataframe that breaks down the overall pass rate for each department using tapply:
as.data.frame(tapply(data$pass, data$department,mean))
But is there a way I can add a new variable to that data frame holding the result of the test outlined above (let's say the p-value) for each department?
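A minimal sketch of one possible approach, assuming data has the three columns shown above: split the data frame by department, apply chisq.test to each piece, and bind the p-values next to the pass rates. The names p_by_dept and result are illustrative only, and the nine example rows are re-entered by hand so the snippet runs on its own.

```r
# Re-create the nine example rows shown above
data <- data.frame(
  department = c("x", "y", "y", "y", "x", "z", "z", "x", "z"),
  gender     = c("Male", "Female", "Male", "Male", "Female",
                 "Female", "Male", "Male", "Female"),
  pass       = c(1, 1, 0, 1, 1, 0, 1, 0, 0)
)

# Split by department and run the gender-vs-pass test within each group.
# With only a few rows per group, R warns that the chi-squared
# approximation may be incorrect; that is expected for this toy sample.
p_by_dept <- sapply(split(data, data$department), function(d) {
  chisq.test(d$gender, d$pass)$p.value
})

# Combine the per-department pass rate and p-value in one data frame.
# Both split() and tapply() order groups by sorted department value,
# so the two vectors line up.
result <- data.frame(
  department = names(p_by_dept),
  pass_rate  = as.vector(tapply(data$pass, data$department, mean)),
  p_value    = as.vector(p_by_dept)
)

print(result)
```

Note that chisq.test stops with an error if a department contains only one gender or only one outcome, so on real data the call may need to be wrapped in tryCatch to return NA for such groups.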