I have a data frame like this:
df <- structure(list(ID = c(243, 292, 317, 388, 398, 404, 463, 473,
842, 844, 858, 862, 869, 871, 879, 888), Zone = c(1, 1, 1, 1,
1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2), Gen = c("Male", "Male",
"Other Gender Identity", "Male", "Male", "Male", "Male", "Female",
"Female", "Male", "Female", "Male", "Male", "Male", "Male", "Female"
), Month_Inc = c("< $1,500", "< $1,500", "< $1,500", "$1,500 - $1,999",
"$1,500 - $1,999", "< $1,500", "< $1,500", "< $1,500", "$1,500 - $1,999",
"$2,000 - $2,499", "$1,500 - $1,999", "< $1,500", "$2,500 - $2,999",
"< $1,500", "< $1,500", "< $1,500")), row.names = c(NA, -16L), class = c("tbl_df",
"tbl", "data.frame"))
What I need to do is test whether there is a statistically significant difference in the percentage of females between the two zones. I need to test this for income level too.
I need to do a t-test for Gen ~ Zone:
H0: %female = %male for the two zones
H1: %female != %male for the two zones
Similarly for Month_Inc ~ Zone.
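To make the comparison concrete, here is a minimal sketch of the per-zone proportions in question. It rebuilds only the Zone and Gen columns from the data above so that it runs on its own; the names `counts`, `props`, and `pct_female` are illustrative and not part of the original data.

```r
# Same Zone and Gen values as in the data above, rebuilt so the snippet is self-contained
df <- data.frame(
  Zone = rep(c(1, 2), each = 8),
  Gen = c("Male", "Male", "Other Gender Identity", "Male",
          "Male", "Male", "Male", "Female",
          "Female", "Male", "Female", "Male",
          "Male", "Male", "Male", "Female")
)

# Count of each gender within each zone (rows = Zone, columns = Gen)
counts <- table(df$Zone, df$Gen)

# Row-wise proportions: share of each gender within a zone
props <- prop.table(counts, margin = 1)
print(props)

# Share of females per zone, which is the quantity to be compared
pct_female <- props[, "Female"]
print(pct_female)
```

The same pattern should apply to Month_Inc by swapping the column passed to `table()`.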
I tried the following code:
t.test(Gen ~ Zone, mu = 0, alt = "two.sided",
       conf = 0.95, paired = FALSE, var.equal = FALSE,
       data = df)
However, this does not work. How do I correct it? I suspect it is a data type issue, since Gen is a character column, but I can't be certain.
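A minimal sketch of the data type idea, assuming the obstacle is that `t.test()` needs a numeric response and Gen is character. The `is_female` column is a hypothetical helper, not part of the original data. Fisher's exact test is included only as a possible alternative for small counts; it is a different test, not a t-test.

```r
# Same Zone and Gen values as in the data above, rebuilt so the snippet is self-contained
df <- data.frame(
  Zone = rep(c(1, 2), each = 8),
  Gen = c("Male", "Male", "Other Gender Identity", "Male",
          "Male", "Male", "Male", "Female",
          "Female", "Male", "Female", "Male",
          "Male", "Male", "Male", "Female")
)

# Hypothetical helper column: 1 for "Female", 0 for every other value of Gen
df$is_female <- as.numeric(df$Gen == "Female")

# Welch two-sample t-test on the 0/1 indicator; the group means are the
# proportions of females in each zone
tt <- t.test(is_female ~ Zone, data = df,
             alternative = "two.sided", conf.level = 0.95,
             var.equal = FALSE)
print(tt)

# Possible alternative for small samples: Fisher's exact test on the
# 2x2 table of Zone by is_female
ft <- fisher.test(table(df$Zone, df$is_female))
print(ft)
```

For Month_Inc the same recoding would need a choice of which income bracket (or grouping of brackets) counts as 1, which is an assumption to be made explicitly.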
Thanks for your help!