
I have the following dataset, on which I would like to perform a chi-squared test, because I am curious whether there is a significant difference between the numbers of males and females across the different genotypes. I've tried several solutions (I'm not including them here because they were written for a different layout of this dataset), but none of them worked. I was advised to enter my data as follows.

   genotype males females
1    blm_wt    14       6
2   blm_het    33      11
3  blm_wt_2     9       7
4 blm_het_2    36      10

My code looks like this right now:

library(tidyverse)
library(ggplot2)
library(ggpubr)
library(ggsci)
library(ggthemes)
library(ggExtra)
blm_sex_chi = read.csv("D:/Krisztiann/Antropologia/R_programozas/zebrahal/blm_sex_ratios_table_chi.csv", header = TRUE) 
# read the CSV file, which contains the table shown above
chi.res = chisq.test(blm_sex_chi)

I'm getting this error message: Error in chisq.test(blm_sex_chi) : all entries of 'x' must be nonnegative and finite.

I don't know why. My dataset looks like the one used in this video (https://www.youtube.com/watch?v=POiHEJqmiC0), but it does not work. Maybe the problem is that mine was not converted to a contingency table. I tried to do that, but the function just messed up the table.

I'm a complete beginner in R. Do you have any suggestions for this problem?

Thanks in advance!

selender14
  • Maybe chi.res = chisq.test(blm_sex_chi[ ,2:3]) – Dave2e Jan 17 '22 at 21:57
  • Thanks for the answer! Yes, something works with this, but I'm not sure what was calculated in the background. I think it tests the whole dataset at once, and for me comparing the genotypes with each other might be the ideal calculation. I tried to compare blm_wt and blm_het in an online calculator (https://www.socscistatistics.com/tests/chisquare2/default2.aspx) and it worked, but I don't know how to do that in R. I've tried chi.res = chisq.test(blm_sex_chi[1,2:3], blm_sex_chi[2,2:3]) and I got "Error in complete.cases(x, y) : not all arguments have the same length" – selender14 Jan 17 '22 at 22:16
  • I think you need to clarify what your statistical hypotheses are. Do you want each line compared, or the entire table? If it is each line independently, then that is not a chi-square test. – Dave2e Jan 17 '22 at 23:34
  • Yes, I think you are right. I will try both approaches! – selender14 Jan 18 '22 at 10:45
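
For the pairwise comparison mentioned in the comments, one possible approach is to pass chisq.test() a single 2 x 2 table built from the two rows of interest. The call with two arguments fails because, when y is supplied, chisq.test() treats x and y as two factor vectors to cross-tabulate rather than as two rows of counts. The sketch below is only an illustration under that reading of the question: the data frame is rebuilt from the table above, wt_vs_het and pair_res are made-up names, and whether a pairwise test matches the intended hypothesis is for the asker to decide.

```r
# Counts copied from the table in the question
blm_sex_chi <- data.frame(
  genotype = c("blm_wt", "blm_het", "blm_wt_2", "blm_het_2"),
  males    = c(14, 33, 9, 36),
  females  = c(6, 11, 7, 10)
)

# Keep rows 1 and 2 (blm_wt, blm_het) and only the count columns: a 2 x 2 table
wt_vs_het <- as.matrix(blm_sex_chi[1:2, c("males", "females")])
rownames(wt_vs_het) <- blm_sex_chi$genotype[1:2]

# For 2 x 2 tables chisq.test() applies Yates' continuity correction by default;
# correct = FALSE requests the uncorrected Pearson statistic instead
pair_res <- chisq.test(wt_vs_het, correct = FALSE)
pair_res
```

If several such pairs are tested, the p-values may need a multiple-comparison adjustment, for example with p.adjust().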

1 Answer


Convert your data frame into a contingency table, so that it contains only the counts.

To do this, drop the existing row names (1, 2, 3, 4) with as_tibble() and then turn the genotype column into the row names:

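library(dplyr)
library(tibble)
# df is the data frame read from the CSV (blm_sex_chi in the question)
df1 <- df %>% 
  as_tibble() %>%                  # drops the default row names
  column_to_rownames("genotype")   # genotype labels become row names, only counts remain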

chisq <- chisq.test(df1)
chisq
    Pearson's Chi-squared test

data:  df1
X-squared = 3.1052, df = 3, p-value = 0.3757
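
The same table can also be built in base R without dplyr or tibble, which makes it easier to see what chisq.test() receives. This is a minimal sketch rather than a replacement for the approach above: the counts are typed in from the question, and counts and res are illustrative names.

```r
# Counts copied from the table in the question, one row per genotype
counts <- matrix(
  c(14,  6,
    33, 11,
     9,  7,
    36, 10),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    genotype = c("blm_wt", "blm_het", "blm_wt_2", "blm_het_2"),
    sex      = c("males", "females")
  )
)

# Test of independence between genotype and sex over the whole 4 x 2 table
res <- chisq.test(counts)

res$statistic  # X-squared
res$parameter  # degrees of freedom: (4 - 1) * (2 - 1) = 3
res$p.value
res$expected   # expected counts under independence
```

Looking at res$expected is worthwhile here: the expected number of blm_wt_2 females is below 5, so expect R to warn that the chi-squared approximation may be incorrect.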
TarJae