My data frame looks like this:
category class test1 test2
1 Yes 5.5 4.2
1 No 5.8 4.3
1 Yes 6.6 3.2
2 Yes 6 7.7
2 No 5.7 5.8
3 No 9.7 4.5
3 Yes 6.8 8.5
2 No 6.3 9.6
3 Yes 8.5 2.6
I want to calculate the mean, SD, and p-value (comparing test1 with test2) for each combination of category and class.
I used dplyr to calculate the mean and SD, but I am struggling to calculate the p-value, as my dataset contains 1000 rows, 4 different categories, and 8 classes.
Here is what I get after using dplyr for the mean and SD:
category class test1_Mean test1_SD test2_Mean test2_SD
1 Yes 6 1 3.7 1.1
1 No 5.8 0 4.3 0
2 Yes 9.6 0 4.4 0
2 No 6 1.1 7.7 1
3 Yes 7.6 0.5 5.5 0.8
3 No 9.7 0 4.5 0
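For reference, the mean and SD step looks roughly like the sketch below. It is a minimal reconstruction rather than my exact code: the data frame name df and the result name summary_df are placeholders, and the data is just the nine sample rows from the top of the question.

```r
library(dplyr)

# Sample rows copied from the table at the top of the question.
# `df` is a placeholder name; substitute the real data frame.
df <- data.frame(
  category = c(1, 1, 1, 2, 2, 3, 3, 2, 3),
  class    = c("Yes", "No", "Yes", "Yes", "No", "No", "Yes", "No", "Yes"),
  test1    = c(5.5, 5.8, 6.6, 6.0, 5.7, 9.7, 6.8, 6.3, 8.5),
  test2    = c(4.2, 4.3, 3.2, 7.7, 5.8, 4.5, 8.5, 9.6, 2.6)
)

# One output row per (category, class) pair, with mean and SD of each test.
summary_df <- df %>%
  group_by(category, class) %>%
  summarise(
    test1_Mean = mean(test1),
    test1_SD   = sd(test1),
    test2_Mean = mean(test2),
    test2_SD   = sd(test2),
    .groups = "drop"  # return an ungrouped result
  )

print(summary_df)
```

Note that sd() returns NA for a group with a single row, so groups like that would need an extra step to show 0 as in the table above.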
The output I want is:
category class test1_Mean test1_SD test2_Mean test2_SD Pvalue
1 Yes 6 1 3.7 1.1 0.05
1 No 5.8 0 4.3 0 0.14
2 Yes 9.6 0 4.4 0 0.69
2 No 6 1.1 7.7 1 0.001
3 Yes 7.6 0.5 5.5 0.8 2.00E-05
3 No 9.7 0 4.5 0 0.04
Thanks in advance.