I have a dataset on which I would like to run a significance test comparing prices across years. A sample of the dataset is as follows:

df <- structure(
  list(
    Index = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16),
    Year  = c(1990, 1990, 1990, 1991, 1991, 1990, 1990, 1991, 1991, 1992,
              1992, 1990, 1990, 1991, 1991, 1992),
    Pet   = c("Fish", "Fish", "Fish", "Fish", "Fish", "Cat", "Cat", "Cat",
              "Cat", "Cat", "Cat", "Dog", "Dog", "Dog", "Dog", "Dog"),
    Price = c(0.5, 0.55, 0.6, 0.65, 0.7, 5, 6, 7, 8, 8, 9, 6, 6.5, 8, 8, 10)
  ),
  class = c("tbl_df", "tbl", "data.frame"),
  row.names = c(NA, -16L)
)

I am currently using the summarise function in dplyr to get the average price per pet and year. I would also like to run a significance test across the years for each pet at the same time: a t-test when a pet has 2 years of data and an ANOVA when it has 3 or more.

Ideally the output would be the following:

Pet    1990   1991    1992   P-value from significance test
Cat    5.5    7.5     8.5    xx (ANOVA)
Dog    6.25   8       10     xx (ANOVA)
Fish   0.55   0.675   NA     xx (t-test)

This is my current code. I'm not sure how to add the significance test column:

library(dplyr)
library(tidyr)

# Mean price per pet and year, reshaped to one column per year
df %>% group_by(Year, Pet) %>%
       summarise(price = mean(Price), .groups = "drop") %>% 
       pivot_wider(names_from = Year, values_from = price)

I appreciate your help. Thanks in advance!

Luther_Proton
  • Have you seen `rstatix`? https://cran.r-project.org/web/packages/rstatix/index.html They have a tidyverse friendly t-test. – NicChr May 10 '23 at 15:13
  • You have the mean right now but not the test. First, you need to figure out how to do the test you want. Second, you need to figure out how to add them. I'm guessing you're trying to do `lm()` or `anova()`. – socialscientist May 12 '23 at 00:48
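The two steps suggested in the comments (first compute the test per pet, then attach it to the table of means) can be sketched in base R plus dplyr/tidyr as follows. This is only one possible approach, not a definitive answer. The helper name `year_p_value` is hypothetical, and the sketch assumes a Welch t-test (the default of `t.test()`) for two years and a one-way ANOVA via `aov()` for three or more. The data frame repeats the sample from the question, minus the Index column.

```r
library(dplyr)
library(tidyr)

# Sample data from the question (Index column omitted)
df <- data.frame(
  Year  = c(1990, 1990, 1990, 1991, 1991, 1990, 1990, 1991, 1991, 1992,
            1992, 1990, 1990, 1991, 1991, 1992),
  Pet   = rep(c("Fish", "Cat", "Dog"), times = c(5, 6, 5)),
  Price = c(0.5, 0.55, 0.6, 0.65, 0.7, 5, 6, 7, 8, 8, 9, 6, 6.5, 8, 8, 10)
)

# Hypothetical helper: p-value for a difference in price between years,
# computed for the rows of a single pet.
year_p_value <- function(price, year) {
  year <- factor(year)
  if (nlevels(year) < 2) {
    return(NA_real_)                      # nothing to compare
  }
  if (nlevels(year) == 2) {
    t.test(price ~ year)$p.value          # Welch two-sample t-test
  } else {
    # One-way ANOVA; first row of the summary table is the Year effect
    summary(aov(price ~ year))[[1]][["Pr(>F)"]][1]
  }
}

# Step 1: one p-value per pet
p_values <- df %>%
  group_by(Pet) %>%
  summarise(p_value = year_p_value(Price, Year), .groups = "drop")

# Step 2: mean price per pet and year, one column per year
means <- df %>%
  group_by(Pet, Year) %>%
  summarise(price = mean(Price), .groups = "drop") %>%
  pivot_wider(names_from = Year, values_from = price)

# Step 3: attach the p-values to the wide table
result <- left_join(means, p_values, by = "Pet")
print(result)
```

Note that `t.test()` stops with an error if either year has fewer than two observations, so on real data the helper may need a guard (for example `tryCatch()`) for sparse groups. With groups this small, the p-values should also be read with caution.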
