I'm using the "weightloss" dataset from the datarium package to run a repeated-measures ANOVA (RMANOVA). Here is the dput output for the first six rows:
dput(head(weightloss))
structure(list(id = structure(1:6, .Label = c("1", "2", "3",
"4", "5", "6", "7", "8", "9", "10", "11", "12"), class = "factor"),
diet = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("no",
"yes"), class = "factor"), exercises = structure(c(1L, 1L,
1L, 1L, 1L, 1L), .Label = c("no", "yes"), class = "factor"),
t1 = c(10.43, 11.59, 11.35, 11.12, 9.5, 9.5), t2 = c(13.21,
10.66, 11.12, 9.5, 9.73, 12.74), t3 = c(11.59, 13.21, 11.35,
11.12, 12.28, 10.43)), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
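This is the script I have come up with so far:
# Load Packages (tidyverse for pipes/plots, rstatix for tests, ggpubr for QQ plots):
library(tidyverse)
library(rstatix)
library(ggpubr)
library(datarium)
# Create Data Frame for Dataset:
weight <- weightloss
weight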
# Pivot Longer Data to Create Factors and Scores:
weight <- weight %>%
  pivot_longer(names_to = 'trial',   # creates factor (x)
               values_to = 'value',  # creates value (y)
               cols = t1:t3)         # finds which cols to factor
# Plot Means in Boxplot:
ggplot(weight,
       aes(x = trial, y = value)) +
  geom_boxplot() +
  labs(title = "Trial Means") # As can be predicted, inc w/time
I get this pretty normal-looking boxplot:
Now it's time to find outliers and test for normality.
# Identify Outliers (Should be None Given Boxplot):
outlier <- weight %>%
  group_by(trial) %>%
  identify_outliers(value)
outlier_frame <- data.frame(outlier)
outlier_frame # none found :)
# Normality (Shapiro-Wilk and QQPlot):
model <- lm(value ~ trial,
            data = weight) # creates model
shapiro_test(residuals(model)) # measures Shapiro
ggqqplot(residuals(model)) +
  labs(title = "QQ Plot of Residuals") # creates QQ
This again gives me a pretty normal-looking QQ plot:
I then faceted the QQ plot by trial:
ggqqplot(weight, "value", ggtheme = theme_bw()) +
  facet_wrap(~trial) +
  labs(title = "QQPlot of Each Trial") # looks normal
And it comes out right from what I can tell:
However, when I try to run a Shapiro-Wilk test by group, I keep running into problems with this code:
shapiro_group <- weight %>%
  group_by(trial) %>%
  shapiro_test(value)
It gives me this error:
Error: Problem with `mutate()` column `data`.
i `data = map(.data$data, .f, ...)`.
x Must group by variables found in `.data`.
- Column `variable` is not found.
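One possible lead, offered as an assumption rather than a confirmed diagnosis: the error complains about a missing column called `variable`, which my data never had, so it may come from rstatix reshaping the data internally into columns named `variable` and `value`. If so, my own column named `value` could be colliding with that step. A minimal sketch of the same test with the pivoted column renamed (the name `score` is an arbitrary choice for illustration):

```r
library(tidyverse)
library(rstatix)

# Load the example data directly from datarium
data("weightloss", package = "datarium")

# Same pivot as above, but store the measurements in `score` instead of
# `value` (hypothetical name; the point is only to avoid `value`/`variable`)
weight_long <- weightloss %>%
  pivot_longer(cols = t1:t3,
               names_to = "trial",
               values_to = "score")

# Shapiro-Wilk test per trial
shapiro_by_trial <- weight_long %>%
  group_by(trial) %>%
  shapiro_test(score)

shapiro_by_trial
```

If this runs cleanly, the column name was the problem rather than the grouping itself.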
I also tried this:
shapiro_test(weight, trial$value)
And get this error instead:
Error: Can't subset columns that don't exist.
x Column `trial$value` doesn't exist.
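This second error suggests that shapiro_test() looks for bare column names inside the data frame, so `trial$value` gets treated as a (nonexistent) column name. As a cross-check that does not depend on rstatix at all, here is a sketch using base R's shapiro.test() on the original wide columns, where each of t1, t2 and t3 is already one trial:

```r
# Load the example data directly from datarium
data("weightloss", package = "datarium")

# Run base R's Shapiro-Wilk test on each trial column of the wide data
shapiro_wide <- lapply(weightloss[c("t1", "t2", "t3")], shapiro.test)

# Collect the p-values into a named numeric vector (one entry per trial)
p_values <- sapply(shapiro_wide, function(res) res$p.value)
p_values
```

The per-trial results here should agree with whatever a working grouped shapiro_test() call returns, which makes it a handy sanity check.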
If anybody has some insight as to why, I would greatly appreciate it!