ANOVA test for three means

Question

I am trying to use an ANOVA test for this question: practice

I've created three vectors and I am wondering how to run ANOVA for all 3. If I write

first = c(30000, 34000, 36000, 38000, 40000)
third = c(30000, 35000, 37000, 38000, 40000)
fifth = c(40000, 41000, 43000, 44000, 50000)
anova(lm(first~third~fifth))

I get an error. To compare all 3, maybe I should have:

anova(lm(first~third+fifth))

When I run this I get an answer but it's not the correct one... F-value should be 6.834 and P-value = 0.01044

I think you'd better read about linear modelling before asking this programming question. You are trying to explain the dependent variable `first` with the predictor variables `third` and `fifth`. This makes no sense: you perform an ANOVA to find out whether price takes different values depending on the year. In other words, you first need to fit a linear model with `price` as the dependent variable and `year` as the predictor variable. - Basti, Nov 22 '21 at 12:23
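As a rough sketch of what this comment describes, the three vectors can be reshaped into one long table with base R's `stack()`. The column names `price` and `year` are taken from the comment and are only illustrative; the question itself does not name them.

```r
first <- c(30000, 34000, 36000, 38000, 40000)
third <- c(30000, 35000, 37000, 38000, 40000)
fifth <- c(40000, 41000, 43000, 44000, 50000)

# stack() turns the three columns into two: the values and a factor
# recording which column each value came from
long <- stack(data.frame(first, third, fifth))
names(long) <- c("price", "year")  # illustrative names, see lead-in

# one response, one grouping factor: this is the one-way ANOVA layout
fit <- lm(price ~ year, data = long)
tab <- anova(fit)
tab
```

With this layout the formula has a single response on the left and a single grouping factor on the right, instead of one group's values being regressed on another's.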

tpetzoldt · Answer 1 · 2021-11-25T22:40:45.047

If you want to compare 3 data sets, you have to organize them in the form of a two-column data frame, e.g. with the dependent variable (y) and the grouping variable (group). Two vectors are also possible, but a data frame has the advantage that you can easily see the relationship. In addition, it is a good idea to encode the grouping variable as a factor. More can be found in statistics and R textbooks.

first <- c(30000, 34000, 36000, 38000, 40000)
third <- c(30000, 35000, 37000, 38000, 40000)
fifth <- c(40000, 41000, 43000, 44000, 50000)

# organize the data and the grouping variable as a data frame
mydata <- data.frame(
  y = c(first, third, fifth),
  group = factor(rep(c("first", "third", "fifth"), each=5))
)

## show structure of the data
mydata

## fit linear model and perform anova
m <- lm(y ~ group, data=mydata)
anova(m)

## don't forget diagnostics
par(mfrow=c(2, 2))
plot(m)

The result of anova(m) is then indeed:

> anova(m)
Analysis of Variance Table

Response: y
          Df    Sum Sq   Mean Sq F value  Pr(>F)  
group      2 203200000 101600000  6.8341 0.01044 *
Residuals 12 178400000  14866667                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
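To see where the numbers in this table come from, the sums of squares can also be computed by hand. The following is only a minimal sketch using base R; the helper names (`groups`, `ss_between`, `ss_within` and so on) are my own and not part of any API. The final `oneway.test()` call, with equal variances assumed, is included as a cross-check.

```r
first <- c(30000, 34000, 36000, 38000, 40000)
third <- c(30000, 35000, 37000, 38000, 40000)
fifth <- c(40000, 41000, 43000, 44000, 50000)

groups <- list(first = first, third = third, fifth = fifth)
y <- unlist(groups)
k <- length(groups)   # number of groups
n <- length(y)        # total number of observations
grand_mean <- mean(y)

# between-group sum of squares: group size times the squared distance
# of each group mean from the grand mean
ss_between <- sum(sapply(groups, function(g) length(g) * (mean(g) - grand_mean)^2))

# within-group (residual) sum of squares
ss_within <- sum(sapply(groups, function(g) sum((g - mean(g))^2)))

# F statistic is the ratio of the two mean squares
f_value <- (ss_between / (k - 1)) / (ss_within / (n - k))
p_value <- pf(f_value, df1 = k - 1, df2 = n - k, lower.tail = FALSE)

c(SS_between = ss_between, SS_within = ss_within, F = f_value, p = p_value)

# cross-check with the classical one-way test (equal variances assumed)
ow <- oneway.test(values ~ ind, data = stack(groups), var.equal = TRUE)
ow
```

The two sums of squares should match the `Sum Sq` column above (203200000 and 178400000), with 2 and 12 degrees of freedom.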
