Structure data for 2x2x2 ANOVA in R

Question

I have the following dataset, of which a subset is:

structure(list(Sex = c("Male", "Male", "Female", "Male", "Male", "Male"
), AgeGroup = structure(c(2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Young", 
"Old"), class = "factor"), FPT_Values_LEFT = c(0.615962446, 
NA, 0.466983199, 0.607728601, 0.534221292, 0.577650726), FPT_Values_RIGHT = c(0.534368277, 
NA, 0.456424206, 0.589604795, 0.564736903, 0.484459132)), row.names = c(NA, 
6L), class = "data.frame")

So:

     Sex AgeGroup FPT_Values_LEFT FPT_Values_RIGHT
1   Male      Old       0.6159624        0.5343683
2   Male      Old              NA               NA
3 Female      Old       0.4669832        0.4564242
4   Male      Old       0.6077286        0.5896048
5   Male      Old       0.5342213        0.5647369
6   Male      Old       0.5776507        0.4844591

I would like to run an ANOVA of AgeGroup*SIDE*value but need to get the dataset to include a side column that corresponds to either FPT_Values_LEFT or FPT_Value_RIGHT.

I have used varitions of melt but its not accurately lining up by side with the values.

DF$SIDE <- c("Left", "Right") 

reshape::melt(id="SIDE")
or
reshape::melt(id=c("SIDE", "AgeGroup"))

reshape::melt(id=c("SIDE", "AgeGroup"), measure.vars = c("FPT_Value_LEFT", "FPT_Value_RIGHT"))

I know I can transpose and append the data to move the right values under left, and copy the sex and age columns to match, then add a side column. I am not sure if there is a more efficient method though.

Expected output:

   Sex AgeGroup FPT_Values         SIDE
1   Male      Old       0.6159624  Left
2   Male      Old              NA  Left
3 Female      Old       0.4669832  Left
4   Male      Old       0.6077286  Left
5   Male      Old       0.5342213  Left
6   Male      Old       0.5776507  Left
1   Male      Old     0.5343683    Right
2   Male      Old            NA    Right
3 Female      Old     0.4564242    Right
4   Male      Old     0.5896048    Right
5   Male      Old     0.5647369    Right
6   Male      Old     0.4844591    Right

I would have thought that you would just want `melt(DF, id = c("Sex", "AgeGroup"))`. If you needed a column with the words "left" and "right", you could extract that from the resulting "variable" column. — A5C1D2H2I1M1N2O1R2T1, Dec 10 '20 at 04:06
@A5C1D2H2I1M1N2O1R2T1 I am not sure what you mean, for the anova I need to ensure it accounts for side — CanyonView, Dec 10 '20 at 04:16

A5C1D2H2I1M1N2O1R2T1 · Accepted Answer · 2020-12-10T04:26:07.437

Using data.table you can do:

> library(data.table)
> melt(as.data.table(DF), id = c("Sex", "AgeGroup"))[, variable := sub("FPT_Values_", "", variable)][]
       Sex AgeGroup variable     value
 1:   Male      Old     LEFT 0.6159624
 2:   Male      Old     LEFT        NA
 3: Female      Old     LEFT 0.4669832
 4:   Male      Old     LEFT 0.6077286
 5:   Male      Old     LEFT 0.5342213
 6:   Male      Old     LEFT 0.5776507
 7:   Male      Old    RIGHT 0.5343683
 8:   Male      Old    RIGHT        NA
 9: Female      Old    RIGHT 0.4564242
10:   Male      Old    RIGHT 0.5896048
11:   Male      Old    RIGHT 0.5647369
12:   Male      Old    RIGHT 0.4844591

If you want to continue using melt from "reshape2" rather than using data.table, you can achieve the same by doing:

transform(reshape2::melt(DF, c("Sex", "AgeGroup")), variable = sub("FPT_Values_", "", variable))

However, I would recommend updating to data.table or converting your code to using tidyr.

If I had two variables will this still work though? like FPT_Values_LEFT and ABC_Values_LEFT and FPT_Values_RIGHT and ABC_Values_RIGHT — CanyonView, Dec 10 '20 at 04:49
Yes. You'd have different options on how you could arrange your data. For instance, you could have the four columns (but twice as many rows) as above, but with a fifth column indicating whether it's an FPT value or an ABC value. Or you can have a six-column output (Sex, AgeGroup, FPT_LR, FPT_Val, ABC_LR, ABC_Val). — A5C1D2H2I1M1N2O1R2T1, Dec 10 '20 at 04:56

score 1 · Answer 2 · answered Dec 10 '20 at 04:20

Using tidyr::pivot_longer :

tidyr::pivot_longer(df, 
                    cols = starts_with('FPT_Values'), 
                    names_to = c('.value', 'SIDE'), 
                    names_pattern = '(.*)_(\\w+)')

#    Sex    AgeGroup SIDE  FPT_Values
#   <chr>  <fct>    <chr>      <dbl>
# 1 Male   Old      LEFT       0.616
# 2 Male   Old      RIGHT      0.534
# 3 Male   Old      LEFT      NA    
# 4 Male   Old      RIGHT     NA    
# 5 Female Old      LEFT       0.467
# 6 Female Old      RIGHT      0.456
# 7 Male   Old      LEFT       0.608
# 8 Male   Old      RIGHT      0.590
# 9 Male   Old      LEFT       0.534
#10 Male   Old      RIGHT      0.565
#11 Male   Old      LEFT       0.578
#12 Male   Old      RIGHT      0.484

Structure data for 2x2x2 ANOVA in R

2 Answers2