-1

I am trying to run a 3-way ANOVA in R, but the values of my response variable are stored as comma-separated strings in a single column rather than one value per row. Currently, my data frame looks something like this:

Season  Site  Location  Replicate  Lengths
Jan_16  MI    Adj       1.00       ,
Jan_16  MI    Adj       2.00       ,
Jan_16  MI    Adj       3.00       ,
Jan_16  MI    Away      1.00       3,4,
Jan_16  MI    Away      2.00       ,
Jan_16  MI    Away      3.00       ,
Jan_16  MP    Adj       1.00       4,5,6,5,4,5,4,4,4,4,5,4,6,4,
Jan_16  MP    Adj       2.00       4,4,3,3,5,4,3,4,5,3,4,3,4,3,4,6,
Jan_16  MP    Adj       3.00       4,6,5,5,4,
Jan_16  MP    Away      1.00       ,4,4,10,4,5,4,6,5,5,
Jan_16  MP    Away      2.00       3,4,4,4,5,5,4,5,
Jan_16  MP    Away      3.00       4,4,13,4,

Lengths is the response variable that I wish to run the ANOVA on. How would I do this? A lone "," means there is no data.

**** EDIT

I have tried separate rows

library(tidyr)

separate_rows(data.frame, Season:Replicate, Lengths, convert=numeric )


#Error: All nested columns must have the same number of elements

The Lengths entries contain different numbers of values, so is there a way to unnest this?

miken32
  • Think through what you need your data to look like in order to do ANOVA. I'm guessing you want to split the items in `Lengths` so that each row will have a single value. `tidyr::separate_rows` is one function that can do this. It would be best if you can work out how you get started writing code and add that to the post – camille Aug 14 '18 at 15:43

2 Answers

0

Unnesting the data was the best solution to the problem.

Running the code:

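library(dplyr)
library(tidyr)   # provides unnest()

# Unnest everything so that each comma-separated value gets its own row
# (as.character() guards against Lengths having been read in as a factor)

data.frame.new <- data.frame %>%
  transform(Lengths = strsplit(as.character(Lengths), ",")) %>%
  unnest(Lengths)

# Gets rid of blanks where there are no data

Set.unnest <- subset(data.frame.new, Lengths != "")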

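This repeats the Season, Site, Location, and Replicate values on a separate row for each data point in Lengths. Note that Lengths is still a character column at this point, so it needs to be converted with as.numeric() before running the ANOVA.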
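For comparison, `tidyr::separate_rows` (the function tried in the question) can do the same reshaping when only the `Lengths` column is passed to it. The error in the question most likely arises because `Season:Replicate` were also listed as columns to split, and those hold a single value per row. Note also that `convert` expects `TRUE` or `FALSE`, not `numeric`. A minimal sketch on a small made-up data frame (`toy`, `long`, and the values in them are illustrative, not the asker's data):

```r
library(tidyr)

# Hypothetical toy data in the same shape as the question's data frame
toy <- data.frame(
  Site     = c("MI", "MP", "MP"),
  Location = c("Away", "Adj", "Away"),
  Lengths  = c("3,4,", ",", ",4,10,"),
  stringsAsFactors = FALSE
)

# Split only Lengths; the other columns are repeated for each value
long <- separate_rows(toy, Lengths, sep = ",")

# Empty strings (from leading/trailing commas) become NA and are dropped
long$Lengths <- as.numeric(long$Lengths)
long <- long[!is.na(long$Lengths), ]
```

Rows that contained only "," disappear entirely, so a Site/Location combination with no measurements contributes no rows to the long data.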

0

It's not clear from your question what your independent variables are. In the following example, I assume Site, Location, Replicate are your IVs.

Let's first split entries in Lengths into different rows, and remove rows with missing/no Lengths.

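library(tidyverse)
df.aov <- df %>%
    # split each comma-separated string into a vector of strings
    mutate(Lengths = str_split(Lengths, ",")) %>%
    # one row per value
    unnest(Lengths) %>%
    # empty strings become NA; aov needs a numeric response
    mutate(Lengths = as.numeric(Lengths)) %>%
    filter(!is.na(Lengths))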

We can now perform a 3-way ANOVA with aov:

res <- aov(Lengths ~ Site * Location * Replicate, data = df.aov)
res
#Call:
#   aov(formula = Lengths ~ Site * Location * Replicate, data = df.aov)
#
#Terms:
#                     Site  Location Replicate Location:Replicate Residuals
#Sum of Squares    2.21675   7.61905   0.11491            0.89526 131.58506
#Deg. of Freedom         1         1         1                  1        53
#
#Residual standard error: 1.57567
#3 out of 8 effects not estimable
#Estimated effects may be unbalanced

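Note that these results are not very sensible: with this small sample, several Site/Location/Replicate combinations have no data, so 3 of the 8 effects cannot be estimated. I assume that your actual dataset is larger.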
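Two details of the output above may be worth a closer look. Printing an `aov` object shows only sums of squares and degrees of freedom, whereas `summary()` adds the F statistics and p-values. Also, Replicate has 1 degree of freedom there, which means it was treated as a numeric covariate; whether it should instead be a grouping factor depends on the design. A minimal sketch on made-up, balanced data (the `toy` data frame, its random values, and the choice to make Replicate a factor are all assumptions for illustration):

```r
# Hypothetical balanced toy data: 2 sites x 2 locations x 3 replicates,
# with 4 observations in each cell
set.seed(1)
toy <- expand.grid(
  Site      = c("MI", "MP"),
  Location  = c("Adj", "Away"),
  Replicate = 1:3,
  obs       = 1:4
)
toy$Lengths <- rnorm(nrow(toy), mean = 4.5, sd = 1)  # made-up response values

# Treat Replicate as a grouping factor rather than a numeric covariate
toy$Replicate <- factor(toy$Replicate)

res_toy <- aov(Lengths ~ Site * Location * Replicate, data = toy)

# summary() gives the full ANOVA table with F statistics and p-values
tab <- summary(res_toy)
tab
```

With every cell populated, all seven main effects and interactions are estimable, which is not the case for the small sample in the question.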

Maurits Evers