1

I am having a coding issue when trying to create an interaction plot for fixed-effects (Model 1) two-way ANOVA data. I typed my data into Excel and imported it into RStudio. The data were as follows.

Sex Genotype    Activity
Female  I   2.838
Female  I   4.216
Female  I   2.889
Female  I   4.198
Female  II  3.55
Female  II  4.556
Female  II  3.087
Female  II  1.943
Female  III 3.62
Female  III 3.079
Female  III 3.586
Female  III 1.943
Male    I   1.884
Male    I   2.283
Male    I   2.939
Male    I   1.486
Male    II  2.396
Male    II  2.956
Male    II  3.105
Male    II  2.649
Male    III 2.801
Male    III 3.421
Male    III 2.275
Male    III 2.11

I saved it as an Excel Workbook. Then, in RStudio, I went to File > Import Dataset > Excel, selected my dataset file, and selected First Row As Names and Open Data Viewer. I checked my data, and it was all under the three headers in three columns. Then, to do my ANOVA, I ran:

> Model_1 <- aov(Activity ~ Sex*Genotype, data=data7)
> summary(Model_1)
             Df Sum Sq Mean Sq F value Pr(>F)  
Sex           1  3.527   3.527   6.563 0.0196 *
Genotype      2  0.178   0.089   0.165 0.8488  
Sex:Genotype  2  1.166   0.583   1.085 0.3591  
Residuals    18  9.673   0.537   

Next, I tried to make an interaction plot:

interaction.plot(Genotype, Sex, Activity, fun = mean, type = c("b"),
                 xlab = "Genotype",
                 ylab = "Enzyme Activity (enzyme unit (U)=1µmol min-1)",
                 main = "Interaction Plot")

But, when I do so I get this error:

 Error in tapply(response, list(x.factor, trace.factor), fun) : object 'Genotype' not found 

How do I make Genotype and Sex objects? Shouldn't they already be objects, since they show up in the ANOVA table?

Also, I want to make a table of the mean, SD, and n for each cell, but the approach that worked for a one-way ANOVA did not work for the two-way case. How do you make such a table?

I have tried following examples online, but none of them have helped me get the interaction plot created, and I have not seen any good examples (even on this site) of how to make a table of mean, SD, and n for a two-way ANOVA that work for my situation. Is there a package I can use to help me?

If anyone can help explain what I did wrong and how to make the table I would greatly appreciate it.

N.J
  • Read `?interaction.plot`, specifically the examples. Basically, you need to tell it where the data is; as you will see, your command doesn't mention it. Try this: `with(data7, { interaction.plot(Genotype,Sex,Activity, fun = mean, type= c("b"), xlab= "Genotype" ,ylab = "Enzyme Activity (enzyme unit (U)=1µmol min- 1)",main="Interaction Plot" ) })`. For your table, look up `?summary`. – infominer Nov 15 '17 at 23:18
  • Thank you for your help. I was able to make an interaction plot with: `with(dataHW7, { interaction.plot(Genotype,Sex,Activity, fun = mean, type= c("b"),pch=16, xlab= "Genotype" ,ylab = "Enzyme Activity (enzyme unit (U)=1µmol min- 1)",main="Interaction Plot",ylim=c(0,4) ) })`. I am still having issues getting a table of mean, SD, and n. After looking at `?summary` and this site, I was able to get the mean for each cell with `model.tables(Model_1, type = 'means')`; however, I have not been able to figure out how to get the SD for each cell. – N.J Nov 16 '17 at 15:51
  • Do you have any more advice on how to proceed, or an example I should look at? – N.J Nov 16 '17 at 15:57
  • I am pretty sure I want to use the tapply function, but I have tried `tapply(Activity,Sex,sd)` and `tapply(Activity,Genotype,sd)`, and I tried `with(dataHW7,{tapply(Activity,Sex,sd)})`, but that gave me the SD of the entire row. I want it by cell, not by the entire column or row. – N.J Nov 16 '17 at 17:11

2 Answers

1

There are two ways to do it: using base R functions, or using dplyr.

# nj is your data frame, which I loaded using
nj <- read.table("nj_data.txt", header = TRUE)

# Base R, using built-in functions.
# The dot stands for every column not used for grouping (here only
# Activity); if you have more than one response variable, use their names.
# It is good practice to set na.rm = TRUE to exclude missing values (NA);
# otherwise mean and sd will return NA.
aggregate(. ~ Sex + Genotype, data = nj,
          FUN = function(x) c(Mean = mean(x, na.rm = TRUE),
                              n = length(x),
                              sd = sd(x, na.rm = TRUE)))

# Using dplyr; I suggest this as the syntax is nicer.
# funs() is deprecated in current dplyr, so across() is used here.
library(dplyr)
nj %>%
  group_by(Sex, Genotype) %>%
  summarise(across(everything(),
                   list(mean = ~ mean(.x, na.rm = TRUE),
                        sd   = ~ sd(.x, na.rm = TRUE),
                        n    = ~ n())),
            .groups = "drop")
# Here ".x" in mean and sd refers to the column being summarised,
# equivalent to x in the base R version above.

Read up on the Tidyverse (a collection of packages that make data analysis easier) at https://www.tidyverse.org/

Also look at ggplot; here's a nice primer on interaction plots using ggplot: https://sebastiansauer.github.io/vis_interaction_effects/
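For the plot itself, the error in the question arises because `interaction.plot()` has no `data` argument, so `Genotype` and `Sex` are looked up in the workspace rather than inside the data frame. A minimal base R sketch, assuming the data frame is called `data7` as in the question (it is rebuilt inline here only so the example runs on its own, and the axis label is shortened):

```r
# Rebuild the question's data so the example is self-contained.
data7 <- data.frame(
  Sex      = factor(rep(c("Female", "Male"), each = 12)),
  Genotype = factor(rep(rep(c("I", "II", "III"), each = 4), times = 2)),
  Activity = c(2.838, 4.216, 2.889, 4.198,  3.550, 4.556, 3.087, 1.943,
               3.620, 3.079, 3.586, 1.943,  1.884, 2.283, 2.939, 1.486,
               2.396, 2.956, 3.105, 2.649,  2.801, 3.421, 2.275, 2.110)
)

# aov() finds the columns through data = data7; interaction.plot() cannot,
# so with() evaluates the call using the columns of data7.
with(data7, interaction.plot(
  x.factor = Genotype, trace.factor = Sex, response = Activity,
  fun = mean, type = "b", pch = 16,
  xlab = "Genotype", ylab = "Enzyme Activity (U)",  # shortened label
  main = "Interaction Plot"
))
```

Writing `data7$Genotype`, `data7$Sex`, and `data7$Activity` in the call would work equally well; `with()` just saves the repetition.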

infominer
  • Thank you so much! You explained it so well, and I have looked at the links; they look like they will be useful in the future. I was not aware of why `na.rm=T` was so important. I just want to check that I understand what the `"."` symbolizes. For `aggregate`, the dot stands for the variable/column Activity, so if I had two columns I might not use the dot and instead do `Activity+Surviorship~Sex+Genotype`? – N.J Nov 17 '17 at 16:46
  • You're welcome, glad to know it helped you. In the spirit of learning, I would recommend you try what you typed, and as a hint I would point to the examples in `?aggregate`, especially the one with `cbind` in the formula specification. This is where it starts to get confusing and your future self or others will be left scratching their heads, hence I suggest adopting the dplyr version: its cleaner syntax aids readability. – infominer Nov 17 '17 at 22:33
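The `cbind` form hinted at in the last comment can be sketched as follows. Since the question's data has only one response column, this uses the built-in `mtcars` data set as a stand-in: `mpg` and `hp` play the role of two response columns, `cyl` and `am` the role of the two grouping factors. Those names come from `mtcars`, not from the original data.

```r
# Two response columns on the left-hand side are combined with cbind();
# writing mpg + hp there would instead summarise the sum of the two columns.
two_resp <- aggregate(cbind(mpg, hp) ~ cyl + am, data = mtcars, FUN = mean)
print(two_resp)

# With a dot on the left-hand side, every column not used for grouping
# is summarised.
all_resp <- aggregate(. ~ cyl + am, data = mtcars, FUN = mean)
print(dim(all_resp))
```

The result has one row per cell (combination of the grouping variables) and one column per response.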
0

Here's a base solution, perhaps kludgy but effective:

# Build one label per cell, e.g. "Female_I", then take the SD within each label
data7$group <- paste(data7$Sex, data7$Genotype, sep = "_")

tapply(data7$Activity, data7$group, sd)
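A variant that avoids the helper column: `tapply()` also accepts a list of grouping factors, which returns a Sex by Genotype matrix with one entry per cell. This is a sketch assuming the data frame is `data7` as in the question; it is rebuilt inline only so the example runs on its own.

```r
# Rebuild the question's data so the example is self-contained.
data7 <- data.frame(
  Sex      = rep(c("Female", "Male"), each = 12),
  Genotype = rep(rep(c("I", "II", "III"), each = 4), times = 2),
  Activity = c(2.838, 4.216, 2.889, 4.198,  3.550, 4.556, 3.087, 1.943,
               3.620, 3.079, 3.586, 1.943,  1.884, 2.283, 2.939, 1.486,
               2.396, 2.956, 3.105, 2.649,  2.801, 3.421, 2.275, 2.110)
)

# Passing both factors in a list groups by every Sex/Genotype combination,
# instead of by one factor at a time.
cells <- list(Sex = data7$Sex, Genotype = data7$Genotype)

cell_mean <- tapply(data7$Activity, cells, mean)
cell_sd   <- tapply(data7$Activity, cells, sd)
cell_n    <- tapply(data7$Activity, cells, length)

print(round(cell_mean, 3))
print(round(cell_sd, 3))
print(cell_n)
```

This is why `tapply(Activity, Sex, sd)` in the comments returned the SD of a whole row of the design: with a single grouping factor, the other factor is pooled.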