1

I am analysing data for an observational study on trout growth. So there are plenty of NAs in the dataframe. Also, different treatments have different numbers of observations. So I think this can be called an unbalanced design?

Year    Site    FishID  Size    Cover   AGE L1
2010    LT1 10_LT1 _ 11 Large   Heavy   2   5.88
2010    LT3 10_LT3 _ 14 Large   Heavy   2   5.228571429
2010    SO4 10_SO4 _ 8  Small   Open    0   NA
2010    SO5 10_SO5 _ 22 Small   Open    0   NA
2011    LT1 11_LT1 _ 14 Large   Heavy   1   6.44
2011    LT1 11_LT1 _ 15 Large   Heavy   1   6.25
2011    LT1 11_LT1 _ 16 Large   Heavy   1   6.421052632
2011    LT1 11_LT1 _ 18 Large   Heavy   1   7.74
2011    SO5 11_SO5 _ 6  Small   Open    1   7.7625
2011    SO5 11_SO5 _ 8  Small   Open    1   6.914285714
2011    SO5 11_SO5 _ 13 Small   Open    1   6.5
2011    SO5 11_SO5 _ 16 Small   Open    1   7.2
2011    ST1 11_ST1 _ 21 Small   Heavy   0   NA
2011    ST2 11_ST2 _ 10 Small   Heavy   0   NA
2011    ST2 11_ST2 _ 5  Small   Heavy   0   NA
2011    ST3 11_ST3 _ 20 Small   Heavy   0   NA
2011    ST4 11_ST4 _ 5  Small   Heavy   0   NA
2011    ST1 11_ST1 _ 9  Small   Heavy   1   7.521428571
2011    ST1 11_ST1 _ 17 Small   Heavy   1   8.169230769
2011    ST1 11_ST1 _ 20 Small   Heavy   1   7.03125

My fixed effects are stream size: 2 levels stream cover: 2 levels, Year: 2 levels and age: 3 levels.

I also ramdomly chose a bunch of sites to examine stream size and stream cover effects. So I presume site is classed as a random effect?

L1 is a proxy for growth rate.

At this stage my maximal model looks like this:

m1=lme(L1~Year*Age*Size*Cover, random=~1|Site ,data=Trout_Growth,method="ML",na.action=na.exclude)

qqnorm(residuals(m1))
qqline(residuals(m1)) # Normality OK

plot(density(residuals(m1),na.rm=T))   ***## heteroscedasticity present (lme should handle this??) ##***

summary(m1)
anova(m1,test=T,type="marginal")    ***## marginal used because of unbalanced design?? ##***

I have of course had a good look at this website which is great but I am still unsure of correct notation for inclusion of the random factor Site. (lme and lmer seem to use different notation. Is this correct?)

I Suppose I am asking if this is the correct starting point to refine my model. I am starting to really enjoy R but without regular thumbs up regarding the R code, the temptation is to slip back to comfy windows based stats software.

Any opinions or advice would be greatly appreciated.

Diarm

Zheyuan Li
  • 71,365
  • 17
  • 180
  • 248
Diarmuid Ryan
  • 71
  • 1
  • 5
  • 2
    While you may get help without it, it's better if you have an example that will run. That takes sample data. It is common and appropriate to use a built-in data set. – Matthew Lundberg Apr 23 '13 at 13:44
  • @ Matthew. Thanks for the advice. Not quite sure how to do this so I will cut and paste a few sample rows of my dataframe. If this is bad practice, please let me know. – Diarmuid Ryan Apr 23 '13 at 14:02
  • 1
    It certainly is a good starting point, but a risky one, because depending on the data getting all interaction will result in a somewhat unreadable summary, or even lme will refuse to work because of to few data. Try to replace all * with + first, and then slowly add pairwise interaction that make sense with :, not with *. When you decide that no longer can interpret the contrast matrix in word, you are probably close to what makes sense. – Dieter Menne Apr 23 '13 at 14:59
  • Thanks for the input.I am just thinking that Site may be nested in Size and Cover. Meaning each individual site in the study is associated with only a particular level of each treatment. I am trying to work out the structure but most examples seem to work with hierarchical designs. Any thoughts on the nesting structure? – Diarmuid Ryan Apr 25 '13 at 10:11
  • Perhaps you can describe the experimental design somewhat better? are streams identical to sites? Also describing what you want to get to know may be helpfull: are you realy interested in the effect of year or rather across years? As a random effect i think i would use something like 'random=~1|Site/FishID' – thijs van den bergh Dec 13 '13 at 16:47

0 Answers0