
I have a set of polynomial equations that I have developed to examine the relationship between mercury concentrations in fish by year while taking into account the effect fish length has on mercury concentrations. The years examined are grouped into 1996/97, 2006/7, 2009/10, 2017/18 and have been treated as factors within the equation (i.e. dummy variables where 1 = present, 0 = absent when solving for the year). The model is:

lm(formula = mercury ~ length + (length)^2 + years + (years * length) + (years * ((length)^2)))

I am looking for a way to extract the formula with the coefficients to easily estimate mercury concentrations for each year while using a standardized length value. I have more than 20 polynomial equations to do this for, so I am looking for an easily reproducible approach rather than manually typing out each model and indexing the coefficients.

Does anyone know of an alternative way to extract the polynomial equation WITH the coefficients produced by the model so that I can easily plug in a standardized value for length and produce a mercury concentration for each year in the equation?

I have created the polynomial equations and have solved some of the equations with only two years of data (i.e. only two dummy variables) manually and know that the equations are correct, but am not sure how to do this efficiently for the sets of data that have 3 or 4 year factors.
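One way to get the fitted equation as text is to paste each value from `coef()` next to its term name. The following is only a minimal sketch: the data are simulated stand-ins (the real data set is not shown here), the helper name `equation_string` is hypothetical, and the squared term is written as `I(length^2)` so that it is actually fitted.

```r
# Simulated stand-in data; the real mercury/length/years data are not shown.
set.seed(1)
fish <- data.frame(
  length = runif(60, 8, 20),
  years  = factor(rep(c("1996/97", "2006/07", "2009/10"), each = 20))
)
fish$mercury <- 0.05 * fish$length + 0.002 * fish$length^2 +
  0.1 * as.integer(fish$years) + rnorm(60, sd = 0.05)

# Quadratic in length, with separate coefficients for each year level.
m <- lm(mercury ~ (length + I(length^2)) * years, data = fish)

# Hypothetical helper: write each coefficient next to the term it multiplies.
equation_string <- function(model, digits = 4) {
  b <- coef(model)
  term <- ifelse(names(b) == "(Intercept)", "", paste0(" * ", names(b)))
  rhs <- paste0(signif(b, digits), term, collapse = " + ")
  paste(all.vars(formula(model))[1], "=", rhs)
}

cat(equation_string(m), "\n")
```

Each year-specific term (for example `years2006/07`) is a 0/1 dummy, so the equation for a given year keeps only the terms whose dummy equals 1.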

    m1<-lm(formula = mercury ~ length + (length)^2 + years + (years * length) 
    + (years * ((length)^2)))
    summary(m1)

    Call:
    lm(formula = mercury ~ length + (length)^2 + years + (years * length) + 
    (years * ((length)^2)))

    Residuals:
         Min       1Q   Median       3Q      Max 
    -0.22277 -0.08006 -0.00239  0.05862  0.39678 

    Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
    (Intercept)      -0.92197    0.04442 -20.755  < 2e-16
    length            1.09499    0.30794   3.556 0.001197
    2006/2007         0.20758    0.05128   4.048 0.000306
    length:2006/2007 -0.75779    0.39036  -1.941 0.061081
    ---

    Residual standard error: 0.1331 on 32 degrees of freedom
    Multiple R-squared: 0.48, Adjusted R-squared: 0.4313
    F-statistic: 9.847 on 3 and 32 DF, p-value: 9.425e-05
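The summary lists only four coefficients and no squared term. A likely explanation is that inside an R formula `^` means factor crossing rather than arithmetic, so `(length)^2` reduces to `length`. The sketch below checks this with `terms()` on the formula alone (no data needed); the object names are made up for illustration.

```r
# The formula as written: ^ is formula crossing, so (length)^2 collapses to length.
f_original <- mercury ~ length + (length)^2 + years + (years * length) +
  (years * ((length)^2))
labels_original <- attr(terms(f_original), "term.labels")
print(labels_original)

# Wrapping the power in I() makes R treat it as arithmetic on the variable.
f_squared <- mercury ~ length + I(length^2) + years + (years * length) +
  (years * I(length^2))
labels_squared <- attr(terms(f_squared), "term.labels")
print(labels_squared)
```

The second formula should list a separate `I(length^2)` term and its interaction with `years`, which is what a quadratic model per year needs.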

In the code above, the output I want is:

mercury ~ 1.09499(length) + (1.09499(length))^2 + 0.20758(years) + (0.20758(years) * 1.09499(length)) + (0.20758(years) * ((1.09499(length))^2)) - 0.92197

Using this I want to be able to plug in a standard value for length to make the statement 'at a standardized length of 12 cm, mercury concentrations in 1996/97 are estimated to be # and mercury concentrations in 2006/7 are estimated to be # with a confidence interval of #.'
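A statement of this kind can be produced without writing the equation out, by calling `predict()` with one row per year at the standardized length. This is a minimal sketch on simulated stand-in data (the real data are not shown), and it assumes the squared term is written as `I(length^2)`.

```r
# Simulated stand-in data with two year groups.
set.seed(42)
fish <- data.frame(
  length = runif(40, 8, 20),
  years  = factor(rep(c("1996/97", "2006/07"), each = 20))
)
fish$mercury <- 0.03 * fish$length + 0.001 * fish$length^2 +
  ifelse(fish$years == "2006/07", 0.2, 0) + rnorm(40, sd = 0.05)

m <- lm(mercury ~ (length + I(length^2)) * years, data = fish)

# One row per year level, all at the standardized length of 12 cm.
newdata <- data.frame(
  length = 12,
  years  = factor(levels(fish$years), levels = levels(fish$years))
)

# fit = estimated mercury; lwr/upr = 95% confidence limits for the mean.
est <- cbind(newdata,
             predict(m, newdata = newdata, interval = "confidence", level = 0.95))
print(est)
```

The `fit`, `lwr`, and `upr` columns give the estimate and confidence interval for each year. For an interval on an individual fish rather than the mean, `interval = "prediction"` would be used instead.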

Similarly, I would like to be able to make the same equations and statements about more complex relationships where the polynomial equation contains 4 different 'year' inputs.

arun v
  • To square `length` in an `lm` formula you must use `I(length^2)`. – Rui Barradas Aug 23 '19 at 14:29
  • Or, instead of including each and every polynomial term, take a look at `?poly`. For instance, `poly(length, 2, raw = TRUE)`. – Rui Barradas Aug 23 '19 at 14:35
  • Instead of trying to reproduce the equation, take a look at the `predict` function. `predict` takes the model and a data frame of the variable's values and generates the predictions. – Dave2e Aug 23 '19 at 14:40
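The suggestions in these comments can be combined into a routine that scales to many models with any number of year levels. The sketch below is illustrative only: the data are simulated, and `make_fish`, `predict_at_length`, `lake_a`, and `lake_b` are hypothetical names. It relies on `lm` storing the factor levels it saw in `model$xlevels`.

```r
# Simulated stand-in data sets with different numbers of year groups.
set.seed(7)
make_fish <- function(year_labels) {
  d <- data.frame(
    length = runif(20 * length(year_labels), 8, 20),
    years  = factor(rep(year_labels, each = 20), levels = year_labels)
  )
  d$mercury <- 0.03 * d$length + 0.001 * d$length^2 +
    0.05 * as.integer(d$years) + rnorm(nrow(d), sd = 0.05)
  d
}

datasets <- list(
  lake_a = make_fish(c("1996/97", "2006/07")),
  lake_b = make_fish(c("1996/97", "2006/07", "2009/10", "2017/18"))
)

# Same quadratic-by-year model for every data set, using raw polynomials.
models <- lapply(datasets, function(d)
  lm(mercury ~ poly(length, 2, raw = TRUE) * years, data = d))

# Hypothetical helper: predict for every year level stored in the model.
predict_at_length <- function(model, std_length = 12) {
  newdata <- data.frame(length = std_length, years = model$xlevels$years)
  cbind(newdata, predict(model, newdata = newdata, interval = "confidence"))
}

results <- lapply(models, predict_at_length)
print(results)
```

Because the year levels are read from each fitted model, the same helper works whether a model has two year groups or four.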
