
I am trying to analyse lobster egg development between January and June by measuring the eyespot to obtain the Perkins eye index (PEI).

I have a dataset with two variables: Month, ranging from January to June, and PEI, ranging from 1 to ~600. PEI is typically around 300-600, but the eyespot hadn't developed in some eggs, so those were assigned a value of 1. I have between 40 and 240 PEI values for each month, and my dataset looks like this...

[Image of the dataset; not reproduced here]

I am using the mgcv package and I am trying to fit a GAM.

My issue is that I keep getting this error when I attempt to fit the GAM:

Error in smooth.construct.tp.smooth.spec(object, dk$data, dk$knots) : 
  NA/NaN/Inf in foreign function call (arg 1)
In addition: Warning messages:
1: In mean.default(xx) : argument is not numeric or logical: returning NA
2: In Ops.factor(xx, shift[i]) : ‘-’ not meaningful for factors

I ensured that Month is a factor with 6 levels in the correct order, and I have checked for missing values, so now I am not sure what is wrong.

Please advise on whether my dataset is suitable for a GAM, or explain how I should do this differently. Thank you.

The code I used is as follows:

library(mgcv)

data <- read.csv("PEI.csv")

Month <- c("January", "February", "March", "April", "May", "June")

Month <- factor(Month, levels = c("January", "February", "March", "April", "May", "June"))

gam <- gam(PEI ~ s(Month), data = data)

I then get the error:

Error in smooth.construct.tp.smooth.spec(object, dk$data, dk$knots) : 
  NA/NaN/Inf in foreign function call (arg 1)
In addition: Warning messages:
1: In mean.default(xx) : argument is not numeric or logical: returning NA
2: In Ops.factor(xx, shift[i]) : ‘-’ not meaningful for factors
Phil
Laurie
  • What do you want to know/what is the goal of this analysis? A slightly more general/stats-oriented question might be appropriate for https://stats.stackexchange.com ... – Ben Bolker Jul 18 '23 at 14:33
  • I want to know whether there is a statistically significant difference between the PEI values for each month, as this would indicate the close of the lobster spawning period in my study. I'd expect this difference to be between May and June. I've been advised a GAM is the way forward. I'll head to Stack Exchange as well. Cheers – Laurie Jul 18 '23 at 14:47
  • Advised by whom? It seems overkill to me. I'd take this to [CrossValidated](https://stats.stackexchange.com/) (for the record, I'd suggest treating month as a categorical variable with successive-differences contrasts ...) – Ben Bolker Jul 18 '23 at 18:56
  • Okay will do, cheers mate! – Laurie Jul 19 '23 at 12:01
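
The categorical approach suggested in the comments above could look roughly like the sketch below. It is only an illustration under stated assumptions: the data are simulated (the real PEI.csv is not available), the object names `sim` and `fit_sdif` are made up for this example, and `contr.sdif()` from the MASS package is one way to get successive-differences contrasts, not necessarily the exact setup the commenter had in mind.

```r
library(MASS)  # provides contr.sdif()

set.seed(1)
month_levels <- c("January", "February", "March", "April", "May", "June")

# Simulated stand-in for PEI.csv (assumption): 40 eggs per month,
# with PEI rising over the season
sim <- data.frame(
  Month = factor(rep(month_levels, each = 40), levels = month_levels)
)
sim$PEI <- 300 + 50 * as.integer(sim$Month) + rnorm(nrow(sim), sd = 40)

# Successive-differences contrasts: each coefficient compares a month
# with the month before it (February - January, ..., June - May)
contrasts(sim$Month) <- contr.sdif(levels(sim$Month))

# Ordinary linear model with Month as a categorical predictor
fit_sdif <- lm(PEI ~ Month, data = sim)

# One row per month-to-month difference, with its standard error and p-value
summary(fit_sdif)$coefficients
```

With this coding, the last coefficient is the estimated June minus May difference, which is the comparison the question is mainly interested in.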

1 Answer


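The default smooth in s() needs a numeric covariate, so a factor will not work there. If you define months <- c(January = 1, February = 2, March = 3, April = 4, May = 5, June = 6), you can run

# as.character() makes the lookup go by month name even if Month is a factor
data$monthNo <- months[as.character(data$Month)]
gam <- gam(formula = PEI ~ s(monthNo), data = data)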
backboned
  • Thanks mate, that's moved things along a bit! I now have the error: Error in smooth.construct.tp.smooth.spec(object, dk$data, dk$knots) : A term has fewer unique covariate combinations than specified maximum degrees of freedom. Do you recommend I lower the degrees of freedom? – Laurie Jul 18 '23 at 14:20
  • If you have only 6 distinct predictor values, fitting a GAM is barely worth it, but yes, you could lower the df (must be < 6, or maybe just <= 6?) – Ben Bolker Jul 18 '23 at 14:28
  • Cheers @BenBolker - could you recommend some code for changing the degrees of freedom? I've tried gam <- gam(PEI ~ s(MonthNo, df = 6), data = data) but I get the response: Error in terms.formula(reformulate(term[i])) : invalid model formula in ExtractVars – Laurie Jul 18 '23 at 14:42
  • I am not really familiar with gam, but I don't think that df is a valid argument. Maybe you'll find help("choose.k") helpful – backboned Jul 18 '23 at 15:30
  • Cheers mate! @backboned – Laurie Jul 19 '23 at 12:01
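
As a follow-up to the comments above: in mgcv the basis dimension of a smooth is set with the `k` argument of `s()`, not `df`. The sketch below shows one way this might be done. It rests on assumptions: the data are simulated rather than read from PEI.csv, the names `sim` and `fit_k` are made up for the example, and `k = 5` is just one value that stays below the 6 unique month values.

```r
library(mgcv)

set.seed(1)
month_levels <- c("January", "February", "March", "April", "May", "June")

# Simulated stand-in for PEI.csv (assumption): Month stored as text
sim <- data.frame(Month = rep(month_levels, each = 40),
                  stringsAsFactors = FALSE)

# Map month names to 1..6; as.character() keeps this safe for factors too
sim$monthNo <- match(as.character(sim$Month), month_levels)
sim$PEI <- 300 + 50 * sim$monthNo + rnorm(nrow(sim), sd = 40)

# k is the basis dimension of the smooth. It cannot exceed the number of
# unique covariate values (6 here), so the default of k = 10 fails.
fit_k <- gam(PEI ~ s(monthNo, k = 5), data = sim)

summary(fit_k)
```

Storing the model as `fit_k` rather than `gam` also avoids reusing the name of the fitting function. See help("choose.k") for guidance on picking `k`.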