
I have a training dataset (95,840 rows) with the following structure:

str(train)
$ NUM_DEVICE_ID_COUPON       : Factor w/ 9 levels "8647","8666",..: 3 4 5 8 9 1 2 3 4 6 ...
$ TEMPERATURE_AIR            : num  6.29 6.13 6 7.05 8.16 ...
$ MonthNumber                : Factor w/ 12 levels "1","2","3","4",..: 10 10 10 10 10 10 10 10 
$ HOURS                      : Factor w/ 24 levels "0","1","2","3",..: 7 7 7 7 7 8 8 8 8 8 ...
$ TEMPERATURE_COUPON         : num  5.1 6.6 4.5 5.4 4.7 ...

I fitted several linear models, and the best one (based on BIC) uses the full three-way interaction:

lm(TEMPERATURE_COUPON ~ TEMPERATURE_AIR * MonthNumber * HOURS, ...)
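For reference, this kind of BIC comparison can be sketched as below. This is only an illustration on simulated data: the data frame `toy`, its size, and the three candidate formulas are assumptions standing in for `train` and for whatever models were actually compared.

```r
# Simulated stand-in for `train` (hypothetical data, not the real measurements).
set.seed(1)
n <- 5000
toy <- data.frame(
  TEMPERATURE_AIR = rnorm(n, mean = 10, sd = 5),
  MonthNumber     = factor(sample(1:12, n, replace = TRUE)),
  HOURS           = factor(sample(0:23, n, replace = TRUE))
)
toy$TEMPERATURE_COUPON <- 0.8 * toy$TEMPERATURE_AIR +
  as.integer(toy$MonthNumber) / 4 + rnorm(n)

# Three fixed-effects candidates of increasing complexity.
m_add  <- lm(TEMPERATURE_COUPON ~ TEMPERATURE_AIR + MonthNumber + HOURS, data = toy)
m_2way <- lm(TEMPERATURE_COUPON ~ (TEMPERATURE_AIR + MonthNumber + HOURS)^2, data = toy)
m_full <- lm(TEMPERATURE_COUPON ~ TEMPERATURE_AIR * MonthNumber * HOURS, data = toy)

# BIC() accepts several fitted models and returns a data frame (df, BIC).
bic_table <- BIC(m_add, m_2way, m_full)
bic_table[order(bic_table$BIC), ]  # lowest BIC first
```

On the real data, replacing `toy` with `train` should be enough, assuming the column names match the `str(train)` output above.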

Now I want to improve this model (reduce its BIC) by adding random effects for NUM_DEVICE_ID_COUPON.

First, is it a good idea to keep the fixed effects with the interaction above as a starting point?

Second, I have no idea which random effects to consider: a random intercept only, or also random slopes for each covariate (TEMPERATURE_AIR, MonthNumber and HOURS)? Would plots of each covariate by NUM_DEVICE_ID_COUPON help?

library(lme4)
reg_ml1 = lmer(TEMPERATURE_COUPON ~ TEMPERATURE_AIR * MonthNumber * HOURS + 
(1 + TEMPERATURE_AIR | NUM_DEVICE_ID_COUPON) + ( 1 + TEMPERATURE_AIR | NUM_DEVICE_ID_COUPON) + ....)

What's the strategy?

Thanks for your help.

Theo75
  • this belongs on [CrossValidated](https://stats.stackexchange.com). Quick questions and comments: (1) what is your goal in modeling (e.g. prediction or inference)? (2) I would generally recommend **top-down** construction of a mixed effect model, i.e. start with the **maximal model** (all random effects that can be estimated, i.e. allow for variation across groups of the effect of all covariates that vary *within* groups [otherwise the across-group variation in their effect can't be identified] ...) - aim for most complex non-singular model *or* best AIC/BIC. – Ben Bolker Mar 09 '22 at 01:34
  • Hi, thanks. I also posted the question on CrossValidated but got no answer. My goal is prediction. When you say to try the maximal model, do you mean this one? `lmerTest::lmer(TEMPERATURE_COUPON ~ TEMPERATURE_AIR * MonthNumber * HOURS + (1 + TEMPERATURE_AIR | NUM_DEVICE_ID_COUPON) + (1 + MonthNumber | NUM_DEVICE_ID_COUPON) + (1 + HOURS | NUM_DEVICE_ID_COUPON))` Thanks – Theo75 Mar 09 '22 at 09:16
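
The top-down approach suggested in the comments (start from the most complex random-effects structure, then simplify until the fit is non-singular or the BIC stops improving) might look roughly like the sketch below. It runs on simulated data: the data frame `toy`, the simplified fixed part, and the three candidate structures are assumptions for illustration, not a recommendation for the real data. Models are fitted with `REML = FALSE` so that their BIC values are also comparable with a plain `lm()` fit.

```r
library(lme4)

# Simulated stand-in for `train`: 9 devices with device-specific
# intercepts and air-temperature slopes (hypothetical values).
set.seed(42)
n_dev <- 9
n_per <- 300
toy <- data.frame(
  NUM_DEVICE_ID_COUPON = factor(rep(seq_len(n_dev), each = n_per)),
  TEMPERATURE_AIR      = rnorm(n_dev * n_per, mean = 10, sd = 5),
  HOURS                = factor(sample(0:23, n_dev * n_per, replace = TRUE))
)
dev_int   <- rnorm(n_dev, sd = 1)
dev_slope <- rnorm(n_dev, sd = 0.2)
d <- as.integer(toy$NUM_DEVICE_ID_COUPON)
toy$TEMPERATURE_COUPON <- 2 + dev_int[d] +
  (0.8 + dev_slope[d]) * toy$TEMPERATURE_AIR + rnorm(nrow(toy))

# Candidates, from most to least complex random-effects structure.
# The fixed part is deliberately simplified to keep the example fast.
candidates <- list(
  corr_slope   = TEMPERATURE_COUPON ~ TEMPERATURE_AIR + HOURS +
    (1 + TEMPERATURE_AIR | NUM_DEVICE_ID_COUPON),
  uncorr_slope = TEMPERATURE_COUPON ~ TEMPERATURE_AIR + HOURS +
    (1 + TEMPERATURE_AIR || NUM_DEVICE_ID_COUPON),
  intercept    = TEMPERATURE_COUPON ~ TEMPERATURE_AIR + HOURS +
    (1 | NUM_DEVICE_ID_COUPON)
)

fits <- lapply(candidates, function(f) lmer(f, data = toy, REML = FALSE))

# Flag singular fits and compare the rest by BIC.
summary_table <- data.frame(
  model    = names(fits),
  singular = vapply(fits, isSingular, logical(1)),
  BIC      = vapply(fits, BIC, numeric(1))
)
summary_table[order(summary_table$BIC), ]
```

One way to read the table is to discard any model flagged as singular and pick the lowest BIC among the rest. With only 9 devices, random slopes for 12-level or 24-level factors such as MonthNumber or HOURS would need very large covariance matrices, so they may well come out singular on the real data.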
