How do I get the within-group association using lme4 in R?

Question

Setup: I'm testing if the association between pairs of individuals for a trait (BMI) changes over time. I have repeated measures, where each individual in a pair gives BMI data at 7 points in time. Below is a simplified data frame in long format with Pair ID (the identifier given to each pair of individuals), BMI measurements for both individuals at each point in time (BMI_1 and BMI_2), and a time variable with seven intervals, coded as continuous.

Pair_ID	BMI_1	BMI_2	Time
1	25	22	1
1	23	24	2
1	22	31	3
1	20	27	4
1	30	26	5
1	31	21	6
1	19	18	7
2	21	17	1
2	22	27	2
2	24	22	3
2	25	20	4

First, I'm mainly interested in testing the within-pair association (the regression coefficient of BMI_2, below) and whether it changes over time (the interaction between BMI_2 and Time). I'd like to exclude any between-pair effects, so that I'm only testing the association over time within pairs.

I was planning on fitting a linear mixed model of the form:

    lmer(BMI_1 ~ BMI_2 * Time + (BMI_2 | Pair_ID), Data)

I understand the parameters of the model (e.g., random slopes/intercepts), and that the BMI_2 * Time interaction tests whether the relationship between BMI_1 and BMI_2 is moderated by time.

However, I'm unsure how to identify the (mean) within-pair regression coefficients, and whether my approach is even suitable for this.

Second, I'm interested in understanding whether there is variation between pairs in the BMI_2 * Time interaction (i.e., the variance in slopes among pairs) - for example, does the association between BMI_1 and BMI_2 increase over time in some pairs but not others?

For this, I was considering fitting a model like this:

    lmer(BMI_1 ~ BMI_2 * Time + (BMI_2 : Time | Pair_ID), Data)

and then looking at the variance in the BMI_2 : Time random effect. As I understand it, large variance would imply that this interaction effect varied a lot between pairs.

Any help on these questions (especially the first question) would be greatly appreciated.

P.S. Sorry if the question is poorly formatted. It's my first attempt.

What kind of "within-pair estimates" are you looking for? What information about the model do you want to get from them? To help you with your second question, we need to know what BMI_1 and BMI_2 are: Is BMI_2 the lagged version of BMI_1 (e.g. time 2 entered in the same row as time 1)? – benimwolfspelz, Sep 19 '21 at 11:52
Hi Benim, thanks for your response. I've updated the question to clarify these issues, including by adding a data table. By "within-pair estimates" I'm talking about the (mean) within-pair association (or correlation) between BMI_2 and BMI_1. For the second question, BMI_1 and BMI_2 are BMI measurements from the two individuals in a pair, taken at the same time. – Tom, Sep 21 '21 at 09:52
I see. What is often recommended for (longitudinal) multilevel regressions is to split your level-1 (measurement) variables into level-1 and level-2 (person/couple) variance. So for BMI_2, make one variable that is centered around the per-pair mean (so that it has only level-1 variance) and a second variable containing those means (which has only level-2 variance). Use both predictors in the same model. You will get separate estimates for the within-couple and between-couple associations with your dependent variable. You can also include a random slope for your level-1 predictor and/or interactions with time. – benimwolfspelz, Sep 21 '21 at 10:10

score 1 · Accepted Answer · answered Oct 07 '21 at 20:40

Answering for completeness. @benimwolfspelz's comment is spot on. This is known as "contextual effects" in some areas of applied work. The idea is to split the variable into between and within components by mean-centering each group and fitting the mean-centred variable (which will estimate the within component) and the group means (which will estimate the between component).
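
As a rough sketch of that split in base R, using the example rows from the question: `ave()` gives each pair's mean of BMI_2, and subtracting it leaves the within-pair deviation. The column names `BMI_2_between` and `BMI_2_within`, the formula object, and the random-intercept-only structure are illustrative assumptions, not something taken from the question.

```r
# The 11 example rows shown in the question
Data <- data.frame(
  Pair_ID = c(rep(1, 7), rep(2, 4)),
  BMI_1   = c(25, 23, 22, 20, 30, 31, 19, 21, 22, 24, 25),
  BMI_2   = c(22, 24, 31, 27, 26, 21, 18, 17, 27, 22, 20),
  Time    = c(1:7, 1:4)
)

# Between component: each pair's mean BMI_2 (varies only across pairs)
Data$BMI_2_between <- ave(Data$BMI_2, Data$Pair_ID, FUN = mean)

# Within component: deviation from the pair mean (varies only within pairs)
Data$BMI_2_within <- Data$BMI_2 - Data$BMI_2_between

# Assumed model form: within component interacting with Time, plus the
# between component, with a random intercept per pair
within_between_formula <-
  BMI_1 ~ BMI_2_within * Time + BMI_2_between + (1 | Pair_ID)

# On the full data set (the excerpt above is far too small), assuming lme4
# is installed:
# fit <- lme4::lmer(within_between_formula, data = Data)
# lme4::fixef(fit)
```

In a model of this form, `BMI_2_between` would carry the between-pair association and `BMI_2_within:Time` the change in the within-pair association per unit of Time. Because of the interaction, the `BMI_2_within` coefficient on its own is the within-pair association at Time = 0, so centring Time may make it easier to read.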

Robert Long


Thanks for your response, Robert. I have a few follow-up questions if you're happy to help: 1) In this scenario, is it necessary to centre the dependent variable? 2) I was originally running standard linear models on standardised variables (z-scores) to render the coefficients as partial correlation coefficients (to help with interpretation). If I used the multilevel approach on z-scores, would the coefficients also be interpreted as partial correlation coefficients? 3) Lastly, would it be reasonable to add the within-group component as a random slope in a random slopes/intercepts model? – Tom Oct 13 '21 at 13:01
You're welcome. No, you don't centre the DV, only the predictors that you want to split into between/within. You can still do this with standardised variables. And yes, it would be reasonable to use the within variable as random slopes, provided that such a random structure is supported by the data. – Robert Long Oct 13 '21 at 13:34
Thanks a lot Robert! – Tom Oct 14 '21 at 14:22
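
For anyone wanting to try the random-slope variant discussed in these comments, here is a minimal sketch. The data are simulated stand-ins (40 pairs, 7 time points, arbitrary effect sizes), and the names `sim`, `BMI_2_within`, `BMI_2_between` and `slope_formula` are hypothetical. The dependent variable is left uncentred, and the model is only fitted if lme4 happens to be installed.

```r
set.seed(42)

# Simulated stand-in data (an assumption, not the asker's data):
# 40 pairs, each measured at 7 time points
n_pairs <- 40
n_times <- 7
sim <- data.frame(
  Pair_ID = rep(seq_len(n_pairs), each = n_times),
  Time    = rep(seq_len(n_times), times = n_pairs)
)
pair_level <- rnorm(n_pairs, mean = 25, sd = 3)     # pair-specific BMI level
pair_slope <- rnorm(n_pairs, mean = 0.3, sd = 0.2)  # pair-specific within slope
dev_2      <- rnorm(nrow(sim), sd = 2)              # within-pair deviations of BMI_2
sim$BMI_2  <- pair_level[sim$Pair_ID] + dev_2
sim$BMI_1  <- pair_level[sim$Pair_ID] +
  pair_slope[sim$Pair_ID] * dev_2 + rnorm(nrow(sim), sd = 2)

# Split only the predictor; the dependent variable BMI_1 is not centred
sim$BMI_2_between <- ave(sim$BMI_2, sim$Pair_ID, FUN = mean)
sim$BMI_2_within  <- sim$BMI_2 - sim$BMI_2_between

# Random intercept plus a random slope for the within component
slope_formula <-
  BMI_1 ~ BMI_2_within * Time + BMI_2_between + (BMI_2_within | Pair_ID)

if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::lmer(slope_formula, data = sim)
  print(summary(fit))           # fixed effects and random-effect variances
  print(lme4::isSingular(fit))  # TRUE hints the random structure is too rich
}
```

The variance reported for `BMI_2_within` under `Pair_ID` is the between-pair variation in the within-pair slope. A singular fit or convergence warning would be one sign that the data do not support this random structure.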
