Say we are confined to using SAS and have a panel/longitudinal dataset. We have indicators for cohort and time, as well as some measured variable y
.
data in;
input cohort time y;
datalines;
1 1 100
1 2 101
1 3 102
1 4 103
1 5 104
1 6 105
2 2 .
2 3 .
2 4 .
2 5 .
2 6 .
3 3 .
3 4 .
3 5 .
3 6 .
4 4 108
4 5 110
4 6 112
run;
Note that units of cohort and time are the same so that if the dataset goes out to time unit 6, each successive panel unit will be one period shorter than the one before it in time.
We have a gap of two panel units between actual data. The goal is to linearly interpolate the two missing panel units (values for cohort 2 and 3) from the two that "sandwich" them. For cohort 2 at time 5 the interpolated value should be 0.67*104 + 0.33*110
, while for cohort 3 at time 5 it would be 0.33*104 + 0.67*110
. Basically you just weight 2/3 for the closer panel unit with actuals, and 1/3 for the further panel unit. You'll of course have missing values, but for this toy example that's not a problem.
I'm imagining the solution involves lagging and using the first.
operator and loops but my SAS is so poor I hesitate to provide even my broken code example.