I have a panel dataset:
data <- data.table(ID = c(1,1,1,1,2,2,3,3,3),
year = c(1,2,3,4,1,2,1,2,3),
score1 = c(90,78,92,69,86,73,82,85,91))
> data
ID year score1
1: 1 1 90
2: 1 2 78
3: 1 3 92
4: 1 4 69
5: 2 1 86
6: 2 2 73
7: 3 1 82
8: 3 2 85
9: 3 3 91
I want to place a constraint such that every ID
should have at least 3 years of observations. I tried using the following using the data.table
package:
data[data$year >= 3, ]
However this ignores the observations in years 1 & 2 for ID
's which have at least 3 years of observations. In other words, I want to only look at ID's which have at least 3 years of observations and also include the year 1 & 2 observations.
The expected output is thus:
ID year score1
1: 1 1 90
2: 1 2 78
3: 1 3 92
4: 1 4 69
5: 3 1 82
6: 3 2 85
7: 3 3 91