I have a longitudinal dataset in long form with three columns: ID, Wave (waves 1 to 4), and Score. Below is sample data with the same structure. The original data has around 2000 rows from 500 participants in total.
     ID Wave Score
1  1001    1    28
2  1001    2    27
3  1001    3    28
4  1001    4    26
5  1002    1    30
6  1002    3    30
7  1003    1    30
8  1003    2    30
9  1003    3    29
10 1003    4    28
11 1004    1    22
12 1005    1    20
13 1005    2    18
14 1006    1    22
15 1006    2    23
16 1006    3    25
17 1006    4    19
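For reproducibility, the sample can be rebuilt as a data frame. This is a minimal sketch that assumes all three columns are numeric and named as in the header; the object name `data` matches the one used in my attempts.

```r
# Sample data rebuilt by hand from the printout (assumed numeric columns).
data <- data.frame(
  ID    = c(1001, 1001, 1001, 1001, 1002, 1002, 1003, 1003, 1003, 1003,
            1004, 1005, 1005, 1006, 1006, 1006, 1006),
  Wave  = c(1, 2, 3, 4, 1, 3, 1, 2, 3, 4, 1, 1, 2, 1, 2, 3, 4),
  Score = c(28, 27, 28, 26, 30, 30, 30, 30, 29, 28, 22, 20, 18, 22, 23, 25, 19)
)

# Number of rows per participant; only some IDs have all four waves.
table(data$ID)
```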
I would like to keep only the rows of participants who have a 'Score' for all four waves, that is, the 'ID's that have data in every 'Wave'. My attempt so far is based on this idea: if a participant has all four measurements, their ID appears in the data four times. So I counted how often each ID occurs:
table(data$id) == 4
Although this shows, for each ID, whether it appears four times, I cannot use the result to select the corresponding rows:
all.data <- subset(data, subset=table(data$id) == 4)
This fails because the result of table() has one entry per unique ID, whereas the data, being in long form, has one row per ID and wave, so the lengths differ: "Length of logical index must be 1 or 2637, not 828". I need the data in long form for further analysis, so I would prefer not to reshape it.
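One direction that seems to avoid the length mismatch is base R's `ave()`, which returns one value per row rather than one per ID, so the logical index has the same length as the data. Here is a sketch on the sample data only, assuming the ID column is named `ID` as in the printout (my attempts above use `data$id`) and that a participant counts as complete when they have four distinct waves:

```r
# Sample data rebuilt by hand from the printout (assumed numeric columns).
data <- data.frame(
  ID    = c(1001, 1001, 1001, 1001, 1002, 1002, 1003, 1003, 1003, 1003,
            1004, 1005, 1005, 1006, 1006, 1006, 1006),
  Wave  = c(1, 2, 3, 4, 1, 3, 1, 2, 3, 4, 1, 1, 2, 1, 2, 3, 4),
  Score = c(28, 27, 28, 26, 30, 30, 30, 30, 29, 28, 22, 20, 18, 22, 23, 25, 19)
)

# For every row, count the distinct waves observed for that row's ID.
# ave() repeats the per-group result across the rows of each group.
n_waves <- ave(data$Wave, data$ID, FUN = function(w) length(unique(w)))

# Keep rows of participants with all four waves; the data stay in long form.
all.data <- data[n_waves == 4, ]
```

On the sample this keeps the rows for IDs 1001, 1003, and 1006. Counting distinct waves instead of rows guards against an ID that appears four times because of a duplicated wave, and rows with a missing Score would still need to be dropped first if such rows exist in the real data.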