I'm using a longitudinal survey in long format, and I'm trying to create a dummy variable indicating whether an individual has NOT obtained a college degree by the age of 25. My data looks something like this:
ID CYRB VAR VALUE
1 1983 DEG98 1
1 1983 DEG00 1
1 1983 DEG02 1
1 1983 DEG04 0
2 1979 DEG98 0
2 1979 DEG00 0
2 1979 DEG02 1
2 1979 DEG04 1
3 1978 DEG98 NA
3 1978 DEG00 NA
3 1978 DEG02 NA
3 1978 DEG04 0
As I've tried to illustrate, many survey responses are missing in the relevant years. But if the respondent answers "no" in a later year, it can clearly be inferred that they didn't have a degree when they were <25 either.
Trying to be as general as possible: how can I create a new variable that depends on all the values recorded for just one individual, computed separately for each ID (1, 2, 3, etc.)?
Sorry if I'm not clear!
Edit:
Sorry, my fault. The data used to be in wide format, and the VAR entries denote whether the respondent has a college degree in 1998, 2000, 2002, etc. (VALUE holds the response: 1 == TRUE, 0 == FALSE). CYRB is indeed the year of birth. The table, edited to show the expected output of my desired dummy variable, would be:
ID CYRB VAR VALUE DUMMY
1 1983 DEG98 0 0
1 1983 DEG00 0 0
1 1983 DEG02 0 0
1 1983 DEG04 1 0
2 1979 DEG98 0 0
2 1979 DEG00 0 0
2 1979 DEG02 1 0
2 1979 DEG04 1 0
3 1978 DEG98 NA 1
3 1978 DEG00 NA 1
3 1978 DEG02 NA 1
3 1978 DEG04 0 1
That is, if the respondent replies in any survey from the age of 25 onwards that he/she does not have a college degree, the dummy takes the value 1 on all of that respondent's rows; otherwise it is 0.
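To make the rule concrete, here is a minimal sketch in plain Python. It is only an illustration of the logic, not a solution in any particular stats package. Several things in it are assumptions: the helper names `survey_year` and `no_degree_dummy` are made up, NA is represented as `None`, the two-digit suffix of VAR is read as a year (90 and above as 19xx, otherwise 20xx), age is approximated as survey year minus CYRB, and the first wave for ID 2 is taken to be DEG98.

```python
# Rows mirror the expected-output table: (ID, CYRB, VAR, VALUE).
# None stands in for NA. Assumption: the first wave for ID 2 is DEG98.
rows = [
    (1, 1983, "DEG98", 0), (1, 1983, "DEG00", 0),
    (1, 1983, "DEG02", 0), (1, 1983, "DEG04", 1),
    (2, 1979, "DEG98", 0), (2, 1979, "DEG00", 0),
    (2, 1979, "DEG02", 1), (2, 1979, "DEG04", 1),
    (3, 1978, "DEG98", None), (3, 1978, "DEG00", None),
    (3, 1978, "DEG02", None), (3, 1978, "DEG04", 0),
]


def survey_year(var):
    """Hypothetical helper: 'DEG98' -> 1998, 'DEG04' -> 2004.

    The pivot at 90 for the two-digit year is an assumption.
    """
    yy = int(var[-2:])
    return 1900 + yy if yy >= 90 else 2000 + yy


def no_degree_dummy(rows, age_cutoff=25):
    """Return {ID: dummy}, where dummy is 1 if any response given at
    age >= age_cutoff says 'no degree' (VALUE == 0), else 0."""
    dummy = {}
    for pid, cyrb, var, value in rows:
        dummy.setdefault(pid, 0)           # every ID starts at 0
        age = survey_year(var) - cyrb      # approximate age at the survey
        if value == 0 and age >= age_cutoff:
            dummy[pid] = 1                 # missing values (None) never match
    return dummy


dummy_by_id = no_degree_dummy(rows)

# Attach the per-individual dummy back onto every row of that individual.
labelled = [row + (dummy_by_id[row[0]],) for row in rows]
```

The point of the sketch is the two-step shape: first reduce all rows of one ID to a single value, then copy that value back onto each of that ID's rows. I'd like to know how to do the equivalent grouped operation on my actual data.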
Hope this is a bit clearer.