I currently have a data that looks like this for multiple ids (that range until around 1600)
id year name status
1 1980 James 3
1 1981 James 3
1 1982 James 3
1 1983 James 4
1 1984 James 4
1 1985 James 1
1 1986 James 1
1 1987 James 1
2 1982 John 2
2 1983 John 2
2 1984 John 1
2 1985 John 1
I want to subset this data so that it only has the information for status=1 and the status right before that. I also want to eliminate multiple 1s and only save the first 1s. In conclusion I would want:
id year name status
1 1984 James 4
1 1985 James 1
2 1983 John 2
2 1984 John 1
I'm doing this because I'm in the process of figuring out in what year how many people from certain status changed to status 1. I only know the subset command and I don't think I can get this data from doing subset(data, subset=(status==1))
. How could I save the information right before that
I want to add to this question one more time - I did not get same results when I applied the first reply to this question (which uses plr packages) and the third reply which uses duplicated command. I found out that the first reply preserved information accurately while the third one did not.