I have two very large data frames similar to the following:
df1<-data.frame(DS.ID=c(123,214,543,325,123,214),OP.ID=c("xxab","xxac","xxad","xxae","xxaf","xxaq"),P.ID=c("AAC","JGK","DIF","ADL","AAC","JGR"))
> df1
DS.ID OP.ID P.ID
1 123 xxab AAC
2 214 xxac JGK
3 543 xxad DIF
4 325 xxae ADL
5 123 xxaf AAC
6 214 xxaq JGR
df2<-data.frame(DS.ID=c(123,214,543,325,123,214),OP.ID=c("xxab","xxac","xxad","xxae","xxaf","xxaq"),P.ID=c("AAC","JGK","DIF","ADL","AAC","JGS"))
> df2
DS.ID OP.ID P.ID
1 123 xxab AAC
2 214 xxac JGK
3 543 xxad DIF
4 325 xxae ADL
5 123 xxaf AAC
6 214 xxaq JGS
Each record is uniquely identified by the combination of DS.ID and OP.ID: DS.ID on its own can repeat, but the DS.ID/OP.ID pair never does. I want to find the records where P.ID differs between df1 and df2. Note that a given DS.ID/OP.ID pair will not necessarily be in the same row in both data frames.
In the example above, this would return row 6 (DS.ID 214, OP.ID xxaq), where P.ID changed from JGR to JGS. I want to write both the initial and final values to a data frame.
I have a feeling the initial step would be
rbind.fill(df1,df2)
(.fill because the data frames I'm trying to loop through have additional columns).
Edit: Assume there are other columns whose values differ as well. So duplicated would not work unless the relevant columns were first isolated in their own data frame. I'll be doing this for many columns and many data frames, so I'd rather avoid that approach for the sake of speed.
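For reference, the result I'm after on the example data can be reproduced with a base-R merge() on the two key columns. This is only a sketch of the expected output, not necessarily the right approach: the names key, both and changed are placeholders I made up, the suffixes are arbitrary, and I have not checked how it behaves on very large data frames.

```r
# Example data from above; stringsAsFactors = FALSE so P.ID compares as plain text
df1 <- data.frame(DS.ID = c(123, 214, 543, 325, 123, 214),
                  OP.ID = c("xxab", "xxac", "xxad", "xxae", "xxaf", "xxaq"),
                  P.ID  = c("AAC", "JGK", "DIF", "ADL", "AAC", "JGR"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(DS.ID = c(123, 214, 543, 325, 123, 214),
                  OP.ID = c("xxab", "xxac", "xxad", "xxae", "xxaf", "xxaq"),
                  P.ID  = c("AAC", "JGK", "DIF", "ADL", "AAC", "JGS"),
                  stringsAsFactors = FALSE)

# Composite key (placeholder name): the DS.ID/OP.ID pair identifies a record
key <- c("DS.ID", "OP.ID")

# Join on the key so that row order in either data frame does not matter.
# Any non-key column present in both frames gets the two suffixes.
both <- merge(df1, df2, by = key, suffixes = c(".initial", ".final"))

# Keep only the records whose P.ID differs; which() drops NA comparisons
changed <- both[which(both$P.ID.initial != both$P.ID.final), ]
changed
```

On the example this leaves a single record, DS.ID 214 / OP.ID xxaq, with the initial value JGR and the final value JGS side by side, which is the shape of output I want.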