I am trying to come up with a way to loop through a data frame and recognise points shared between 2 columns and work iteratively to assign a unique factor to these. Specifically, I have a data frame indicating points along a river and which points are immediately upstream of this.
Here is some example data:
df <- data.frame(RiverID = rep(c(1,2), each = 15), SiteID = rep((c(2,3,4,5,6,7,8,9,10,11,12,13,13,13,13)),2),
Upstream_SiteID = rep((c(1,1,1,2,2,3,4,5,6,7,8,9,10,11,12)),2),
Dist2Mouth = rep((c(2000,2000,2000,1500,1500,1500,1500,1000,1000,1000,1000,500,500,500,500)),2))
And the basic paint image below might help explain the kind of data I have.
What I would like to do is identify all possible 'routes' through the system (orange letters in the image). So in the example, I would start from point 13, go 'upstream' in the dataframe (i.e. lower Distance2Mouth values) and recognise 4 different routes (A-D). I then need to iteratively keep working up the data frame and assign routes to each point.
There are some instances where the stretch of river between the two points could belong to two routes. For example in the image below, the section between 2-1 could be part of routes C or D. In these instances, I would like to create multiple rows with the same SiteID and Upstream_SiteID that list the different potential routes.
Lastly, I have these instances across various rivers, so I would like to loop through the dataframe and apply the above code for each RiverID.
Desired output (the correct routes apply to RiverID '1'):
output <- data.frame(RiverID = rep(c(1,2), each = 16),
SiteID = rep((c(2,2,3,4,5,6,7,8,9,10,11,12,13,13,13,13)),2),
Upstream_SiteID = rep((c(1,1,1,1,2,2,3,4,5,6,7,8,9,10,11,12)),2),
Dist2Mouth = rep((c(2000,2000,2000,2000,1500,1500,1500,1500,1000,1000,1000,1000,500,500,500,500)),2),
Route = as.factor(c(rep(c("D","C","B","A"),times = 4),rep(c("H","G","F","E"),times = 4))))