I have the following data set:
PATH = c("5-8-10-8-17-20",
"56-85-89-89-0-15-88-10",
"58-85-89-65-49-51")
INDX = c(18, 89, 50)
data.frame(PATH, INDX)
PATH | INDX |
---|---|
5-8-10-8-17-20 | 18 |
56-85-89-89-0-15-88-10 | 89 |
58-85-89-65-49-51 | 50 |
The column `PATH` has strings that represent a numerical series, and for each row I want to pick the largest number from the string that satisfies `PATH <= INDX`. That is, select the number from `PATH` that is equal to `INDX` or, if there is none, the largest number from `PATH` that is still less than `INDX`.
My desired output would look like this:
PATH | INDX | PICK |
---|---|---|
5-8-10-8-17-20 | 18 | 17 |
56-85-89-89-0-15-88-10 | 89 | 88 |
58-85-89-65-49-51 | 50 | 49 |
Some of my thought process behind a possible answer:
I know that with a function such as `strsplit` I could separate each string by "-", sort the resulting numbers, subtract `INDX` from each, and then select the difference that is zero or, failing that, the negative difference closest to zero. However, the original dataset is quite large, and I wonder if there is a faster or more efficient way to perform this task.
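
The `strsplit` idea can be sketched in base R without sorting or subtracting: split every string once, flatten the pieces into one long vector, and take a grouped maximum over the values that pass the `<= INDX` filter. This is only a minimal sketch under some assumptions: `pick_le_vec` is a hypothetical helper name, every `PATH` entry is assumed to contain only non-negative numbers separated by "-", and rows with no qualifying number are assumed to return `NA`. Whether it is fast enough for a large dataset has not been measured here.

```r
PATH <- c("5-8-10-8-17-20",
          "56-85-89-89-0-15-88-10",
          "58-85-89-65-49-51")
INDX <- c(18, 89, 50)

# Hypothetical helper: largest number in each PATH string that is <= INDX.
pick_le_vec <- function(path, indx) {
  parts <- strsplit(path, "-", fixed = TRUE)    # one split for the whole column
  vals  <- as.numeric(unlist(parts))            # all numbers in one long vector
  grp   <- rep(seq_along(parts), lengths(parts))  # row index of each number
  keep  <- vals <= indx[grp]                    # apply the PATH <= INDX rule

  out <- rep(NA_real_, length(path))            # NA when nothing qualifies
  if (any(keep)) {
    best <- tapply(vals[keep], grp[keep], max)  # grouped maximum per row
    out[as.integer(names(best))] <- best
  }
  out
}

df <- data.frame(PATH, INDX)
df$PICK <- pick_le_vec(df$PATH, df$INDX)
```

With the sample data this gives 17, 89 and 49. The second row yields 89 because 89 itself appears in `PATH`, so the `<=` rule selects it; replacing `<=` with `<` in the `keep` line would select only strictly smaller values instead.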