I have some data in a table and I would like to do a non-equi join (I think is the right terminology) and filter it by different ranges for different groups. In the below example I would like to filter group "a" so that it only returns values between 1 and 20 (inclusive) and group "b" so it only returns values between 80 and 100 (inclusive). My reading suggests that inrange
should be the item to use. I get how to use it in the generic case, but I am not sure how to get it to run with different ranges by group. (example code adapted from ?inrange
)
create sample data
set.seed(1234)
Y = data.table(a=sample(1:100,100), val=runif(100), group=c(rep("a",50),rep("b",50)))
range = data.table(group=c("a","b"),start = c(1,80), end = c(20,100))
Try to filter
Y[inrange(a, range$start, range$end),,by=group]
This obviously does not work and instead applies those ranges to the entire dataset and throws the error message Ignoring by= because j= is not supplied
. I think it is clear to me that this is not working because I haven't created the 'join' between the range table and Y, but I am not seeing how to make the two tables communicate grouping via inrange
.
Note: In reality values in a will be posixct datetimes, but for the sake of simplicity I am not using that here.