With data.table
you can to non-equal joins:
(added some extra columns and a name with no matches to example)
library(data.table)
data1<-data.table(name=c("Jessica_Smith","John_Smith","John_Smith", "Jackson_Brown") ,
max_year=c(2000,1989,2005,2020),
min_year=c(1990,1989,2001,1990),
extra_col1=c('a', 'b', 'c', 'd'))
data2<-data.table(name=c("Jessica_Smith","John_Smith","John_Smith") ,
year_of_birth=c(1995,1989,2002),
unique_id=c("aby37","nf93","fnd34"),
extra_col2=1:3)
data2[data1, .(name, year_of_birth = x.year_of_birth, unique_id,
extra_col1, extra_col2),
on = .(name, year_of_birth >= min_year, year_of_birth <= max_year)]
name year_of_birth unique_id extra_col1 extra_col2
1: Jessica_Smith 1995 aby37 a 1
2: John_Smith 1989 nf93 b 2
3: John_Smith 2002 fnd34 c 3
4: Jackson_Brown NA <NA> d NA
I think it's better to list all the column names you want to include in the result.
One difficulty with with the non-equal joins is the default renaming of "matching" columns. The columns match the non-equal criteria, but they are not the same. This is why I specified x.year_of_birth (you can use x. or i. to pick which data.table the column comes from in x[i]).
Here is the result with no columns selected (which normally gives you all columns in a join):
data2[data1,
on = .(name, year_of_birth >= min_year, year_of_birth <= max_year)]
name year_of_birth unique_id extra_col2 year_of_birth.1 extra_col1
1: Jessica_Smith 1990 aby37 1 2000 a
2: John_Smith 1989 nf93 2 1989 b
3: John_Smith 2001 fnd34 3 2005 c
4: Jackson_Brown 1990 <NA> NA 2020 d
Notice that year_of_birth here is really min_year, and year_of_birth.1 is really max_year. So this is confusing and data2$year_of_birth is not included because it's considered a "matched" column that doesn't need to be kept.