I have two datasets that are formatted like this:
df1
#> Artist Album Year
#> 1 Beatles Sgt. Pepper's 1967
#> 2 Rolling Stones Sticky Fingers 1971
and
df2
#> Album Year Producer
#> 1 Sgt. Pepper's 1966 George Martin
#> 2 Sticky Fingers 1971 Jimmy Miller
I'd like to do an inner_join
by Album and Year, but sometimes the 'Year' field is off by one year: for example, Sgt. Peppers is listed as 1967 in df1 by 1966 in df2.
So if I run:
df3 <- inner_join(df1, df2, by = c("Album", "Year"))
I get:
df3
#> Artist Album Year Producer
#> 1 Rolling Stones Sticky Fingers 1971 Jimmy Miller
Whereas, I would like both albums to join, so long as something like (df1$Year == df2$Year + 1)|(df1$Year == df2$Year - 1)
.
I cannot just simple join by 'Album' because in my real data set there are some identically titled 'Albums' that are distinguished by 'Year'.
Code for the datasets below:
df1 <- data.frame(stringsAsFactors=FALSE,
Artist = c("Beatles", "Rolling Stones"),
Album = c("Sgt. Pepper's", "Sticky Fingers"),
Year = c(1967, 1971)
)
df1
df2 <- data.frame(stringsAsFactors=FALSE,
Album = c("Sgt. Pepper's", "Sticky Fingers"),
Year = c(1966, 1971),
Producer = c("George Martin", "Jimmy Miller")
)
df2