These are subsets of two dataframes.
df1:
plot | mean_first_flower_date | gdd |
---|---|---|
1 | 2019-07-15 | 60 |
1 | 2019-07-21 | 50 |
1 | 2019-07-23 | 78 |
2 | 2019-05-13 | 100 |
2 | 2019-05-22 | 173 |
2 | 2019-05-25 | 245 |
(cont.)
df2:
plot | date | flowers |
---|---|---|
1 | 2019-07-12 | 2 |
1 | 2019-07-13 | 9 |
1 | 2019-07-14 | 3 |
1 | 2019-07-15 | 3 |
2 | 2019-05-12 | 10 |
2 | 2019-05-13 | 10 |
2 | 2019-05-14 | 14 |
2 | 2019-05-15 | 17 |
(cont.)
Some dates in df2 match dates in df1 exactly, but others are off by one or a few days.
I would like to join both data frames on 'plot' and date ('mean_first_flower_date' in df1, 'date' in df2), keeping every row of df2, without losing the 'gdd' data from df1.
An inner_join, for example, would lose that data because most of the dates do not match exactly.
So if a date in df1 is one to three days earlier or later than a date in df2, it is fine to match them because the dates are relatively close. The tricky part is that I want this approximate replacement only when there is no data available in df1 for that date range.
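To pin down the rule I have in mind, here is a rough sketch of the matching logic in plain Python, using only the rows shown above. It is not R and not a solution I am using, just an illustration. The names `nearest_gdd` and `MAX_GAP_DAYS` are made up for this example, and I am assuming that an exact date match wins and that ties between equally close dates go to whichever df1 row comes first.

```python
from datetime import date

# Rows copied from the df1 subset: (plot, mean_first_flower_date, gdd)
df1 = [
    (1, date(2019, 7, 15), 60),
    (1, date(2019, 7, 21), 50),
    (1, date(2019, 7, 23), 78),
    (2, date(2019, 5, 13), 100),
    (2, date(2019, 5, 22), 173),
    (2, date(2019, 5, 25), 245),
]

# Rows copied from the df2 subset: (plot, date, flowers)
df2 = [
    (1, date(2019, 7, 12), 2),
    (1, date(2019, 7, 13), 9),
    (1, date(2019, 7, 14), 3),
    (1, date(2019, 7, 15), 3),
    (2, date(2019, 5, 12), 10),
    (2, date(2019, 5, 13), 10),
    (2, date(2019, 5, 14), 14),
    (2, date(2019, 5, 15), 17),
]

# Assumption: "one to three days" means a tolerance of 3 days either way.
MAX_GAP_DAYS = 3


def nearest_gdd(plot, day, gdd_rows, max_gap=MAX_GAP_DAYS):
    """Return the gdd of the closest df1 date in the same plot,
    or None if no df1 date lies within max_gap days."""
    candidates = [
        (abs((d - day).days), gdd)
        for p, d, gdd in gdd_rows
        if p == plot and abs((d - day).days) <= max_gap
    ]
    if not candidates:
        return None
    # The smallest gap wins, so an exact match (gap 0) is always preferred.
    # On a tie, min() keeps the first candidate in df1 order.
    return min(candidates, key=lambda c: c[0])[1]


# Keep every df2 row and attach the matched gdd (None when nothing is close).
result = [
    (plot, day.isoformat(), flowers, nearest_gdd(plot, day, df1))
    for plot, day, flowers in df2
]

for row in result:
    print(*row, sep=" | ")
```

For the rows shown here, every plot 1 date falls within three days of 2019-07-15 and every plot 2 date within three days of 2019-05-13, so the sketch assigns gdd 60 and 100 respectively.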
My goal is to have something like this:
plot | date | flowers | gdd |
---|---|---|---|
1 | 2019-07-12 | 2 | 60 |
1 | 2019-07-13 | 9 | 60 |
1 | 2019-07-14 | 3 | 60 |
1 | 2019-07-15 | 3 | 60 |
2 | 2019-05-12 | 10 | 100 |
2 | 2019-05-13 | 10 | 100 |
2 | 2019-05-14 | 14 | 100 |
2 | 2019-05-15 | 17 | 100 |
Is this possible?
I greatly appreciate any help! Thanks!