I have two dataframes (df_1 and df_2) containing one variable each (var1 and var2). They are both global maps, in the same crs and at the same spatial resolution (0.083 on the global grid):
Var1
Var2.
If it can help, I created both dataframes starting from NetCDF files at different spatial resolutions and extents, using package terra
.
I need the two variables to be in the same dataframe to perform analyses on them. However, I've been trying to merge the dataframes based on spatial coordinates (columns lon and lat) but it returns a column of NAs.
- I tried first with
join
fromdplyr
:
> df <- df_1 %>%
+ left_join(df_2, by = c("lon", "lat"))
> summary(df)
lon lat var1 var2
Min. :-179.96 Min. :-55.458 Min. : 0.000 Min. : NA
1st Qu.: -63.04 1st Qu.: 8.208 1st Qu.: 9.398 1st Qu.: NA
Median : 31.21 Median : 37.791 Median :15.052 Median : NA
Mean : 20.81 Mean : 30.864 Mean :16.223 Mean :NaN
3rd Qu.: 93.12 3rd Qu.: 56.791 3rd Qu.:21.898 3rd Qu.: NA
Max. : 179.96 Max. : 83.041 Max. :62.238 Max. : NA
NA's :2075418
- I converted them to
sf
and then usedst_join
:
> sf_1 <- st_as_sf(df_1,
+ coords = c("lon", "lat"),
+ crs = 4326)
> sf_2 <- st_as_sf(df_2,
+ coords = c("lon", "lat"),
+ crs = 4326)
> df_spatialjoin <- st_join(sf_1, sf_2)
> summary(df_spatialjoin)
var1 var2 geometry
Min. : 0.000 Min. : NA POINT :2075418
1st Qu.: 9.398 1st Qu.: NA epsg:4326 : 0
Median :15.052 Median : NA +proj=long...: 0
Mean :16.223 Mean :NaN
3rd Qu.:21.898 3rd Qu.: NA
Max. :62.238 Max. : NA
NA's :2075418
- I tried to convert the
sf
back to tibble and then usedplyr
:
> df_join <- left_join(as_tibble(sf_1),
+ as_tibble(sf_2),
+ by = "geometry")
> summary(df_join)
var1 geometry var2
Min. : 0.000 POINT :2075418 Min. : NA
1st Qu.: 9.398 epsg:4326 : 0 1st Qu.: NA
Median :15.052 +proj=long...: 0 Median : NA
Mean :16.223 Mean :NaN
3rd Qu.:21.898 3rd Qu.: NA
Max. :62.238 Max. : NA
NA's :2075418
I also tried geo_join
from fuzzyjoin
as explained in this thread but it also didn't work.
I had this problem in the past, but I would solve it using cbind()
instead of any join()
function. Unfortunately, here cbind()
won't work as the two dataframes originate from different NetCDF files and have thus different row numbers.