Edit: I tried the following nested loop but I keep getting errors:
install.packages("hutilscpp")
library(hutilscpp)
# Initialize an empty list to store the matching results
matching_results <- list()
# Nested for loop to match each point in dataset with the closest point in dataset2
for (i in 1:nrow(dataset)) {
  min_distance <- Inf
  matching_row <- NULL
  for (j in 1:nrow(dataset2)) {
    distance <- match_nrst_haversine(dataset[i, c(lat lon)], dataset2[j, c(lat2 lon2)])
    # Update the minimum distance and matching row if a closer point is found
    if (distance < min_distance) {
      min_distance <- distance
      matching_row <- dataset2[j, ]
    }
  }
  matching_results[[i]] <- cbind(dataset[i, ], matching_row)
}
I get the following error:
Error: unexpected symbol in:
" for (j in 1:nrow(dataset2)) {
distance <- match_nrst_haversine(dataset[i, c(lat lon"
I tried different syntax but nothing has worked. Thanks again.
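For reference, the parser seems to stop at `c(lat lon)`: the elements of `c()` need a comma between them, and column names used for subsetting have to be quoted strings. A minimal sketch of the subsetting syntax, using a hypothetical toy data frame in place of my real `dataset` (this only shows the column selection, not the distance call):

```r
# Hypothetical toy data standing in for `dataset`; values are made up for illustration.
dataset <- data.frame(
  lat  = c(5.535456, 5.600000),
  lon  = c(7.531536, 7.600000),
  hhid = c(10001, 10002)
)

# c(lat lon) cannot be parsed. Separate the elements with a comma
# and quote the column names to select columns by name.
coords <- dataset[1, c("lat", "lon")]
print(coords)
```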
Original question:
I have two datasets: one is a household survey with geolocations and the other is a gridded climate dataset. I took a screenshot to illustrate: the points are the households, which fall within the climate data grid.
As you can see, the latitudes and longitudes in the two datasets do not match exactly. How can I merge them in R, assigning each household the climate values of its nearest grid point, while keeping the original size of the household dataset (so with duplicates)?
My household dataset looks like this:
lat lon year hhid indiv
5.535456 7.531536 2010 10001 4
5.535456 7.531536 2010 10001 5
5.535456 7.531536 2010 10001 2
5.535456 7.531536 2010 10001 1
5.535456 7.531536 2010 10001 6
5.535456 7.531536 2010 10001 7
5.535456 7.531536 2010 10001 3
And here is the climate data:
| lat | lon | SPEI |
| --- | --- | --- |
| 4.25 | 13.25 | 1.14703 |
| 4.75 | 13.25 | 0.961421 |
The final dataset would look like this:
lat lon year hhid indiv SPEI
5.535456 7.531536 2010 10001 4 1.14703
5.535456 7.531536 2010 10001 5 1.14703
5.535456 7.531536 2010 10001 2 1.14703
5.535456 7.531536 2010 10001 1 1.14703
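
To make the goal concrete, here is a rough sketch of the kind of nearest-grid-point merge I have in mind, written in base R only. The names `households`, `climate`, and `haversine_km` are placeholders I made up for this example, and the data are just the sample rows shown above, so this is an illustration under those assumptions rather than a tested solution for the full datasets.

```r
# Sample rows from the question (placeholder object names).
households <- data.frame(
  lat   = rep(5.535456, 4),
  lon   = rep(7.531536, 4),
  year  = 2010,
  hhid  = 10001,
  indiv = c(4, 5, 2, 1)
)
climate <- data.frame(
  lat  = c(4.25, 4.75),
  lon  = c(13.25, 13.25),
  SPEI = c(1.14703, 0.961421)
)

# Great-circle distance in km via the haversine formula.
# Vectorised over lat2/lon2; assumes a spherical Earth of radius 6371 km.
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# For every household row, find the index of the closest climate grid point.
nearest_idx <- vapply(seq_len(nrow(households)), function(i) {
  which.min(haversine_km(households$lat[i], households$lon[i],
                         climate$lat, climate$lon))
}, integer(1))

# Attach the SPEI of the nearest grid point; the household rows are all kept.
merged <- cbind(households, SPEI = climate$SPEI[nearest_idx])
print(merged)
```

With only the two sample climate rows above, the nearer grid point is (4.75, 13.25), so every household row gets SPEI 0.961421 in this toy run; the SPEI values in my desired output are only illustrative. This loops over households one at a time, which may be slow for a large survey.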