I am working with the R programming language.
Suppose I have the following data:
set.seed(123)
n <- 100000
df <- data.frame(
  longitude = runif(n, -180, 180),
  latitude  = runif(n, -90, 90),
  color     = sample(c("red", "blue", "green", "orange", "purple",
                       "yellow", "pink", "black", "white", "grey"),
                     n, replace = TRUE)
)
I am trying to run the following code, which finds the convex hull of all points that share the same color class. To do this, I am using the chull() function (from the grDevices package that ships with R) together with lapply(), and then the sf package to convert and write the result:
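library(sf)  # provides st_as_sf() and st_write()

# Compute the color classes once and reuse them below
color_levels <- unique(df$color)

# For each color, get the indices (within that color's subset) of the hull vertices
hulls <- lapply(color_levels, function(color) {
  chull(df[df$color == color, c("longitude", "latitude")])
})

# Turn each color's hull vertices into an sf point layer
hull_sfs <- lapply(seq_along(hulls), function(i) {
  st_as_sf(df[df$color == color_levels[i], ][hulls[[i]], ],
           coords = c("longitude", "latitude"), crs = 4326)
})
hull_sf_combined <- do.call(rbind, hull_sfs)
st_write(hull_sf_combined, "hulls.shp")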
My Question: I would like to explore ways to make this code run faster. For instance, can packages such as parallel, doSNOW, or foreach, or functions such as mclapply(), be used to speed it up?
But I am not sure where to begin - can someone please show me how to do this?
Thanks!
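
One possible starting point, offered as a minimal sketch rather than a definitive answer: split the data by color once, then hand each group to a worker with parLapply() from the parallel package. The helper name hull_points and the choice of 2 workers are assumptions made for illustration, and the sf conversion and st_write() steps are left out so the example stays self-contained.

```r
library(parallel)

set.seed(123)
n <- 100000
df <- data.frame(
  longitude = runif(n, -180, 180),
  latitude  = runif(n, -90, 90),
  color     = sample(c("red", "blue", "green", "orange", "purple",
                       "yellow", "pink", "black", "white", "grey"),
                     n, replace = TRUE)
)

# Split once, so the data frame is not re-filtered for every color.
groups <- split(df[, c("longitude", "latitude")], df$color)

# Hypothetical helper: return the rows of one group that lie on its convex hull.
# grDevices::chull is called with an explicit namespace so it resolves on workers.
hull_points <- function(xy) {
  xy[grDevices::chull(xy$longitude, xy$latitude), ]
}

# Serial baseline.
hull_list_serial <- lapply(groups, hull_points)

# Parallel version on a socket cluster (works on Windows, macOS, and Linux).
# The worker count of 2 is an arbitrary choice for this sketch.
cl <- makeCluster(2)
hull_list <- parLapply(cl, groups, hull_points)
stopCluster(cl)
```

On Unix-like systems, mclapply(groups, hull_points, mc.cores = 2) is a fork-based alternative to the cluster calls; it does not run in parallel on Windows. With only ten groups, the cost of starting workers and copying data may outweigh any gain, so it is worth timing both versions (for example with system.time()) before committing to the parallel one.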