I am trying to generate a grid for a given (multi)polygon. By a grid I mean the collection of H3 indices that fall within the (multi)polygon boundary.
Here is the code I have implemented so far:
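    import logging
    import time

    import geopandas as gpd
    import h3pandas  # noqa: F401  (assumed source of the .h3 accessor used below)
    import pandas as pd


    def generate_grid(region_bounds: gpd.GeoDataFrame) -> pd.DataFrame:
        """
        Generates an H3 resolution 10 grid. It utilizes the h3.polyfill method.
        For more detail see https://geographicdata.science/book/data/h3_grid/build_sd_h3_grid.html
        Returns: a dataframe with the following columns:
        <index, h3_res_10>
        """
        logging.info("Start grid generation")
        start = time.time()
        resolution = 10
        # One row per input geometry; the cells end up in the "h3_polyfill" column
        grid = region_bounds.h3.polyfill(resolution)
        end = time.time()
        logging.info(f"grid generation took {end - start:.2f} sec")
        # Convert the polyfill result to a df. Only the first row's cells are used;
        # .iloc[0] avoids relying on the index containing the label 0.
        grid_df = pd.DataFrame.from_dict({"h3_res_10": grid.h3_polyfill.iloc[0]})
        logging.info(grid_df)
        return grid_df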
The problem occurs when the region to process is big, such as a country or a state. Is there a way to run h3.polyfill in parallel over multiple subregions, and if so, how can I efficiently split the region into subregions for that?
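To make the question concrete, here is a minimal sketch of one possible splitting scheme: cut the region's bounding box into non-overlapping rectangular tiles and fill each tile in a separate worker. The names `split_bounds`, `fill_tiles` and `fill_tile` are hypothetical. `fill_tile` stands in for a function that would clip the region to one tile and run polyfill on the clipped geometry. The sketch assumes polyfill decides containment by cell center, so non-overlapping tiles should not leave gaps, and the set union removes any duplicates along shared edges.

```python
from concurrent.futures import Executor
from typing import Callable, Iterable, List, Set, Tuple

Bounds = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


def split_bounds(bounds: Bounds, nx: int, ny: int) -> List[Bounds]:
    """Split a bounding box into nx * ny non-overlapping rectangular tiles."""
    minx, miny, maxx, maxy = bounds
    dx = (maxx - minx) / nx
    dy = (maxy - miny) / ny
    tiles = []
    for i in range(nx):
        for j in range(ny):
            # Clamp the last column/row to the outer bounds to avoid
            # floating-point gaps at the far edges.
            x1 = maxx if i == nx - 1 else minx + (i + 1) * dx
            y1 = maxy if j == ny - 1 else miny + (j + 1) * dy
            tiles.append((minx + i * dx, miny + j * dy, x1, y1))
    return tiles


def fill_tiles(
    tiles: Iterable[Bounds],
    fill_tile: Callable[[Bounds], Iterable[str]],  # hypothetical per-tile worker
    executor: Executor,
) -> Set[str]:
    """Run fill_tile on every tile via the executor and merge the cell ids."""
    cells: Set[str] = set()
    for tile_cells in executor.map(fill_tile, tiles):
        cells.update(tile_cells)  # set union drops duplicates on tile edges
    return cells
```

With a `concurrent.futures.ProcessPoolExecutor`, `fill_tile` would have to be a module-level function so it can be pickled, and the tiles would come from something like `split_bounds(tuple(region_bounds.total_bounds), 8, 8)`. Whether this actually speeds things up depends on the cost of clipping the region to each tile, which is not shown here.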