After converting my GeoDataFrame with 120,000+ rows of point geometries into one giant ee.FeatureCollection and trying to map() over it to extract topographic attributes, I found that Google Earth Engine aborts queries once they accumulate more than 5,000 elements. I've hit two related errors from different functions:
Request payload size exceeds the limit: 10485760 bytes.

Collection query aborted after accumulating over 5000 elements
My workaround was to break the GeoDataFrame into chunks of 5,000 rows, convert each chunk to an ee.FeatureCollection with geemap.geopandas_to_ee(), and collect the collections in a list to iterate over. On each iteration I assign the results to a temporary GeoDataFrame and concatenate it with another GeoDataFrame that accumulates the results from previous iterations.
import ee
import geemap
import geopandas as gpd
import pandas as pd

featCol_list = [
    geemap.geopandas_to_ee(gdf[0:5000]),
    geemap.geopandas_to_ee(gdf[5000:10000]),
    geemap.geopandas_to_ee(gdf[10000:15000]),
    ...
    geemap.geopandas_to_ee(gdf[115000:120000]),
    geemap.geopandas_to_ee(gdf[120000:]),
]
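(The hard-coded slices could be generated programmatically instead of typed out. A minimal sketch of a chunking helper; the `chunk` name and the 5,000-row default are mine, not from any library:)

```python
import pandas as pd

def chunk(df, size=5000):
    """Yield successive row slices of at most `size` rows."""
    for start in range(0, len(df), size):
        yield df.iloc[start:start + size]
```

With this, the list above becomes `featCol_list = [geemap.geopandas_to_ee(c) for c in chunk(gdf)]`, and it scales to any row count without editing slice bounds by hand.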
def get_topo(feat):
    # Sample elevation, slope, and aspect at the feature's location
    dem = ee.Image('USGS/SRTMGL1_003')
    elevation_img = dem.select('elevation')
    slope_img = ee.Terrain.slope(dem).select('slope')
    aspect_img = ee.Terrain.aspect(dem).select('aspect')
    elevation = elevation_img.sample(feat.geometry(), scale=10).first().get('elevation')
    slope = slope_img.sample(feat.geometry(), scale=10).first().get('slope')
    aspect = aspect_img.sample(feat.geometry(), scale=10).first().get('aspect')
    return feat.set({'elevation': elevation,
                     'slope': slope,
                     'aspect': aspect})
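(For what it's worth, the three per-band sample() calls could in principle be collapsed into a single server-side sampleRegions() call over a combined terrain image, avoiding the mapped function entirely. A sketch under my own assumptions, with `featCol` standing for one of the chunked collections; SRTM is 30 m, so scale=30 matches the source resolution:)

```python
import ee

dem = ee.Image('USGS/SRTMGL1_003')
# ee.Terrain.products() derives slope, aspect, and hillshade bands from the DEM
topo = ee.Terrain.products(dem).select(['elevation', 'slope', 'aspect'])

# One server-side call samples all three bands for every feature in the collection
sampled = topo.sampleRegions(collection=featCol, scale=30, geometries=True)
```

Whether this dodges the 5,000-element limit depends on how the results are pulled back to the client, but it at least cuts the per-feature work on the server.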
gdf_all = gpd.GeoDataFrame()
for featCol in featCol_list:
    topoCol = featCol.map(get_topo)
    gdf = geemap.ee_to_geopandas(topoCol)
    gdf = gdf.set_crs('EPSG:4326')
    # Concatenate new results onto the accumulating GeoDataFrame
    gdf_all = gpd.GeoDataFrame(pd.concat([gdf.copy(), gdf_all.copy()], ignore_index=True),
                               crs='EPSG:4326')
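(One small cleanup regardless of the Earth Engine question: concatenating inside the loop copies the accumulated frame on every pass. Collecting the pieces in a list and concatenating once at the end does the same job in one copy. A sketch with plain pandas frames standing in for the per-chunk results:)

```python
import pandas as pd

# Stand-ins for per-chunk results (in the real loop these would come
# from geemap.ee_to_geopandas()); collect each one in a list...
pieces = [pd.DataFrame({'elevation': [10, 20]}),
          pd.DataFrame({'elevation': [30]})]

results = []
for piece in pieces:
    results.append(piece)

# ...and concatenate once at the end, instead of on every iteration
combined = pd.concat(results, ignore_index=True)
```

For the real data, the final result would be wrapped as `gpd.GeoDataFrame(combined, crs='EPSG:4326')` as in the loop above.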
My brute-force method got me the results in what I thought was a reasonable amount of time (~30 min), but it's ugly and tedious, and I definitely don't want to repeat it for a GeoDataFrame with 2 million rows. Is there a more elegant way to map() over a FeatureCollection with more than 5,000 features, either server-side or client-side?