How to make moving average using geopandas nearest neighbors?

Question

I have a GeoDataFrame ("GDF") with a "values" column and a "geometry" column (the geometries are actual geographical regions), so each row represents one region.

The "values" column is zero in many rows, and a large number in some rows.

I need to compute a "moving average" (rolling average) over the nearest neighbors within a certain "max_distance" (we can assume that the GDF has a locally projected CRS, so max_distance has real meaning). The averaged_values would then be neither zero nor very large in most regions, but something in between.

One way to do it would be something like this (pseudocode):

for region in GDF:
    averaged_values = sjoin_nearest(GDF, GDF, max_distance=1000).values.mean()

But really I don't know how to proceed.

The expected output would be a geodataframe with 3 columns: "values", "averaged_values", and "geometry".

Any ideas?

Please add more information such as the geodataframe and some code showing what you have attempted. — DPM, May 31 '22 at 18:03

score 3 · Accepted Answer · answered May 31 '22 at 18:38

What you are trying to do is also called a spatial lag. The best way is to create a spatial weights matrix based on a set distance and then compute the lag, both using the libpysal library, which is part of the geopandas ecosystem.

import libpysal

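# create weights: regions within the threshold distance of each other are neighbours
W = libpysal.weights.DistanceBand.from_dataframe(gdf, threshold=1000)

# row-normalise weights so each row sums to 1 (the lag then becomes an average)
W.transform = "r"

# create lag: the weighted average of the neighbours' values
gdf["averaged_values"] = libpysal.weights.lag_spatial(W, gdf["values"])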
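To make it clearer what a row-normalised distance-band lag computes, here is a minimal library-free sketch. It is not how libpysal implements it: the function name `distance_band_average`, the `include_self` flag and the centroid coordinates are all made up for illustration, and distances are taken between points rather than between polygon edges. As far as I understand, a distance-band weights matrix does not count a region as its own neighbour, so the lag averages only the surrounding regions. If you want the region itself included in the moving average, that has to be handled separately, which the `include_self` flag mimics here.

```python
import math


def distance_band_average(points, values, threshold, include_self=False):
    """Average `values` over all points within `threshold` of each point."""
    averaged = []
    for i, p in enumerate(points):
        neighbours = [
            values[j]
            for j, q in enumerate(points)
            if (j != i or include_self) and math.dist(p, q) <= threshold
        ]
        # Assumption of this sketch: a point with no neighbours in range gets 0.0.
        averaged.append(sum(neighbours) / len(neighbours) if neighbours else 0.0)
    return averaged


# Hypothetical region centroids (metres, projected CRS) and their values
points = [(0, 0), (500, 0), (900, 0), (5000, 0)]
values = [0.0, 100.0, 0.0, 40.0]

# Neighbours only, analogous to the row-normalised lag
print(distance_band_average(points, values, threshold=1000))

# Region itself included, closer to a classic moving average
print(distance_band_average(points, values, threshold=1000, include_self=True))
```

Note that the isolated fourth point has no neighbours within 1000 m, so its neighbours-only average falls back to 0.0 in this sketch. It is worth checking how regions without neighbours are treated in your real data.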
