0

Geohashes form a grid, much like a matrix.

If I have a table with a geohash7 and a value, how could I calculate say the average of all the adjacentcells on BigQuery?

Sample source table

geohash7 value
r1q6fh3 7
r1q6fh6 2
r1q6fh7 5
r1q6fh1 7
r1q6fh4 2
r1q6fh5 6
r1q6f5c 4
r1q6f5f 2
r1q6f5g 7

Sample output table

geohash7 value
r1q6fh4 4.6
... ...

Ideally in a scalable way as I want to run this over a large landmass.

TrewTzu
  • 1,110
  • 2
  • 11
  • 27
  • Could you clarify more on logic behind finding average of all the adjacent cells? Looking at your expected output, it's the average from r1q6fh3 to r1q6fh4? – Prajna Rai T Apr 28 '23 at 12:11

1 Answers1

1

First you'll need a function that returns neighbors. It is probably easier to grab one of javascript geohash libraries and use it as UDF, rather than do bit fiddling in SQL.

Once you have a function that returns neighbors, say

GeoHashNeibors(center STRING) RETURNS ARRAY<STRING>

You can then join original table with the flattened list of neighbors and compute average. Something like

WITH with_neighbors AS (
  select geohash, value, neighbor
  from data d, unnest(GeoHashNeibors(d.geohash)) neighbor
),
joined AS (
  select t1.geohash, t1.value, t1.neighbor, t2.value as n_value
  from with_neighbors t1 left join data t2
  on t1.neighbor = t2.geohash
)
SELECT geohash, ANY_VALUE(value) value, AVG(n_value) AS avg_neighbor
FROM joined 
GROUP BY 1
Michael Entin
  • 7,189
  • 3
  • 21
  • 26