
I am working with the GeoTile grid aggregation in Elasticsearch. After grouping the locations into buckets, I want to retrieve the documents in a given bucket with pagination (using search_after). Has anyone done this, and how can I achieve it? Thank you!

Here is the GeoTile aggregation I have used:

GET /index-name/_doc/_search
{
  "aggs": {
     "result": {
        "geotile_grid": {
          "field": "location",
          "precision": 12
        }
     }
   }
}

And the result looks like:

{
  "took" : 3,
  "hits" : {
    "total" : {
      "value" : 39,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ... ]
  },
  "aggregations" : {
    "result" : {
      "buckets" : [
        {
          "key" : "12/3519/1597",
          "doc_count" : 36
        },
        {
          "key" : "12/3520/1597",
          "doc_count" : 3
        }
      ]
    }
  }
}

For example, how can I get the 36 documents in the "12/3519/1597" bucket? Thank you!

I have already tried converting the GeoTile key "12/3519/1597" into a bounding box, both by following this article and by using GeoTileUtils from the Elasticsearch source code.

However, when I query all documents inside the bounding box derived from the key "12/3519/1597", the results span 2 buckets. The extra documents belong to the x=3520 bucket: they have lon=129.375, which lies exactly on the right edge of the box.
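
The conversion and the edge effect described above can be reproduced outside Elasticsearch. The following is a minimal Python sketch, assuming the standard Web Mercator (slippy map) tile formulas. `tile_to_bbox` and `point_to_tile` are hypothetical helper names, not the GeoTileUtils API, and Elasticsearch's internal encoding of geo_point values may behave slightly differently very close to tile borders.

```python
import math


def tile_to_bbox(key: str) -> dict:
    """Convert a "zoom/x/y" geotile key into the tile's bounds in degrees."""
    zoom, x, y = (int(part) for part in key.split("/"))
    n = 2 ** zoom  # number of tiles along each axis at this zoom level

    def lon_of(col: int) -> float:
        return col / n * 360.0 - 180.0

    def lat_of(row: int) -> float:
        return math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * row / n))))

    return {
        "top": lat_of(y),         # rows grow southwards, so row y is the top edge
        "left": lon_of(x),
        "bottom": lat_of(y + 1),
        "right": lon_of(x + 1),
    }


def point_to_tile(lat: float, lon: float, zoom: int) -> str:
    """Return the "zoom/x/y" key of the tile that contains a point."""
    n = 2 ** zoom
    x = math.floor((lon + 180.0) / 360.0 * n)
    y = math.floor((1 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2 * n)
    # Clamp so that lon = 180 or extreme latitudes still map to a valid tile.
    x = min(n - 1, max(0, x))
    y = min(n - 1, max(0, y))
    return f"{zoom}/{x}/{y}"


box = tile_to_bbox("12/3519/1597")
print(box)
# A point whose longitude equals the box's right edge is assigned to the next tile.
print(point_to_tile(36.78, box["right"], 12))
```

Because the tile index is obtained with a floor, each tile owns its left and top edges but not its right and bottom edges, while a bounding-box query includes all four. That mismatch would explain the two buckets.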

2 Answers


You could nest a top_hits aggregation under the geotile_grid aggregation to get documents per geotile bucket.
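
As a rough illustration of that nesting, here is a small Python sketch that only builds the request body. The function name `geotile_with_top_hits` and the sub-aggregation name `docs` are made up for this example; the field name and precision are taken from the question.

```python
import json


def geotile_with_top_hits(field: str, precision: int, docs_per_tile: int) -> dict:
    """Build a search body with a top_hits sub-aggregation under geotile_grid."""
    return {
        "size": 0,  # skip the top-level hits, only the buckets are needed
        "aggs": {
            "result": {
                "geotile_grid": {"field": field, "precision": precision},
                "aggs": {
                    # "docs" is an arbitrary name for the nested aggregation
                    "docs": {"top_hits": {"size": docs_per_tile}}
                },
            }
        },
    }


body = geotile_with_top_hits("location", 12, 10)
print(json.dumps(body, indent=2))
```

Note that top_hits only accepts `from` and `size` and its size is capped by an index setting, so it is better suited to previewing a few documents per tile than to paging through a large bucket.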

You could also use the geo_grid query to filter documents by tile.

GET kibana_sample_data_logs/_search
{
  "size": 1,
  "query": {
    "bool": {
      "must": [],
      "filter": [
        {
          "geo_grid": {
            "geo.coordinates": {
              "geotile": "5/9/12"
            }
          }
        }
      ],
      "should": [],
      "must_not": []
    }
  }
}

Response

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 675,
      "relation": "eq"
    },
    "max_score": 0,
    "hits": [
      {
        "_index": ".ds-kibana_sample_data_logs-2023.07.12-000001",
        "_id": "NM-ISokB7DQkCI7yJZQ-",
        "_score": 0,
        "_source": {
          "agent": "Mozilla/5.0 (X11; Linux x86_64; rv:6.0a1) Gecko/20110421 Firefox/6.0a1",
          "bytes": 8973,
          "clientip": "213.50.214.248",
          "extension": "rpm",
          "geo": {
            "srcdest": "US:VN",
            "src": "US",
            "dest": "VN",
            "coordinates": {
              "lat": 40.19349528,
              "lon": -76.76340361
            }
          },
          "host": "artifacts.elastic.co",
          "index": "kibana_sample_data_logs",
          "ip": "213.50.214.248",
          "machine": {
            "ram": 12884901888,
            "os": "win 8"
          },
          "memory": null,
          "message": "213.50.214.248 - - [2018-09-10T11:39:18.812Z] \"GET /beats/metricbeat/metricbeat-6.3.2-i686.rpm HTTP/1.1\" 200 8973 \"-\" \"Mozilla/5.0 (X11; Linux x86_64; rv:6.0a1) Gecko/20110421 Firefox/6.0a1\"",
          "phpmemory": null,
          "referer": "http://www.elastic-elastic-elastic.com/success/daniel-tani",
          "request": "/beats/metricbeat/metricbeat-6.3.2-i686.rpm",
          "response": 200,
          "tags": [
            "success",
            "info"
          ],
          "@timestamp": "2023-08-21T11:39:18.812Z",
          "url": "https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-6.3.2-i686.rpm",
          "utc_time": "2023-08-21T11:39:18.812Z",
          "event": {
            "dataset": "sample_web_logs"
          },
          "bytes_gauge": 8973,
          "bytes_counter": 65621715
        }
      }
    ]
  }
}
Nathan Reese

For newer ES versions (since 8.8), you can use @Nathan Reese's solution.

However, on older versions (mine is 7.10), I used GeoTileUtils from the Elasticsearch source code to convert the geotile key (z/x/y) into a bounding box.

But you must be aware of the edges of the bounding box. The geotile aggregation does not assign points that lie on the right or bottom edge to the tile, whereas geo_bounding_box matches them. To exclude the points on those edges, I used a Painless script as follows:

GET /index-name/_doc/_search
{
  "size": 3,
  "query": {
    "bool": {
      "filter": [
        { 
          "geo_bounding_box": {
            "location": {
              "top_left": {
                "lat": 36.80928470205938,
                "lon": 129.287109375
              },
              "bottom_right": {
                "lat": 36.73888412439431,
                "lon": 129.37500
              }
            }
          }
        },
        {
          "script": {
            "script": {
              "source": "doc['location'].lon < params.maxLon && doc['location'].lat > params.minLat",
              "lang": "painless",
              "params": {
                "minLat": 36.73888412439431,
                "maxLon": 129.37500
              }
            }
          }
        }
      ]
    }
  }
}
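
To tie this back to the pagination asked about in the question, either filter (geo_grid, or bounding box plus script) can be combined with search_after. The sketch below only builds the request bodies in Python. `tile_page_body` is a hypothetical helper, and the sort fields `timestamp` and `id` are assumptions: search_after needs a deterministic sort, so substitute fields that exist in your index, with a unique one as the tiebreaker.

```python
import json
from typing import Optional


def tile_page_body(filters: list, page_size: int,
                   search_after: Optional[list] = None) -> dict:
    """Build one page of a tile-restricted search using search_after."""
    body = {
        "size": page_size,
        "query": {"bool": {"filter": filters}},
        # Assumed fields: "timestamp" orders the results, "id" breaks ties.
        "sort": [{"timestamp": "asc"}, {"id": "asc"}],
    }
    if search_after is not None:
        # Sort values copied from the last hit of the previous page.
        body["search_after"] = search_after
    return body


# Tile filter for ES versions that support the geo_grid query.
tile_filter = [{"geo_grid": {"location": {"geotile": "12/3519/1597"}}}]

first_page = tile_page_body(tile_filter, 10)
# Pretend the last hit of the first page carried these sort values.
last_sort_values = [1692617958812, "doc-10"]
next_page = tile_page_body(tile_filter, 10, search_after=last_sort_values)
print(json.dumps(next_page, indent=2))
```

On 7.10, the `filters` list would instead hold the geo_bounding_box and script clauses from the query above; the sort and search_after parts stay the same.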