
I need to get the last registered record for a serial number (or several serial numbers) within a restricted area, and I also need to cluster these records according to zoom precision on a map. I'm using Elasticsearch, and I've mapped my document in this shape:

{
  "mappings": {
    "AssetStatus": {
      "properties": {
        "location": {
          "type": "geo_point"
        },
        "createdate": {
          "type": "date"
        },
        "serialnumber": {
          "type": "text",
          "fielddata": "true"
        }
      }
    }
  }
}
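
An aside on this mapping: enabling `fielddata` on a `text` field works, but it is memory-hungry and the analyzer may split a serial number into several terms. Since `serialnumber` is only ever matched exactly and aggregated on, a `keyword` field is the usual choice. A sketch of that alternative mapping:

```json
{
  "mappings": {
    "AssetStatus": {
      "properties": {
        "location":     { "type": "geo_point" },
        "createdate":   { "type": "date" },
        "serialnumber": { "type": "keyword" }
      }
    }
  }
}
```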

So I've written this query:

{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "serialnumber": "sn2"
          }
        },
        {
          "geo_bounding_box": {
            "location": {
              "top_left": "52.4, 4.9",
              "bottom_right": "52.3, 5.0"
            }
          }
        }
      ],
      "must_not": [],
      "should": []
    }
  },
  "from": 0,
  "size": 0,
  "aggregations": {
    "SerialNumberGroups": {
      "terms": {
        "field": "serialnumber"
      },
      "aggs": {
        "tops": {
          "top_hits": {
            "sort": [
              {
                "createdate": {
                  "order": "desc"
                }
              }
            ],
            "size": 1
          },
          "aggs": {
            "geohash_grid": {
              "field": "location",
              "precision": 12
            }
          }
        }
      }
    }
  }
}

In this query, I first restrict the documents by their serial numbers and their location, then I group the results by serial number and order by createdate to get the last registered record of each serial number in the area. The problem is in the last part of the query, where I try to cluster the results with geohash_grid. I get this error:

"error": {
"root_cause": [
{
"type": "parsing_exception",
"reason": "Expected [START_OBJECT] under [field], but got a [VALUE_STRING] in [geohash_grid]",
"line": 1,
"col": 374
}
],
"type": "parsing_exception",
"reason": "Expected [START_OBJECT] under [field], but got a [VALUE_STRING] in [geohash_grid]",
"line": 1,
"col": 374
},
"status": 400
  • Can you explain what happens in the geohash_grid aggregation? What do you get, and what do you expect instead? – Val Jan 21 '19 at 10:20
  • I want to cluster the query results with [geohash_grid](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-geohashgrid-aggregation.html). We are working on a fleet-management system, and to show points on a map we have to cluster them. – Ali Akbar Jahanbin Jan 21 '19 at 10:24
  • And what do you get? – Val Jan 21 '19 at 10:58
  • I've updated the question with the error. – Ali Akbar Jahanbin Jan 21 '19 at 11:05
  • I think the last aggregation treats the location property in the result of the previous aggregation as a string instead of the geo_point data type. – Ali Akbar Jahanbin Jan 21 '19 at 11:10
  • No, the problem is different: top_hits is a metric aggregation and thus cannot contain sub-aggregations, so you cannot nest a geohash_grid aggregation inside a top_hits one. Another issue is that your geohash_grid aggregation has no name. – Val Jan 21 '19 at 11:43
  • Thank you for your consideration; do you have a solution for my problem? – Ali Akbar Jahanbin Jan 21 '19 at 11:58
  • If I understand correctly, only the most recent record per asset (each asset having a unique serial number) should be included in the query. Each time you add a new document for a given asset, you could flag the previously most recent one, so that you can simply exclude it from your query. That would work pretty much like you expect. – Val Jan 21 '19 at 12:16
  • I totally agree with you; we plan to register the last asset record in another DB (Ignite) as a temporary solution. Thanks a lot @Val – Ali Akbar Jahanbin Jan 21 '19 at 12:28
  • How many records do you have, approximately? You could still flag them in the current index in order to run your query. – Val Jan 21 '19 at 12:38
  • Based on our customers (one car-manufacturing company), we expect more than 300,000 devices (serial numbers) in our first phase, and each device sends us a record roughly every second. It's a really huge number, I think. This query takes the last registered location of the devices; I think I'll have to cluster them in the backend code (C#). – Ali Akbar Jahanbin Jan 21 '19 at 12:59
  • Maybe you should use the serial number as the id of your document, so you'd only ever have one single document for each device and that one is the latest. Then your query would work as it is (but without the top_hits aggregation) – Val Jan 21 '19 at 13:01
  • The Elasticsearch engine is based on the Lucene library, and as you can find in its documentation, Lucene is good at create and read but has no real delete or update. For a large-scale system like our project that could be a bottleneck: Elasticsearch does not really update or delete, the stale data just sits there until segments are merged, and until then it stays in memory. – Ali Akbar Jahanbin Jan 21 '19 at 13:24
  • ES updates/deletes do indeed flag the document for deletion, and the actual cleanup happens every once in a while behind the scenes. I was thinking of having another index with just the latest document for each device. You can build such an index from the main index very easily with the reindex API, and then run your query very easily on that new index. – Val Jan 21 '19 at 13:27
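
The latest-document-per-device idea from the comments can be sketched with the reindex API. Everything here is an assumption for illustration: the index names `assetstatus` and `assetstatus_latest`, and the trick of sorting by `createdate` ascending so that, with the serial number used as `_id`, the newest record for each device is written last and wins:

```json
POST _reindex
{
  "source": {
    "index": "assetstatus",
    "sort": { "createdate": "asc" }
  },
  "dest": { "index": "assetstatus_latest" },
  "script": {
    "source": "ctx._id = ctx._source.serialnumber"
  }
}
```

With one document per device, the bounding-box filter and the geohash_grid clustering can then run directly, with no terms/top_hits step at all; the precision (1–12) would be chosen from the map's zoom level:

```json
GET assetstatus_latest/_search
{
  "size": 0,
  "query": {
    "geo_bounding_box": {
      "location": {
        "top_left": "52.4, 4.9",
        "bottom_right": "52.3, 5.0"
      }
    }
  },
  "aggs": {
    "clusters": {
      "geohash_grid": { "field": "location", "precision": 7 }
    }
  }
}
```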

0 Answers