Is there any way to paginate over nested objects in Elasticsearch? In the example data below I search offices (nested objects) and need to load only the nested matches for the requested page. For example, for page 3 I need to load 4 items from offset 8 (office.3E, office.4A, office.4B, office.4C) plus the total count of matching nested items.

|-company.1
|   |-name
|   |-address
|   |-offices       __ __
|       |-office.1A   |
|-company.2           |
|   |-name            |
|   |-address         | page 1, size=4, offset=0
|   |-offices         |
|       |-office.2A   |
|       |-office.2B   |
|       |-office.2C __|__
|-company.3           |
|   |-name            |
|   |-address         |
|   |-offices         | page 2, size=4, offset=4
|       |-office.3A   |
|       |-office.3B   |
|       |-office.3C   |
|       |-office.3D __|__
|       |-office.3E   |
|-company.4           |
|   |-name            |
|   |-address         | page 3, size=4, offset=8
|   |-offices         |
|       |-office.4A   |
|       |-office.4B   |
|       |-office.4C __|__
|-company.5
...
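
To make the diagram concrete, here is a minimal Python sketch of the paging arithmetic it describes, assuming 1-based page numbers. The names `COMPANIES` and `page_of_offices` are illustrative only, and company.5 is left out because its offices are not listed.

```python
# Offices per company, copied from the diagram above.
COMPANIES = {
    "company.1": ["office.1A"],
    "company.2": ["office.2A", "office.2B", "office.2C"],
    "company.3": ["office.3A", "office.3B", "office.3C", "office.3D", "office.3E"],
    "company.4": ["office.4A", "office.4B", "office.4C"],
}


def page_of_offices(companies, page, size):
    """Return (offset, offices on that page, total offices) for a 1-based page."""
    # Flatten the nested lists in document order, then slice one page out.
    flat = [office for offices in companies.values() for office in offices]
    offset = (page - 1) * size
    return offset, flat[offset:offset + size], len(flat)


offset, items, total = page_of_offices(COMPANIES, page=3, size=4)
print(offset, items, total)
# prints: 8 ['office.3E', 'office.4A', 'office.4B', 'office.4C'] 12
```

This is the result the search should return for page 3, without flattening everything on the client.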

Mapping:

{
    "mappings": {
        "properties": {
            "name":    { "type": "keyword" },
            "address": { "type": "keyword" },
            "offices": {
                "type": "nested",
                "properties": {
                    "hash":   { "type": "keyword" },
                    "street": { "type": "keyword" },
                    "city":   { "type": "keyword" },
                    "zip":    { "type": "keyword" }
                }
            }
        }
    }
}

The problem is that loading page 2 (offset=4) should start at root document company.3 in the example above. But Elasticsearch does not seem to support an offset across nested objects: from/size apply either to parent documents or, via inner_hits, to the nested objects within a single parent.
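
A sketch of where the two kinds of offset sit in a request body, written as a Python dict builder. Top-level `from`/`size` page over parent companies, while `from`/`size` inside `inner_hits` page over the matching offices of each individual parent, so neither counts across parents. The function name and its parameters are hypothetical, and the field names assume the mapping above.

```python
def build_search_body(city_prefix, parent_from=0, parent_size=10,
                      inner_from=0, inner_size=20):
    """Build a nested search body; all paging parameters are illustrative."""
    nested_query = {"wildcard": {"offices.city": f"{city_prefix}*"}}
    return {
        # Top-level paging: skips and limits parent (company) documents.
        "from": parent_from,
        "size": parent_size,
        "query": {
            "nested": {
                "path": "offices",
                "query": nested_query,
                # inner_hits paging: restarts from zero inside every parent.
                "inner_hits": {"from": inner_from, "size": inner_size},
            }
        },
    }


body = build_search_body("Bratislava", parent_from=4)
```

With `parent_from=4` the request skips four companies, not four offices, which is exactly the mismatch described above.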

I have tried many queries and aggregations, but currently the only working solution is to load all matching nested objects along with their root/parent documents and loop over them, counting nested objects from 0 until I reach the requested nested offset. This is very inefficient, and even when allowing a maximum of 500 pages (20 items per page, which gives the default ES limit of 10,000 results) I may still hit memory limits. The database has up to 2 million root documents and 600k nested office records.
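
The client-side loop described here can be sketched roughly as follows. It assumes the usual search response shape, where each parent hit carries its matches under `inner_hits.<name>.hits.hits`. The function names and the `fake_parent` helper are hypothetical and exist only for this illustration.

```python
def slice_nested_hits(parent_hits, offset, size, inner_name="offices"):
    """Collect `size` nested hits starting at a global nested `offset`."""
    page = []
    skipped = 0
    for parent in parent_hits:
        inner = parent["inner_hits"][inner_name]["hits"]["hits"]
        for office in inner:
            if skipped < offset:
                skipped += 1  # still before the requested page
                continue
            page.append((parent["_id"], office["_source"]))
            if len(page) == size:
                return page  # stop early once the page is full
    return page


def fake_parent(company_id, hashes):
    """Build a response-shaped parent hit, for demonstration only."""
    inner = [{"_source": {"hash": h}} for h in hashes]
    return {"_id": company_id,
            "inner_hits": {"offices": {"hits": {"hits": inner}}}}
```

The sketch only works if each parent's `inner_hits` contains all of its matching offices, so the `inner_hits` size must be at least as large as the longest office list. Every parent before the requested page still has to be fetched, which is the cost in question.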

I currently use the following query (example):

{
    "query": {
        "bool": {
            "must": [
                {
                    "nested": {
                        "path": "offices",
                        "inner_hits": { "size": 20 },
                        "query": {
                            "bool": {
                                "must": [
                                    { "wildcard": { "offices.city": "Bratislava*" } }
                                ]
                            }
                        }
                    }
                }
            ]
        }
    },
    "aggregations": {
        "cnt-total": {
            "nested": { "path": "offices" },
            "aggs": {
                "cnt-matching": {
                    "filter": {
                        "bool": {
                            "must": [
                                { "wildcard": { "offices.city": "Bratislava*" } }
                            ]
                        }
                    }
                }
            }
        }
    }
}

This gives me the correct total number of matching nested documents. But then I must loop over all loaded parent documents and calculate the nested offset in a script. Is there a more efficient way to do this?
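
For reference, a small Python sketch of reading that total from the response and turning it into a page count. It assumes the aggregation names `cnt-total` and `cnt-matching` from the query above and the standard nested/filter aggregation response shape. The function name `nested_total` is made up for this example.

```python
import math


def nested_total(response, page_size):
    """Return (matching nested docs, number of pages) from a search response."""
    # The filter aggregation sits inside the nested aggregation's result.
    matching = response["aggregations"]["cnt-total"]["cnt-matching"]["doc_count"]
    return matching, math.ceil(matching / page_size)


# Hand-written response fragment with made-up counts, for illustration only.
sample = {"aggregations": {"cnt-total": {"doc_count": 600000,
                                         "cnt-matching": {"doc_count": 45}}}}
print(nested_total(sample, page_size=20))  # prints: (45, 3)
```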

lubosdz
  • I think this is not possible. Look [here](https://discuss.elastic.co/t/filtering-sorting-and-paginate-nested-object-list/279078/2?u=rabbit_br) – rabbitbr Apr 16 '23 at 03:00
  • @rabbitbr I suspect that too... it's just confusing that ES provides so many ways to aggregate, scrolled views, etc., that I am not sure whether I missed something. Thank you for your comment. – lubosdz Apr 16 '23 at 14:30
