I have the following mapping:

{
    "dynamic": "strict",
    "properties": {
        "id": {
            "type": "string"
        },
        "title": {
            "type": "string"
        },
        "things": {
            "type": "nested",
            "properties": {
                "id": {
                    "type": "long"
                },
                "something": {
                    "type": "long"
                }
            }
        }
    }
}

I insert docs as follows (Python script):

from elasticsearch import Elasticsearch

# Assumes a local cluster; index and type names match the query output shown further down.
es = Elasticsearch()
index_name = "test"
doc_type = "article"

body = {"id": 1, "title": "one", "things": [{"id": 1000, "something": 33}, {"id": 1001, "something": 34}]}
es.create(index_name, doc_type=doc_type, body=body, id=1)

body = {"id": 2, "title": "two", "things": [{"id": 1000, "something": 43}, {"id": 1001, "something": 44}]}
es.create(index_name, doc_type=doc_type, body=body, id=2)

body = {"id": 3, "title": "three", "things": [{"id": 1000, "something": 53}, {"id": 1001, "something": 54}]}
es.create(index_name, doc_type=doc_type, body=body, id=3)

I run the following aggregation query:

{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "things": {
      "aggs": {
        "num_articles": {
          "terms": {
            "field": "things.id",
            "size": 0
          },
          "aggs": {
            "articles": {
              "top_hits": {
                "size": 50
              }
            }
          }
        }
      },
      "nested": {
        "path": "things"
      }
    }
  },
  "size": 0
}

(So I want a count of the number of times each "thing" appears and, against each thing, a list of the articles in which it appears.)
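To make the target concrete, here is the result I'm after, computed in plain Python over the three sample documents. This only illustrates the output shape and is not an Elasticsearch query; the helper name `articles_by_thing` is made up for this sketch.

```python
from collections import defaultdict

# The three sample documents inserted above.
docs = [
    {"id": 1, "title": "one", "things": [{"id": 1000, "something": 33}, {"id": 1001, "something": 34}]},
    {"id": 2, "title": "two", "things": [{"id": 1000, "something": 43}, {"id": 1001, "something": 44}]},
    {"id": 3, "title": "three", "things": [{"id": 1000, "something": 53}, {"id": 1001, "something": 54}]},
]


def articles_by_thing(docs):
    """Group top-level article fields (id, title) under each nested thing id."""
    grouped = defaultdict(list)
    for doc in docs:
        for thing in doc["things"]:
            grouped[thing["id"]].append({"id": doc["id"], "title": doc["title"]})
    # One entry per thing: how many articles contain it, and which ones.
    return {
        thing_id: {"doc_count": len(articles), "articles": articles}
        for thing_id, articles in grouped.items()
    }


result = articles_by_thing(docs)
```

Each entry of `result` holds a `doc_count` plus the `id` and `title` of every article containing that thing, which is what I would like the aggregation to return.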

The query produces:

"key": 1000,
"doc_count": 3,
"articles": {
    "hits": {
        "total": 3,
        "max_score": 1,
        "hits": [{
            "_index": "test",
            "_type": "article",
            "_id": "2",
            "_nested": {
                "field": "things",
                "offset": 0
            },
            "_score": 1,
            "_source": {
                "id": 1000,
                "something": 43
            }
        }, {
            "_index": "test",
            "_type": "article",
            "_id": "1",
            "_nested": {
                "field": "things",
                "offset": 0
            },
            "_score": 1,
            "_source": {
                "id": 1000,
                "something": 33
            }

.... (and so on)

What I'd like is for each hit to list all the fields from the "outer" (top-level) document, in this case id and title.

Is this actually possible, and if so, how?

Roland Dunn

1 Answer

I'm not sure if this is what you're looking for, but let's give it a try:

{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "nested_things": {
      "nested": {
        "path": "things"
      },
      "aggs": {
        "num_articles": {
          "terms": {
            "field": "things.id",
            "size": 0
          },
          "aggs": {
            "articles": {
              "top_hits": {
                "size": 50
              }
            },
            "reverse_things": {
              "reverse_nested": {},
              "aggs": {
                "title": {
                  "terms": {
                    "field": "title",
                    "size": 0
                  }
                },
                "id": {
                  "terms": {
                    "field": "id",
                    "size": 0
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

This produces something like this:

          "buckets": [
               {
                  "key": 1000,
                  "doc_count": 3,
                  "reverse_things": {
                     "doc_count": 3,
                     "id": {
                        "buckets": [
                           {
                              "key": "1",
                              "doc_count": 1
                           },
                           {
                              "key": "2",
                              "doc_count": 1
                           },
                           {
                              "key": "3",
                              "doc_count": 1
                           }
                        ]
                     },
                     "title": {
                        ...
                     }
                  },
                  "articles": {
                     "hits": {
                        "total": 3,
                        "max_score": 1,
                        "hits": [
                           {
                              "_index": "test",
                              "_type": "article",
                              "_id": "AVPOgQQjgDGxUAMojyuY",
                              "_nested": {
                                 "field": "things",
                                 "offset": 0
                              },
                              "_score": 1,
                              "_source": {
                                 "id": 1000,
                                 "something": 53
                              }
                           },
                           ...
  • The problem is that the reverse_things section lists id and title, but not in the same order. The keys for id are 1, 2, 3: "id": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [{ "key": "1", "doc_count": 1 }, { "key": "2", "doc_count": 1 }, { "key": "3", "doc_count": 1 }] } – Roland Dunn Apr 02 '16 at 12:55
  • But the keys for title are one, three, two: "title": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [{ "key": "one", "doc_count": 1 }, { "key": "three", "doc_count": 1 }, { "key": "two", "doc_count": 1 }] } If the ordering could be forced to match the original articles, that would work. Thanks @kristian-ferkić, by the way. – Roland Dunn Apr 02 '16 at 12:58
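One possible way around the ordering mismatch, offered as an untested sketch: instead of two separate terms aggregations under reverse_nested, put a single top_hits aggregation there, so each hit is a whole parent document and its id and title stay together. The names `build_query`, `parents_per_thing` and `parent_docs` are made up for this example, the `_source` filter is assumed to be accepted in list form by the Elasticsearch version in use, and the sample response is hand-written to exercise the parsing, not real cluster output.

```python
def build_query(max_docs=50):
    """Aggregation body: per things.id bucket, fetch the parent documents."""
    return {
        "size": 0,
        "query": {"match_all": {}},
        "aggs": {
            "nested_things": {
                "nested": {"path": "things"},
                "aggs": {
                    "num_articles": {
                        "terms": {"field": "things.id", "size": 0},
                        "aggs": {
                            "reverse_things": {
                                # Step back from the nested objects to the root documents.
                                "reverse_nested": {},
                                "aggs": {
                                    "parent_docs": {
                                        "top_hits": {
                                            "size": max_docs,
                                            "_source": ["id", "title"],
                                        }
                                    }
                                },
                            }
                        },
                    }
                },
            }
        },
    }


def parents_per_thing(response):
    """Map each thing id to the list of parent _source dicts in its bucket."""
    buckets = response["aggregations"]["nested_things"]["num_articles"]["buckets"]
    return {
        bucket["key"]: [
            hit["_source"]
            for hit in bucket["reverse_things"]["parent_docs"]["hits"]["hits"]
        ]
        for bucket in buckets
    }


# Hand-written response fragment in the assumed shape (not real output).
sample_response = {
    "aggregations": {"nested_things": {"num_articles": {"buckets": [
        {"key": 1000, "doc_count": 3, "reverse_things": {"doc_count": 3, "parent_docs": {"hits": {"hits": [
            {"_id": "1", "_source": {"id": 1, "title": "one"}},
            {"_id": "2", "_source": {"id": 2, "title": "two"}},
        ]}}}},
    ]}}}
}

parents = parents_per_thing(sample_response)
```

With a live client this would be called along the lines of `parents_per_thing(es.search(index=index_name, body=build_query()))`. Because every hit carries both fields, no ordering between separate id and title buckets has to be reconciled.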