1

Let's say I have the following data:

{
  "id":"1",
  "name": "John",
  "tag":"x"
},
{
  "id": 2,
  "name":"John",
  "tag":"y"
},
{
  "id": 3,
  "name":"Jane",
  "tag":"x"
}

I want to get the count of documents (unique on name) that has both tag = "x" and tag = "y"

Given the above data, the query should return 1, because only John has two documents exists that has the two required tags.

What I am able to do so far is a query that uses OR ( so either tag = "x" or tag = "y") which will return 2. For example:

"aggs": {
  "distict_count": {
    "filter": {
      "terms": {
        "tag": [
          "x",
          "y"
        ]
      }
    },
    "aggs": {
      "agg_cardinality_name": {
        "cardinality": {
          "field": "name"
        }
      }
    }
  }
}

Would it be possible to change that to use and instead of or?

Hossam Khamis
  • 1,133
  • 8
  • 13

1 Answers1

1

Try putting cardinality under a terms agg to get proper distinct counts:

{
  "size": 0,
  "aggs": {
    "distict_count": {
      "filter": {
        "terms": {
          "tag": [
            "x",
            "y"
          ]
        }
      },
      "aggs": {
        "agg_terms": {
          "terms": {
            "field": "name"
          },
          "aggs": {
            "agg_cardinality_name": {
              "cardinality": {
                "field": "name"
              }
            }
          }
        }
      }
    }
  }
}

CORRECTION

You can use a combination of cardinality aggs with a bucket_selector which'll rule out buckets where there are fewer than 2 unique tags -- i.e. both x and y:

{
  "size": 0,
  "aggs": {
    "distict_count": {
      "filter": {
        "terms": {
          "tag": [
            "x",
            "y"
          ]
        }
      },
      "aggs": {
        "agg_terms": {
          "terms": {
            "field": "name"
          },
          "aggs": {
            "agg_cardinality_tag2": {
              "bucket_selector": {
                "buckets_path": {
                  "unique_tags_count": "unique_tags_count"
                },
                "script": "params.unique_tags_count > 1"
              }
            },
            "unique_tags_count": {
              "cardinality": {
                "field": "tag"
              }
            },
            "unique_names_count": {
              "cardinality": {
                "field": "name"
              }
            }
          }
        }
      }
    }
  }
}
Joe - GMapsBook.com
  • 15,787
  • 4
  • 23
  • 68
  • Thanks Joe, but that doesn't change the fact that the filter is using "or" and not "and". I didn't experience issues with the cardinality the way it is now, but I mainly wanted to change the behaviour from using "or" to using "and". – Hossam Khamis Oct 13 '20 at 09:05
  • You can always use a combination of bool-musts like so: https://stackoverflow.com/a/64250297/8160318. But none of your documents will match for both x AND y... – Joe - GMapsBook.com Oct 13 '20 at 09:18