0

I have the index I attached below.

Each doc in the index holds the name and height of Alice or Bob and the age at which the height was measured. Measurements taken at the age of 10 are flagged as "baseline_height_at_age_10": true

I need to do the following:

  1. Find the height of Alice and Bob at age 10.
  2. List item Return for Alice and Bob, the records where the height is lower than their height at age 10.

So my question is: Can Painless do such type of search? I'd appriciate if you could point me at a good example for that.

Also: Is ElasticSearch Painless even a good approach for this problem? Can you sugges

The Index Mappings

PUT /shlomi_test/
{
  "mappings": {
    "_doc": {
      "properties": {
        "first_name": {
          "type": "keyword",
          "fields": {
            "raw": {
              "type": "text"
            }
          }
        },
        "surname": {
          "type": "keyword",
          "fields": {
            "raw": {
              "type": "text"
            }
          }
        },
        "baseline_height_at_age_10": {
          "type": "boolean"
        },
        "age": {
          "type": "integer"
        },
        "height": {
          "type": "integer"
        }
      }
    }
  }
}

The Index Data

POST /test/_doc/alice_green_8_110
{
  "first_name": "Alice",
  "surname": "Green",
  "age": 8,
  "height": 110,
  "baseline_height_at_age_10": false
}

POST /test/_doc/alice_green_10_120
{
  "first_name": "Alice",
  "surname": "Green",
  "age": 10,
  "height": 120,
  "baseline_height_at_age_10": true
}

POST /test/_doc/alice_green_13_140
{
  "first_name": "Alice",
  "surname": "Green",
  "age": 13,
  "height": 140,
  "baseline_height_at_age_10": false
}

POST /test/_doc/alice_green_23_170
{
  "first_name": "Alice",
  "surname": "Green",
  "age": 23,
  "height": 170,
  "baseline_height_at_age_10": false
}



POST /test/_doc/bob_green_8_120
{
  "first_name": "Alice",
  "surname": "Green",
  "age": 8,
  "height": 120,
  "baseline_height_at_age_10": false
}

POST /test/_doc/bob_green_10_130
{
  "first_name": "Alice",
  "surname": "Green",
  "age": 10,
  "height": 130,
  "baseline_height_at_age_10": true
}

POST /test/_doc/bob_green_15_160
{
  "first_name": "Alice",
  "surname": "Green",
  "age": 15,
  "height": 160,
  "baseline_height_at_age_10": false
}

POST /test/_doc/bob_green_21_180
{
  "first_name": "Alice",
  "surname": "Green",
  "age": 21,
  "height": 180,
  "baseline_height_at_age_10": false
}
RaamEE
  • 3,017
  • 4
  • 33
  • 53

2 Answers2

1

You should be able to do it just using aggregations. Assuming people only ever get taller, and the measurements are accurate, you could restrict the query to only those documents aged 10 or under, find the max height of those, then filter the results of those to exclude the baseline result

POST test/_search
{
  "size": 0,
  "query": {
    "range": {
      "age": {
        "lte": 10
      }
    }
  },
  "aggs": {
    "names": {
      "terms": {
        "field": "first_name",
        "size": 10
      },
      "aggs": {
        "max_height": {
          "max": {
            "field": "height"
          }
        },
        "non-baseline": {
          "filter": {
            "match": {
              "baseline_height_at_age_10": false
            }
          },
          "aggs": {
            "top_hits": {
              "top_hits": {
                "size": 10
              }
            }
          }
        }
      }
    }
  }
}
kjr
  • 131
  • 4
  • Thanks. Your solution relies on the query that the age is <=10. I am looking for a way to take the value of height from the the record where age is =10 and then use it that value to select the other records, so this solution is not what I am looking for. From your experience is what I am trying to do even possible? – RaamEE Apr 21 '19 at 09:54
1

I've posted the same question, with emphasis on Painless scripting, ElasticSearch Support Forum How to find records matching the result of a previous search using ElasticSearch Painless scripting

and the answer was:

"I don't think the Painless approach will work here. You cannot use the results of one query to execute a second query with Painless.

The two-step approach that you outline at the end of your post is the way to go."

The bottom line is that you cannot use a result from one query as an input to another query. You can filter and aggregate and more, but not this.

So the approcah is pretty much as follows:

according to my understanding, suggests to do the 1st search, process the data and do an additional search. This basically translates to:

  1. Search the record where first_name=Alice and baseline_height_at_age_10=True.
  2. Process externally, to extract the value of height for Alice at age 10.
  3. Search for Alice's records where her height is lower than the value calculated externally.
RaamEE
  • 3,017
  • 4
  • 33
  • 53