Application-side Joins Elasticsearch

Question

I have two indexes in Elasticsearch: a system index and a telemetry index. I'd like to perform queries and aggregations on the telemetry index using filters from the system index. The system index is relatively small and only receives new documents occasionally, but the telemetry index is much larger and is constantly receiving new documents. This seems like an ideal situation for using an application-side join.

I tried emulating the example query at the previous link, but it turns out the filtered query is deprecated as of ES 5.0. (Why is this example in the current documentation?!)

Here are my queries:

GET /system/_search
{
  "query": {
    "match": {
      "name": "George's system"
    }
  }
}

GET /telemetry/_search
{
  "query": {
    "bool":{
      "must": {
        "multi_match": {
          "operator": "and",
          "fields": ["systemId"]
          , [1] }  
        }
      }
    }
  }
}

The second one fails with a json_parse_exception because for some reason it doesn't like the [ ] characters after "fields".

Can anyone provide a simple example of using application-side joins?

Once such a query is defined (perhaps in Kibana's Dev Tools console) is there a way to visualize it in Kibana?

Any update? I am trying to do the same thing, but I cannot figure out a way to fill the second query with the result from the first. – haneulkim, Dec 04 '19 at 01:26
I ended up making two calls and dealing with everything in Python. – James McKeown, Dec 05 '19 at 14:47
Is there a way to send a list to Elasticsearch so that we can take advantage of Kibana? – haneulkim, Dec 06 '19 at 07:15

score 1 · Answer 1 · answered Oct 15 '18 at 16:09

Elasticsearch has no way to execute nested queries as a relational database does, where one query uses the response of another. The application-side join example means that you are actually making two queries (two separate requests to Elasticsearch) from the application side.

With the first query, you get the list of ids you need to filter on.
With the second query, you pass that list of ids to the terms filter.
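
The two steps above can be sketched in Python as plain request bodies. This is a minimal sketch, not a definitive implementation: the helper names are hypothetical, it assumes that `systemId` in the telemetry index holds the `_id` of the matching system document, and it only builds the JSON bodies, so sending them to `/system/_search` and `/telemetry/_search` is left to whatever HTTP or Elasticsearch client you use.

```python
import json


def build_system_query(name):
    """Body for the first request (GET /system/_search): find systems by name."""
    return {"query": {"match": {"name": name}}}


def extract_system_ids(search_response):
    """Collect the _id of every hit in a search response.

    Assumption: telemetry documents reference systems by this _id.
    """
    return [hit["_id"] for hit in search_response["hits"]["hits"]]


def build_telemetry_query(system_ids):
    """Body for the second request (GET /telemetry/_search).

    The ids from the first request are passed to a terms query in filter context.
    """
    return {"query": {"bool": {"filter": {"terms": {"systemId": system_ids}}}}}


# Stand-in for the response to the first request (illustrative data only).
fake_response = {
    "hits": {"hits": [{"_id": "1", "_source": {"name": "George's system"}}]}
}

ids = extract_system_ids(fake_response)
print(json.dumps(build_system_query("George's system"), indent=2))
print(json.dumps(build_telemetry_query(ids), indent=2))
```

The second body is what the `[1]` placeholder in the question stands for: a literal array of ids that the application fills in between the two requests.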

This works as long as you have no more than 1024 values for systemId, because the terms query has a limit on the number of terms.
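
If the first query returns more ids than the limit allows, one possible workaround is to split the ids into batches and send one telemetry request per batch. The sketch below assumes the 1024 figure quoted above; the real limit depends on your Elasticsearch version and settings, so it is a parameter here. The function names are hypothetical.

```python
def chunked(values, size=1024):
    """Split a list into consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [values[i:i + size] for i in range(0, len(values), size)]


def build_batched_telemetry_queries(system_ids, max_terms=1024):
    """Build one telemetry request body per batch of system ids."""
    return [
        {"query": {"bool": {"filter": {"terms": {"systemId": batch}}}}}
        for batch in chunked(system_ids, max_terms)
    ]


# Example: 2500 ids become three request bodies.
bodies = build_batched_telemetry_queries([str(n) for n in range(2500)])
print(len(bodies))
```

Each batch is a separate request, so hits and aggregation results have to be merged by the application afterwards.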

Because this cannot be expressed as a single query, you can't visualize it in Kibana.

In that case you have to sacrifice a little space and add the systemId to your mapping.
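
One way to read this advice is to copy the system fields you want to filter on into each telemetry document before indexing it, so that a single query against the telemetry index is enough. The sketch below is only an illustration of that idea: the `system_` prefix, the `temperature` field, and the in-memory `systems_by_id` lookup are all assumptions, not part of the original setup.

```python
def denormalize(telemetry_doc, systems_by_id, fields=("name",)):
    """Return a copy of a telemetry document enriched with system fields.

    `systems_by_id` maps a systemId to the source of the matching system
    document; it could be loaded from the small system index once and cached.
    """
    system = systems_by_id.get(telemetry_doc.get("systemId"), {})
    enriched = dict(telemetry_doc)
    for field in fields:
        if field in system:
            # Hypothetical naming convention: prefix copied fields with "system_".
            enriched["system_" + field] = system[field]
    return enriched


systems_by_id = {"1": {"name": "George's system"}}
doc = {"systemId": "1", "temperature": 21.5}
print(denormalize(doc, systems_by_id))
```

With the copied field in place, a filter on `system_name` can run directly on the telemetry index.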

Good Luck!

ZiadM

Thanks for the response. What you are saying is consistent with what I have read. Does it mean that the `[1]` in the second query is just pseudocode, or should it actually execute in Kibana's Dev Tools console? A similar question [here](https://stackoverflow.com/questions/35063578/elastic-search-application-side-join-pagination-and-aggregations) laments the fact that pagination is not supported for this type of query, but for my use case I was hoping to be able to make do with the 1024 values for systemId per query to the systems index. Guess I'll have to denormalize. :( – James McKeown Oct 16 '18 at 13:12

Yes, it is just pseudocode. Fields is just an array of the returned systemIds. Sacrificing a little space by adding the systemId to each record will give you better performance and usability. – ZiadM Oct 16 '18 at 21:31
