1

I use: Elasticsearch 7.7 , Kibana 7.7

For example, lets take two indexes:

User index with simple mapping:

PUT /user_index
{
  "mappings": {
    "properties": {
      "user_id":    { "type": "text" },
      "user_phone":    { "type": "text" },
      "name":   { "type": "text"  }     
    }
  }
}

Check with simple mapping:

PUT /check_index
{
  "mappings": {
    "properties": {
      "user_id":    { "type": "text" },  
      "price":   { "type": "integer"  },
      "goods_count":  {"type": "integer"}
    }
  }
}

I want to build table visualization like that:

________________________________________________________________________
  user_id  |   user_phone  | average_price       |    sum_goods_count  |
___________|_______________|_____________________|______________________
     1     |       123     |       512           |         64          |
___________|_______________|_____________________|______________________
     2     |       456     |       256           |         16          | 
___________|_______________|_____________________|______________________

So my questions are:

  1. Is it real?

  2. Do I understand correctly that I need to query these two indexes, get a list of users, and then in a loop create shopping carts with checks?

ArtSav
  • 132
  • 1
  • 10
  • what is the use of `user_index` in getting your expected output? Because the fields that your expected output contains are present in `check_index` also. – ESCoder Sep 17 '20 at 10:24
  • @Bhavya Gupta, This is just an example, the real challenge is much more complex and I need to build a table with fields from both indices. I was hoping that someone would be able to give me an idea of how to implement it or how to approach this problem. – ArtSav Sep 17 '20 at 10:31

1 Answers1

2

First thing first, you should try to de-normalize data in ES as much as possible to get the best performance and capability offered by it, And I went through the samples provided by you and comments in the question and it seems it can be easily achieved in your use-case and shown in below example, by combining user and check index into single index.

Index mapping

{
    "mappings": {
        "properties": {
            "user_id": {
                "type": "text",
                "fielddata": "true"
            },
            "price": {
                "type": "integer"
            },
            "goods_count": {
                "type": "integer"
            }
        }
    }
}

Index Data:

With the index mapping defined above, index these three documents, where one document is having "user_id":"1" and 2 documents have "user_id":"2"

{
    "user_id":"1",
    "price":500,
    "goods_count":100
}
{
    "user_id":"2",
    "price":500,
    "goods_count":100
}
{
    "user_id":"2",
    "price":100,
    "goods_count":200
}

Search Query:

Refer to ES official documentation on Terms Aggregation, Top Hits aggregation, Sum aggregation and Avg aggregation to get detailed explanation.

{
  "size": 0,
  "aggs": {
    "user": {
      "terms": {
        "field": "user_id"
      },
      "aggs": {
        "top_user_hits": {
          "top_hits": {
            "_source": {
              "includes": [
                "user_id"
              ]
            }
          }
        },
        "avg_price": {
          "avg": {
            "field": "price"
          }
        },
        "goods_count": {
          "sum": {
            "field": "goods_count"
          }
        }
      }
    }
  }
}

Search Result:

{
  "took": 10,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": null,
    "hits": [
      
    ]
  },
  "aggregations": {
    "user": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "2",
          "doc_count": 2,
          "top_user_hits": {
            "hits": {
              "total": {
                "value": 2,
                "relation": "eq"
              },
              "max_score": 1.0,
              "hits": [
                {
                  "_index": "stof_63925596",
                  "_type": "_doc",
                  "_id": "2",
                  "_score": 1.0,
                  "_source": {
                    "user_id": "2"
                  }
                },
                {
                  "_index": "stof_63925596",
                  "_type": "_doc",
                  "_id": "3",
                  "_score": 1.0,
                  "_source": {
                    "user_id": "2"
                  }
                }
              ]
            }
          },
          "avg_price": {
            "value": 300.0
          },
          "goods_count": {
            "value": 300.0
          }
        },
        {
          "key": "1",
          "doc_count": 1,
          "top_user_hits": {
            "hits": {
              "total": {
                "value": 1,
                "relation": "eq"
              },
              "max_score": 1.0,
              "hits": [
                {
                  "_index": "stof_63925596",
                  "_type": "_doc",
                  "_id": "1",
                  "_score": 1.0,
                  "_source": {
                    "user_id": "1"
                  }
                }
              ]
            }
          },
          "avg_price": {
            "value": 500.0
          },
          "goods_count": {
            "value": 100.0
          }
        }
      ]
    }
  }
}

As you can see in the search results above, for "user_id":"2" the average price is (500+100)/2 = 300 and sum of goods_count is 100+200 = 300.

Similarly for "user_id":"1" the average price is 500/1 = 500 and sum of goods_count is 100.

Amit
  • 30,756
  • 6
  • 57
  • 88
  • Elasticsearch,thank you for the answer. I understood the concept, but in the real problem I will have to combine several large indexes, use nested objects, the number of which can reach thousands, and build tables with 10-15 columns in kibana. Can you tell me if the elasticsearch+kibana is suitable for this purpose or should I consider another solution? – ArtSav Sep 19 '20 at 15:10
  • @АртемСавельев, thanks for explaining your use case in more detail, but still I don't know the complete use case of your application but it seems wrong **at first limited sight and info**, As both aggreagation and nested docs are very costly in ES and there are several article around them, on aggregation please refer to my latest answer on SO https://stackoverflow.com/a/63965634/4039431 and for nested refer Gojek's medium blog https://blog.gojekengineering.com/elasticsearch-the-trouble-with-nested-documents-e97b33b46194 to get some idea of performance issue with them. – Amit Sep 19 '20 at 15:36