1

I have documents in elasticsearch that are something like:

    {
      "numberOfBedrooms": 2,
      "price": 1500,
      "type": flat
    }

I would like to get statistics like what is the average price by room, what is the average price by type and also combinations like what is the average price per numberOfBedroom+type combinations. How can I use aggregations in elastic search to achieve that?

Thanks!

hsusanoo
  • 774
  • 7
  • 18
ip.
  • 3,306
  • 6
  • 32
  • 42
  • I can probably do it manually, but I was wondering if there was a more elegant solution. – ip. Dec 20 '14 at 21:23

1 Answers1

2

To show the average price by number of rooms, create a query with two levels of nested aggregation.

  • 1st level - use term aggregation to step through the bedroom sizes
  • 2nd level - calculate the average price for the current bedroom size

For example this query:

curl -XGET 'http://localhost:9200/myindex/houses/_search?pretty&search_type=count' -d '{
   "query" : {
        "match_all" : { }
   },
   "aggs": {
     "bed_agg": {
       "terms": {"field": "numberOfBedrooms"},        
         "aggs" : {
             "avg_price" : { "avg" : { "field" : "price" } }
         }
       }
     }
   }
 }'

Should return something like:

  "aggregations" : {
    "bed_agg" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [ {
    "key" : 2,
    "doc_count" : 2,
    "avg_price" : {
      "value" : 1750.0
    }
      }, {
    "key" : 3,
    "doc_count" : 1,
    "avg_price" : {
      "value" : 2100.0
    }
      } ]
    }

To do aggregation over another level (e.g. adding in the building type) you can either create a new level of aggregation - e.g. nest the "type" aggregation inside the "bedrooms" aggregation.

curl -XGET 'http://localhost:9200/myindex/houses/_search?pretty&search_type=count' -d '{
  "query" : {
       "match_all" : { }
  },
  "aggs": {
    "bed_agg": {
      "terms": {"field": "numberOfBedrooms"},
      "aggs": {
      "type_agg": {
        "terms": {"field": "type"},          
        "aggs" : {
          "avg_price" : { "avg" : { "field" : "price" } }
         }
        }
       }
      }
    }
  }
}'

Alternatively you can create a single bucket with both fields in it using a script:

 "aggs": {
        "bed_type_agg": {
          "terms": { "script" : "doc[\"numberOfBedrooms\"].value+doc[\"type\"].value"},       
              "aggs" : {
               "avg_price" : { "avg" : { "field" : "price" } }
            }
          }
        }
Olly Cruickshank
  • 6,120
  • 3
  • 33
  • 30
  • Amazing! Thanks a lot. One other question though. If I want to have buckets for 1,2,3,4 bedrooms and 5+ bedrooms. Is there any way to factor that in? – ip. Dec 20 '14 at 22:46
  • 1
    I would use [range aggregation](http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-range-aggregation.html) - your first 4 aggregations will be small and the final one will be 5 and higher. – Olly Cruickshank Dec 21 '14 at 07:04
  • How can I do the ranges in combination with the type? – ip. Dec 21 '14 at 23:54
  • Not sure what you want - if you nested the bedroom ranges inside the property Type, would that give you what you wanted? – Olly Cruickshank Dec 22 '14 at 07:22