I am planning to store the availability of millions of Airbnb-type apartments in Elasticsearch. The availability is an array of nested objects (the `availability` field is of type nested), and each of those objects holds a date range in which the apartment is available.

    apartments = [
  { 
    "_id": "kjty873yhekrg789e7r0n87e",
    "first_available_date": "2016-06-21",
    "availability": [
      {
        "start": "2016-06-21",
        "end": "2016-08-01"
      },
      {
        "start": "2016-08-20",
        "end": "2016-08-28"
      },
      {
        "start": "2016-10-03",
        "end": "2016-11-02"
      },
      { //This means it is available only for one day.
        "start": "2016-11-13",
        "end": "2016-11-13"
      },
      { 
        "start": "2016-11-28",
        "end": "2017-01-14"
      } 
    ],
    "apartment_metadata1": 56456,
    "apartment_metadata2": 8989,
    "status": "active"
  },
  { 
    "_id": "hgk87783iii86937jh",
    "first_available_date": "2016-06-09",
    "availability": [
      {
        "start": "2016-06-09",
        "end": "2016-07-02"
      },
      {
        "start": "2016-07-21",
        "end": "2016-12-19"
      },
      {
        "start": "2016-12-12",
        "end": "2017-07-02"
      }
    ],
    "apartment_metadata1": 23534,
    "apartment_metadata2": 24377,
    "status": "active"
  }
]

I want to search for apartments that are available for a specific date range (say 2016-08-20 to 2016-12-12). That range should fall inside one of the apartment's availability date ranges.

So I want to write a query, something like:

{
  "query": {
    "bool": {
      "must": [
        {
          "range": { "first_available_date": {"lte": "2016-08-20"} },
          "match": { "status": "active" }
        }
      ]
      },
      "filter": [
        {
          "range": 
            {
              "apartments.availability.start": {"gte": "2016-08-20"}, 
              "apartments.availability.end": {"lte": "2016-12-12"} 
            }
        }
     ]
    }
  }
}

The above query returns both apartments (with MULTIPLE availability objects matching the condition), which is incorrect. It should only return the document with _id: hgk87783iii86937jh, as EXACTLY one availability object matches the criteria, namely {"start": "2016-07-21", "end": "2016-12-19"}. So to get the correct result, there should be EXACTLY one availability object in the apartment doc that matches the condition. How do I enforce that there is EXACTLY one match in the above query? Second question: is my query even correct?

JVK
  • make sure the mapping for `availability` is of type [nested](https://www.elastic.co/guide/en/elasticsearch/guide/current/nested-mapping.html) and then you should be able to achieve this using a [nested query](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-nested-query.html) – keety Jun 16 '16 at 20:25
  • @keety yes, I will have that, as I also mentioned in the post. But my question is: how do I get only those documents where EXACTLY one availability object matches the condition? – JVK Jun 16 '16 at 20:51
  • could you give an example of which `availability` object would satisfy the condition for the above case? – keety Jun 17 '16 at 00:05
  • @keety I have updated the post and added two sample documents and also wrote in which one should match – JVK Jun 17 '16 at 02:38

1 Answer

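A nested query should let you achieve this. For the requested range to fall inside an availability block, a single block must start on or before the requested start (`availability.start` `lte` 2016-08-20) and end on or after the requested end (`availability.end` `gte` 2016-12-12). Use inner_hits to get the availability block that matched. Below is an example: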

Create Index:

PUT testindex
{
    "mappings": {
        "data" : {
            "properties": {
                "availability" : {
                    "type": "nested"
                }
            }
        }
    }
}
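
To illustrate why the `nested` type matters here, the following Python sketch mimics (it does not call Elasticsearch) how an array of plain objects is effectively flattened into independent per-field value lists, versus how nested objects are matched one block at a time. The function names and the trimmed sample data are assumptions made for illustration only.

```python
from datetime import date

# Trimmed copies of the two sample apartments from the question:
# each apartment id maps to its list of (start, end) availability blocks.
APARTMENTS = {
    "kjty873yhekrg789e7r0n87e": [
        ("2016-06-21", "2016-08-01"),
        ("2016-08-20", "2016-08-28"),
        ("2016-10-03", "2016-11-02"),
        ("2016-11-13", "2016-11-13"),
        ("2016-11-28", "2017-01-14"),
    ],
    "hgk87783iii86937jh": [
        ("2016-06-09", "2016-07-02"),
        ("2016-07-21", "2016-12-19"),
        ("2016-12-12", "2017-07-02"),
    ],
}


def d(s):
    """Parse an ISO date string such as '2016-08-20'."""
    return date.fromisoformat(s)


def matches_flattened(blocks, want_start, want_end):
    """Plain object-array behaviour: starts and ends are checked independently."""
    any_start = any(d(s) <= d(want_start) for s, _ in blocks)
    any_end = any(d(e) >= d(want_end) for _, e in blocks)
    return any_start and any_end


def matches_nested(blocks, want_start, want_end):
    """Nested behaviour: one and the same block must satisfy both conditions."""
    return any(d(s) <= d(want_start) and d(e) >= d(want_end) for s, e in blocks)


flat = [k for k, b in APARTMENTS.items()
        if matches_flattened(b, "2016-08-20", "2016-12-12")]
nested = [k for k, b in APARTMENTS.items()
          if matches_nested(b, "2016-08-20", "2016-12-12")]
print(flat)    # both apartment ids
print(nested)  # only hgk87783iii86937jh
```

Without the nested mapping, the first apartment would match because one block starts early enough and a different block ends late enough.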

Index Data:

PUT testindex/data/1
{ 

  "first_available_date": "2016-06-21",
  "availability": [
    {
      "start": "2016-06-21",
      "end": "2016-08-01"
    },
    {
      "start": "2016-08-20",
      "end": "2016-08-28"
    },
    {
      "start": "2016-10-03",
      "end": "2016-11-02"
    },
    { 
      "start": "2016-11-13",
      "end": "2016-11-13"
    },
    { 
      "start": "2016-11-28",
      "end": "2017-01-14"
    },
    {
      "start": "2016-07-21",
      "end": "2016-12-19"
    }
  ],
  "apartment_metadata1": 4234,
  "apartment_metadata2": 687878,
  "status": "active"
}

Query:

POST testindex/data/_search
{
   "query": {
      "bool": {
         "must": [
            {
               "range": {
                  "first_available_date": {
                     "lte": "2016-08-20"
                  }
               }
            },
            {
               "match": {
                  "status": "active"
               }
            }
         ],
         "filter": [
            {
               "nested": {
                  "path": "availability",
                  "query": {
                     "bool": {
                        "must": [
                           {
                              "range": {
                                 "availability.start": {
                                    "lte": "2016-08-20"
                                 }
                              }
                           },
                           {
                              "range": {
                                 "availability.end": {
                                    "gte": "2016-12-12"
                                 }
                              }
                           }
                        ]
                     }
                  },
                  "inner_hits": {}
               }
            }
         ]
      }
   }
}
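
The same request can be parameterised by the requested date range. The helper below is a minimal Python sketch that only builds the request body as a dictionary; `build_availability_query` is a hypothetical name, and sending the body to Elasticsearch is deliberately left out.

```python
import json


def build_availability_query(want_start, want_end):
    """Build a search body for apartments with one block covering the range."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"range": {"first_available_date": {"lte": want_start}}},
                    {"match": {"status": "active"}},
                ],
                "filter": [
                    {
                        "nested": {
                            "path": "availability",
                            "query": {
                                "bool": {
                                    "must": [
                                        # The block starts on or before the requested start...
                                        {"range": {"availability.start": {"lte": want_start}}},
                                        # ...and ends on or after the requested end.
                                        {"range": {"availability.end": {"gte": want_end}}},
                                    ]
                                }
                            },
                            "inner_hits": {},
                        }
                    }
                ],
            }
        }
    }


body = build_availability_query("2016-08-20", "2016-12-12")
print(json.dumps(body, indent=2))
```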

Results:

"hits": {
      "total": 1,
      "max_score": 1.4142135,
      "hits": [
         {
            "_index": "testindex",
            "_type": "data",
            "_id": "1",
            "_score": 1.4142135,
            "_source": {
               "first_available_date": "2016-06-21",
               "availability": [
                  {
                     "start": "2016-06-21",
                     "end": "2016-08-01"
                  },
                  {
                     "start": "2016-08-20",
                     "end": "2016-08-28"
                  },
                  {
                     "start": "2016-10-03",
                     "end": "2016-11-02"
                  },
                  {
                     "start": "2016-11-13",
                     "end": "2016-11-13"
                  },
                  {
                     "start": "2016-11-28",
                     "end": "2017-01-14"
                  },
                  {
                     "start": "2016-07-21",
                     "end": "2016-12-19"
                  }
               ],
               "apartment_metadata1": 4234,
               "apartment_metadata2": 687878,
               "status": "active"
            },
            "inner_hits": {
               "availability": {
                  "hits": {
                     "total": 1,
                     "max_score": 1.4142135,
                     "hits": [
                        {
                           "_index": "testindex",
                           "_type": "data",
                           "_id": "1",
                           "_nested": {
                              "field": "availability",
                              "offset": 5
                           },
                           "_score": 1.4142135,
                           "_source": {
                              "start": "2016-07-21",
                              "end": "2016-12-19"
                           }
                        }
                     ]
                  }
               }
            }
         }
      ]
   }
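
If it matters how many availability blocks matched, the count is available under `inner_hits` of each hit. The Python sketch below filters an already parsed response for hits with exactly one matching block. `hits_with_single_match` is a hypothetical helper, the sample response is trimmed from the results above, and `total` is assumed to be a plain integer as in that output (the code also tolerates an object with a `value` key, which some versions may return).

```python
def hits_with_single_match(response, path="availability"):
    """Return top-level hits whose nested inner_hits total is exactly 1."""
    selected = []
    for hit in response["hits"]["hits"]:
        total = hit["inner_hits"][path]["hits"]["total"]
        # Assumption: some versions report total as {"value": n, ...}.
        if isinstance(total, dict):
            total = total["value"]
        if total == 1:
            selected.append(hit)
    return selected


# Trimmed version of the response shown above.
sample = {
    "hits": {
        "total": 1,
        "hits": [
            {
                "_id": "1",
                "inner_hits": {
                    "availability": {
                        "hits": {
                            "total": 1,
                            "hits": [
                                {"_source": {"start": "2016-07-21", "end": "2016-12-19"}}
                            ],
                        }
                    }
                },
            }
        ],
    }
}

print([hit["_id"] for hit in hits_with_single_match(sample)])
```

This check runs client-side after the search; it does not change what the query itself returns.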
keety
  • Thank you so much @keety for your solution. My question: using "inner_hits" may return more than one hit object matching the search criteria, right? Or will it always return one object? If it may return more than one, then that's not going to help me, right? – JVK Jun 17 '16 at 05:17
  • I am upvoting your solution though :) – JVK Jun 17 '16 at 05:30
  • appreciate the upvote, I'm just curious: is it even possible for a document to have more than one availability object satisfying the criteria? Could you give an example where the above query would return more than one object? – keety Jun 17 '16 at 14:05