I am working with Elasticsearch for an application that deals with "posts". I currently have it working with a geo_point so that it returns all posts ordered by distance from the end user. This works, but I need to add one more aspect to the system.

Posts can be paid for. For instance, if I pay for my post and choose "Local" as the area range, then the post should only be shown to end users who are 20 miles away or less.

I have a field in my index named spotlight_range. Is there a way to write a query that ignores all records where spotlight_range = 'Local' and the distance is > 20 miles? I need to do this for several different spotlight ranges; for instance, Regional may be 100 miles or less, etc.

My current query looks like this:

$params = [
    'index' => 'my_index',
    'type' => 'posts',
    'size' => 25,
    'from' => 0,
    'body' => [
        'sort' => [
            '_geo_distance' => [
                'post_location' => [
                    'lat' => '44.4759',
                    'lon' => '-73.2121'
                ],
                'order' => 'asc',
                'unit' => 'mi'
            ]
        ],
        'query' => [
            'filtered' => [
                'query' => [
                    'match_all' => []
                ],
                'filter' => [
                    'geo_distance' => [
                        'distance' => '100mi',
                        'post_location' => [
                            'lat' => '44.4759',
                            'lon' => '-73.2121'
                        ]
                    ]
                ]
            ]
        ]
    ]
];

My index is set up with the following fields:

'id' => ['type' => 'integer'],
'title' => ['type' => 'string'],
'description' => ['type' => 'string'],
'price' => ['type' => 'integer'],
'shippable' => ['type' => 'boolean'],
'username' => ['type' => 'string'],
'post_location' => ['type' => 'geo_point'],
'post_location_string' => ['type' => 'string'],
'is_spotlight' => ['type' => 'boolean'],
'spotlight_range' => ['type' => 'string'],
'created_at' => ['type' => 'date', 'format' => 'yyyy-MM-dd HH:mm:ss'],
'updated_at' => ['type' => 'date', 'format' => 'yyyy-MM-dd HH:mm:ss']

My end goal is not specifically to search for distance < X and range = Y, but rather to filter out posts of every range type based on the distances I specify. The search should return ALL range types, but filter out anything beyond the specified distance for each range type, based on the user's lat/lon passed into the query.

I have been looking for a solution to this online without much luck.

Joseph Crawford
  • Can you please add more examples to explain your use case? – user3775217 Jan 30 '17 at 18:21
  • I do not have anything more that I can add as an example, but I can try to explain better. Currently we have the values for these strings in configuration files so that they can easily be updated. Local = 20 miles, Regional = 200 miles, National = 3000 miles and Worldwide = 250000 miles. If, for instance, I query with a user's lat/lon, any post which is > 20 miles away and marked Local should not show for that particular user; I would want it filtered out of the search results. I know I could do it after querying ES, but I do not think it is optimal to fetch records I will then need to filter out. – Joseph Crawford Jan 30 '17 at 19:27
  • The same would go for Regional: any post that is > 200 miles from the user's lat/lon location should be filtered out of the results. I am just not sure how to translate this into an ES query, and I am not finding much on this topic on Google or in any ES book I have looked through. – Joseph Crawford Jan 30 '17 at 19:29

1 Answer

I would add a circle geo_shape to the document, centered on post_location and with a radius corresponding to the spotlight_range, since you know both pieces of information at indexing time. That way you can encode into each post its corresponding "reach".

...
'post_location' => ['type' => 'geo_point'],
'spotlight_range' => ['type' => 'string'],
'reach' => ['type' => 'geo_shape'],            // <---- add this

So a "local" document would look something like this once indexed:

{
    "spotlight_range": "local",
    "post_location": { 
         "lat": 42.1526,
         "lon": -71.7378
    }, 
    "reach" : {
        "type" : "circle",
        "coordinates" : [-71.7378, 42.1526],
        "radius" : "20mi"
    }
}
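
As a rough sketch of how the `reach` shape could be derived at indexing time, the helper below maps a spotlight range to a circle. The radii are the ones quoted in the question's comments (Local = 20, Regional = 200, National = 3000, Worldwide = 250000 miles); the constant `SPOTLIGHT_RADIUS_MILES` and the function `buildReach` are hypothetical names introduced only for illustration, not part of any library.

```php
<?php
// Radii in miles per spotlight range, as quoted in the question's comments.
// Keys are lower-cased so 'Local' and 'local' resolve the same way.
const SPOTLIGHT_RADIUS_MILES = [
    'local'     => 20,
    'regional'  => 200,
    'national'  => 3000,
    'worldwide' => 250000,
];

/**
 * Build the `reach` circle for a post (hypothetical helper).
 */
function buildReach(string $spotlightRange, float $lat, float $lon): array
{
    $key = strtolower($spotlightRange);
    if (!array_key_exists($key, SPOTLIGHT_RADIUS_MILES)) {
        throw new InvalidArgumentException("Unknown spotlight range: $spotlightRange");
    }

    return [
        'type'        => 'circle',
        // GeoJSON order: longitude first, then latitude.
        'coordinates' => [$lon, $lat],
        // Always append the unit; a bare number is not interpreted as miles.
        'radius'      => SPOTLIGHT_RADIUS_MILES[$key] . 'mi',
    ];
}
```

For example, `buildReach('Local', 42.1526, -71.7378)` produces the same `reach` value as the document above. Keeping the unit suffix inside the helper guards against indexing a unitless radius.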

Then the query would feature another geo_shape centered on the user's location with the chosen radius, and would only retrieve documents whose reach intersects the circle shape in the query:

$params = [
    'index' => 'my_index',
    'type' => 'posts',
    'size' => 25,
    'from' => 0,
    'body' => [
        'sort' => [
            '_geo_distance' => [
                'post_location' => [
                    'lat' => '44.4759',
                    'lon' => '-73.2121'
                ],
                'order' => 'asc',
                'unit' => 'mi'
            ]
        ],
        'query' => [
            'filtered' => [
                'query' => [
                    'match_all' => []
                ],
                'filter' => [
                    'geo_shape' => [
                        'reach' => [
                           'relation' => 'INTERSECTS',
                           'shape' => [
                               'type' => 'circle',
                               'coordinates' => [-73.2121, 44.4759],
                               'radius' => '20mi'
                           ]
                        ]
                    ]
                ]
            ]
        ]
    ]
];
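
One possible variation, offered as a sketch rather than a tested fix: since each post's `reach` already encodes its own radius, the query shape can be a plain point at the user's location instead of a circle. Two circles intersect whenever their centers are closer than the sum of their radii, so a 20mi query circle would also match a Local post up to roughly 40 miles away; a point avoids that widening. The sketch also uses the `bool`/`filter` form in place of `filtered`, which is an assumption about what suits Elasticsearch 2.x best, and `new \stdClass()` so that `match_all` is encoded as `{}` instead of `[]`. The function name `buildSpotlightSearch` is hypothetical.

```php
<?php
/**
 * Build search params that match every post whose `reach` circle
 * contains the user's position (hypothetical helper).
 */
function buildSpotlightSearch(float $userLat, float $userLon, int $size = 25, int $from = 0): array
{
    return [
        'index' => 'my_index',
        'type'  => 'posts',
        'size'  => $size,
        'from'  => $from,
        'body'  => [
            // Same distance sort as in the original query.
            'sort' => [
                '_geo_distance' => [
                    'post_location' => ['lat' => $userLat, 'lon' => $userLon],
                    'order' => 'asc',
                    'unit'  => 'mi',
                ],
            ],
            'query' => [
                'bool' => [
                    // stdClass encodes as {}; an empty PHP array would encode as [].
                    'must'   => ['match_all' => new \stdClass()],
                    'filter' => [
                        'geo_shape' => [
                            'reach' => [
                                'relation' => 'INTERSECTS',
                                // A point, not a circle: each post's reach carries the radius.
                                'shape' => [
                                    'type'        => 'point',
                                    'coordinates' => [$userLon, $userLat],
                                ],
                            ],
                        ],
                    ],
                ],
            ],
        ],
    ];
}
```

With this shape a single query covers Local, Regional, National and Worldwide posts at once, because each document is matched against its own radius.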
Val
  • In your example the actual query would be using the users lat/lon and not the post lat/lon is that correct? I will give this example a shot in a bit, and if it works I will accept your answer. – Joseph Crawford Feb 03 '17 at 16:09
  • Yes, in the query you use the user's lat/lon and that will be matched against the shape centered on each post's lat/lon – Val Feb 03 '17 at 17:21
  • Looking over this, I am not sure whether it will work, as I have not yet attempted it. Local may be limited to 20 miles, but Regional would be limited to 200 miles, then National, and World. World obviously would have no limit, but I would need to filter out over-limit posts for Local, Regional and National, not just apply a single distance limit... – Joseph Crawford Feb 06 '17 at 17:55
  • As far as I understand your use case, this should work out. Let me know – Val Feb 06 '17 at 19:12
  • So, no luck with this? – Val Feb 09 '17 at 05:43
  • Sorry, I have not yet had time to try this. I am going to "assume" it will work and award you the bounty before it times out and goes to someone else. I will update you as soon as I get this back on my plate of things to fix :D – Joseph Crawford Feb 09 '17 at 21:13
  • We'll get this figured out no matter what – Val Feb 09 '17 at 21:13
  • Val I attempted to execute this query today but it was just returning an empty result set. Sorry for such a delay but we have quite the workload to complete and now this is our last major feature before we go live. I created another question that may contain a bit more clarity. http://stackoverflow.com/questions/43245610/elasticsearch-multiple-phrases-multiple-distances Any assistance would be greatly appreciated. I spent hours going through the documentation trying to figure out how to do this today without any luck. – Joseph Crawford Apr 06 '17 at 04:20
  • I forgot to mention we are using Elasticsearch 2.3; not sure if that makes a difference or not. – Joseph Crawford May 12 '17 at 15:49
  • Would this not working have anything to do with this statement from the docs? The geo_shape filter has been replaced by the GeoShape Query. It behaves as a query in “query context” and as a filter in “filter context” (see Query DSL). – Joseph Crawford May 12 '17 at 15:57
  • No it's only because you're still using the old 1.x DSL syntax instead of 2.x – Val May 12 '17 at 15:59
  • The only part I am confused about is the query where you use the user's location data. I am confused because I want to pull all Local, Regional, National and Worldwide posts that are spotlighted, yet in the query you limit it to 20 miles. Even if I were to run 4 queries, each with its own distance, the worldwide one would pick up local posts even if they were not local to the user. Any thoughts on that? I am working on refactoring the code to create the index with the geo_shape on posts and will work on the query next, but that bit throws me off. – Joseph Crawford May 12 '17 at 17:35
  • Val any idea on my latest comment/question? – Joseph Crawford May 16 '17 at 14:04
  • I'll get back to you shortly. – Val May 16 '17 at 14:22
  • Thank You Much :) – Joseph Crawford May 16 '17 at 14:29
  • Any thoughts on that 20 mile radius? I have the code in place but it is still showing posts that should not show because the spotlight distance is shorter than the distance between the user. A NYC user is still seeing locally spotlighted posts in Southern Vermont and local spotlight is 20 miles. – Joseph Crawford May 17 '17 at 14:37
  • So I have this in place and I thought it was working, however I have this one post that is not showing up in my results and I am not quite sure why. Is there a way to debug the ElasticSearch in a client such as Paw or PostMan? "spotlight_expiration": "2017-06-23 17:31:50", "spotlight_range": "Worldwide", "reach": { "coordinates": [ -117.1889, 33.7415 ], "type": "circle", "radius": 250000 } – Joseph Crawford May 23 '17 at 18:15
  • I believe I figured out the issue: I didn't have units on the radius. Adding 'mi' to the radius solved the issue. Thank you so much for your assistance on this, Val. – Joseph Crawford May 23 '17 at 19:36
  • I'm glad you figured it out. – Val May 23 '17 at 20:53
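
For the kind of debugging discussed in these comments, one option is a small local check that computes whether a given user should see a given post, so the Elasticsearch results can be compared against an independent calculation. The sketch below uses the haversine formula with an approximate mean Earth radius; `distanceMiles` and `shouldBeVisible` are hypothetical helper names, and the check assumes the `reach` radius is stored in the `"<number>mi"` form shown in the answer.

```php
<?php
/**
 * Great-circle distance in miles between two lat/lon points (haversine formula).
 */
function distanceMiles(float $lat1, float $lon1, float $lat2, float $lon2): float
{
    $earthRadiusMiles = 3958.8; // approximate mean Earth radius
    $dLat = deg2rad($lat2 - $lat1);
    $dLon = deg2rad($lon2 - $lon1);
    $a = sin($dLat / 2) ** 2
       + cos(deg2rad($lat1)) * cos(deg2rad($lat2)) * sin($dLon / 2) ** 2;

    // min() guards against rounding pushing sqrt($a) slightly above 1.
    return 2 * $earthRadiusMiles * asin(min(1.0, sqrt($a)));
}

/**
 * Local check: is the user inside the post's reach circle?
 * Expects $post['reach'] in the shape used when indexing, with a "<number>mi" radius.
 */
function shouldBeVisible(array $post, float $userLat, float $userLon): bool
{
    // Casting "20mi" to float keeps the leading number (20.0).
    $radiusMiles = (float) $post['reach']['radius'];
    // Coordinates are stored longitude first, then latitude.
    [$postLon, $postLat] = $post['reach']['coordinates'];

    return distanceMiles($userLat, $userLon, $postLat, $postLon) <= $radiusMiles;
}
```

If this check and the search results disagree for a particular post, the indexed `reach` value (coordinate order, radius units) is a reasonable first thing to inspect.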