1

I am creating an App Engine application in Python that needs to perform efficient geospatial queries on Datastore data. An example use case: I need to find the first 20 posts within a 10-mile radius of the current user. After researching my options, the two best approaches for this kind of functionality currently seem to be:

  • Indexing geohashed geopoint data using Python's GeoModel library
  • Creating/deleting documents of structured data using Google's newer Search API

From a high-level perspective, it seems that indexing geohashes and querying them directly would be less costly and much faster than having to create and delete a document for every geospatial query. However, I've also read that geohashing can be very inaccurate along the equator or along 'fault lines' created by the hashing algorithm. I've seen very few posts contrasting these methods in detail, and I think Stack Overflow is a good place to have this conversation, so my questions are as follows:

  • Has anyone implemented similar features and had a positive experience with either method?
  • Which method would be the cheaper alternative?
  • Which would be the faster alternative?
  • Is there another important method I'm leaving out?

Thanks in advance.

Dan McGrath
BuddyD

2 Answers

1

Geohashing does not have to be inaccurate at all; it all comes down to the implementation details. To handle border cases you can check the neighbouring geocells as well, making sure that includes neighbours on the other side of the equator.
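To illustrate the neighbour check, here is a minimal sketch that uses a plain fixed-size latitude/longitude grid as a stand-in for geocells. It is not the GeoModel library's algorithm; the function names `cell_of` and `cell_and_neighbours` and the 0.25-degree cell size are assumptions made for the example.

```python
import math


def cell_of(lat, lon, size_deg=0.25):
    """Return the (row, col) of the grid cell containing the point."""
    return (int(math.floor(lat / size_deg)), int(math.floor(lon / size_deg)))


def cell_and_neighbours(lat, lon, size_deg=0.25):
    """Return the point's cell plus its (up to) 8 surrounding cells.

    Assumes -90 <= lat < 90 and -180 <= lon < 180.
    """
    row, col = cell_of(lat, lon, size_deg)
    n_cols = int(round(360.0 / size_deg))
    half = n_cols // 2
    min_row = int(math.floor(-90.0 / size_deg))
    max_row = int(math.ceil(90.0 / size_deg)) - 1

    cells = set()
    for d_row in (-1, 0, 1):
        r = row + d_row
        if r < min_row or r > max_row:
            continue  # no cells beyond the poles
        for d_col in (-1, 0, 1):
            # Wrap columns so cells on both sides of the antimeridian are adjacent.
            c = (col + d_col + half) % n_cols - half
            cells.add((r, c))
    return cells


# A point just north of the equator also pulls in the row just south of it.
cells = cell_and_neighbours(0.01, 10.0)
print(sorted(cells))
```

Querying every cell in the returned set, rather than only the user's own cell, is what keeps results near a cell edge from being missed.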

If your use case is finding other entities within a radius as you suggest, I would definitely recommend using the Search API. They have a distance function tailored for that use.
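As a rough sketch of what such a query might look like, the helper below builds a Search API query string using the `distance()` and `geopoint()` query functions, which take the radius in meters. The helper name `build_radius_query`, the field name `location`, and the index name in the comments are assumptions for illustration; the App Engine calls are shown only as comments because they need the App Engine runtime.

```python
METERS_PER_MILE = 1609.344


def build_radius_query(lat, lon, radius_miles, field="location"):
    """Build a query string matching documents whose geo field lies within the radius."""
    radius_m = int(round(radius_miles * METERS_PER_MILE))
    return "distance({0}, geopoint({1:.6f}, {2:.6f})) < {3}".format(
        field, lat, lon, radius_m)


query_string = build_radius_query(37.7749, -122.4194, 10)
print(query_string)

# On App Engine (not run here), the string would be used roughly like this,
# assuming each post was indexed once with a GeoField named "location":
#
#   from google.appengine.api import search
#   index = search.Index(name="posts")  # hypothetical index name
#   results = index.search(search.Query(
#       query_string=query_string,
#       options=search.QueryOptions(limit=20)))
```

Note that documents are written to the index when a post is created or changed, not once per query; the query itself only reads the index.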

Search API queries are more expensive than Datastore queries, yes. But once you weigh in the computation time spent doing these calculations in your own instance, which probably means iterating through all entities in each geohash to make sure the distance is actually less than the desired radius, I would say the Search API is the winner. And don't forget about the implementation time.
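For comparison, the in-instance work described above looks roughly like the sketch below: every candidate returned by the geohash lookup is checked against the real radius with the haversine formula. The function names and the `(post_id, lat, lon)` tuple layout are assumptions for the example, and the coordinates are illustrative only.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def nearest_within(origin, candidates, radius_miles=10.0, limit=20):
    """Return up to `limit` post ids within the radius, nearest first.

    `candidates` is an iterable of (post_id, lat, lon) tuples, e.g. the
    entities fetched from the matching geocells.
    """
    lat0, lon0 = origin
    hits = []
    for post_id, lat, lon in candidates:
        distance = haversine_miles(lat0, lon0, lat, lon)
        if distance <= radius_miles:
            hits.append((distance, post_id))
    hits.sort()
    return [post_id for _, post_id in hits[:limit]]


candidates = [
    ("far", 38.5816, -121.4944),
    ("mid", 37.8044, -122.2712),
    ("near", 37.7800, -122.4100),
]
print(nearest_within((37.7749, -122.4194), candidates))
```

This loop runs on your own instance for every query, which is the CPU cost being weighed against the higher per-query price of the Search API.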

marcus
  • Are you saying the Search API does not use any instances if you are on the standard Google App Engine environment? – Micro Dec 28 '16 at 20:56
  • 1
    The Search API is a service supplied by the Google Platform. It is not run inside your instance - you send a call and wait for a response from the service. You must have an instance running to be able to access the Search API, but the searching itself is not done in your instance. You will however use up a thread in your instance while waiting for the response. If you were to do the calculations in your instance you might want to use multiple threads and/or a more expensive CPU for your instance. If you are just passing along data you can instead have a much cheaper 500MHz instance. – marcus Jan 04 '17 at 23:49
-1

You can have a look at this post; it could be another great alternative.

I have used this within my app, and it works great for my requirement of finding app users within a given radius.

rahulfhp