I have lat/lon coordinates in a 400 million rows partitioned mysql table. The table grows @ 2000 records a minute and old data is flushed every few weeks. I am exploring ways to do spatial analysis of this data as it comes in.
Most of the analysis requires finding whether a point is in a particular lat/lon polygon or which polygons contain that point.
I see the following ways of tackling the point in polygon (PIP) problem:
Create a mysql function that takes a point and a Geometry and returns a boolean. Simple but not sure how Geometry can be used to perform operations on lat/lon co-ordinates since Geometry assumes flat surfaces and not spheres.
Create a mysql function that takes a point and identifier of a custom data structure and returns a boolean. The polygon vertices can be stored in a table and a function can compute PIP using spherical math. Large number of polygon points may lead to a huge table and slow queries.
Leave point data in mysql and store polygon data in PostGIS and use the app server to run PIP query in PostGIS by probviding point as a parameter.
Port the application from MySQL to Postgresql/PostGIS. This will require a lot of effort in rewriting queries and procedures. I can still do it but how good is Postgresql at handling 400 million rows. A quick search on google for "mysql 1 billion rows" returns many results. same query for Postgres returns no relevant results.
Would like to hear some thoughts & suggestions.