25

I am trying to solve the problem of finding the n nearest neighbors using PostGIS:

Starting Point:

  • Table geoname with geonames (from geonames.org) containing latitude/longitude (WSG-84)
  • Added a GeometryColumn geom with srid=4326 and datatype=POINT
  • Filled geom with values: UPDATE geoname SET geom = ST_SetSRID(ST_Point(longitude,latitude), 4326);
  • Created GIST index for geom (CREATE INDEX geom_index ON geoname USING GIST (geom);) / Clustered geom_index: CLUSTER geom_index ON geoname;)
  • Created PRIMARY KEY UNIQUE BTREE index for geonameid

Problem: Find n (e.g. 5) nearest neighbors for a given Point in table geoname represented by id (geoname.geonameid.

Possible solution:

Inspired by http://www.bostongis.com/PrinterFriendly.aspx?content_name=postgis_nearest_neighbor, I tried the following query:

"SELECT start.asciiname, ende.asciiname, distance_sphere(start.geom, ende.geom) as distance " +
"FROM geoname As start, geoname As ende WHERE start.geonameid = 2950159 AND start.geonameid <> ende.geonameid " +
"AND ST_DWithin(start.geom, ende.geom, 300) order by distance limit 5"

Processing time: about 60s

Also tried an approach based on EXPAND:

"SELECT start.asciiname, ende.asciiname, distance_sphere(start.geom, ende.geom) as distance " +
"FROM geoname As start, geoname As ende WHERE start.geonameid = 2950159 AND start.geonameid <> ende.geonameid AND expand(start.geom, 300) && ende.geom " +
"order by distance limit 5"

Processing time: about 120s

The intended application is some kind of autocomplete. So, any approach taking longer than >1s is not applicable. Is it generally possible to achieve a response time of <1s with PostGIS?

Scholle
  • 1,521
  • 2
  • 23
  • 44

2 Answers2

53

Now since PostGIS 2.0, there's a KNN index for geometry types available. This gives you nearest 5 records with regard to how far they are away from "your location...".

SELECT *
FROM your_table 
ORDER BY your_table.geom <-> "your location..."
LIMIT 5;

See <-> operator in PostgreSQL manual.

Julien
  • 953
  • 9
  • 15
Stefan
  • 1,036
  • 1
  • 10
  • 14
  • Neat! One of the many goodies coming with 2.0 release. I presume that will give you results faster? Do you know what (approximately) kind of speed up can you expect? – radek Apr 20 '12 at 10:34
  • 3
    Just to clarify, the KNN index is actually implemented in PG >9.1, so make sure you have the proper version of PG installed... it's definitely worth checking it out... – Scholle Apr 20 '12 at 10:53
  • Also marked this as the answer because it's probably the most efficient way of doing a nearest neighbor search using common db technology... – Scholle Apr 20 '12 at 10:55
  • i tried this, but i get this error: operator does not exist: geography <-> – jacktrade Apr 01 '14 at 14:40
  • @jipipayo Make sure u have PostGIS version >2.0.0 properly installed – Scholle Jun 22 '14 at 10:28
  • 1
    @Paulo Casaretto i assume by "geographic points" you mean objects described by longitude/latitude, yes that's possible! – Scholle Jun 22 '14 at 10:29
  • How it can be implemented for geographic(lat/lng)? I use SRID 4326, and get approximately results((( – FallDi Jul 01 '14 at 17:08
  • 1
    This appears to work with geography when casting it to geometry: `your_table.geography_field::geometry <-> another_table.geography_field::geometry` – Gloopy Aug 13 '14 at 23:08
  • Hello, is the calculation when casting to geometry accurate from your experience? Thanks a lot – jhagege Nov 25 '14 at 22:21
  • I was ordering by distance before finding about this solution. My request performances went from 200ms to 1.5ms. – Theo Jul 16 '20 at 07:37
7

As I think you were answered at the list the unit is in degrees so you area almost searching the whole world with 300 degrees in st_dwithin.

If your dataset is that big so you can't work in a projected meterbased projection instead (much faster and less cpu-intensive calculations) you should consider using the geograpphy type instead. Then you can use st_dwithin with meter.

The make things faster you should I would just create a new table with the geometry converted to geography.

But to just test it you can cast on the fly:

SELECT start.asciiname, ende.asciiname, 
ST_Distance(start.geom::geography, ende.geom::geography) as distance 
FROM geoname As start, geoname As ende 
WHERE start.geonameid = 2950159 AND start.geonameid <> ende.geonameid AND
ST_DWithin(start.geom::geography, ende.geom::geography, 300) 
order by distance 
limit 5;

HTH Nicklas

Nicklas Avén
  • 4,706
  • 1
  • 18
  • 15