0

I have a table Postcode which holds all UK postcode (approx 1.8m i think)

CREATE TABLE `Postcode` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `Postcode` varchar(8) DEFAULT NULL,
  `Postcode_Simple` varchar(8) DEFAULT NULL,
  `Positional_Quality_Indicator` int(11) DEFAULT NULL,
  `Eastings` int(11) DEFAULT NULL,
  `Northings` int(11) DEFAULT NULL,
  `Latitude` double DEFAULT NULL,
  `Longitude` double DEFAULT NULL,
  `LatLong` point DEFAULT NULL,
  PRIMARY KEY (`ID`),
  UNIQUE KEY `Postcode` (`Postcode`),
  KEY `Postcode_Simple` (`Postcode_Simple`),
  KEY `LatLong` (`LatLong`(25))
) ENGINE=InnoDB AUTO_INCREMENT=1755933 DEFAULT CHARSET=latin1;

What I want to achieve is...Given a co-ordinate, locate the postcode nearest to the co-ordinate. Problem is I'm having a bit of an issue with the query (actually in a stored procedure) I've written to do this. The query is:

SELECT
    Postcode
FROM
    (SELECT
        Postcode,
        GLENGTH(
            LINESTRINGFROMWKB(
                LINESTRING(
                    LatLong, 
                    GEOMFROMTEXT(CONCAT('POINT(', varLatitude, ' ', varLongitude, ')'))
                )
            )
        ) AS distance
    FROM
        Postcode
    WHERE
        NOT LatLong IS NULL) P
ORDER BY
    Distance
LIMIT
    1;

The problem I'm having is that the query takes some 12seconds to run and I cannot have it take that long to get a result. Can anyone think of any ways I can reliably speed this query up?

(Here's the explain for the query)

id select_type table      type possible_keys key    key_len ref    rows    Extra
1  PRIMARY     <derived2> ALL  (NULL)        (NULL) (NULL)  (NULL) 1688034 Using filesort
2  DERIVED     Postcode   ALL  LatLong       (NULL) (NULL)  (NULL) 1717998 Using where

I've been trying to think of a way to narrow down the initial amount of data that I must perform the distance calculation on, but I haven't been able to come up with anything that doesn't restrict to finding postcodes within a given distance.

JohnHenry
  • 505
  • 8
  • 21
  • did you create index, PKs, what is the table structure? to speed up the query – jcho360 Oct 17 '12 at 12:58
  • @jcho360 Index's and primary keys are already created but don't help a create deal for the query i'm using unfortunately. The table structure is already shown above... – JohnHenry Oct 17 '12 at 13:18

2 Answers2

1

Maybe try something along the lines of:

SELECT Postcode, lat, lon
FROM
(
SELECT Postcode, MAX(latitude) AS lat, MAX(longitude) AS lon
FROM PostCode

-- field name
GROUP BY Postcode 

HAVING MAX(latitude)<varLatitude AND MAX(longitude)<varLongitude

LIMIT 1
) AS temp

which will basically bring the postcode whose lat and lon are less than the ones you specify but greater than any other lat/lon combination that is less than your vars; so effectively the closest lat/lon to your vars, hence the closest postcode. You can try the same using MIN and greater then instead to go the other way round.

The above will only get you a single result/postcode. If you're looking to have something niftier with like finding a group of postcodes given in a specific radius of lat/long then you should have a look at the formula explained at https://developers.google.com/maps/articles/phpsqlsearch_v3#findnearsql

gts
  • 627
  • 5
  • 13
  • This seems to get me the correct co-ordinates, but the postcode returned in the sub-query is not the postcode that matches the returned co-ordinates. This would be down to the aggregate functions, so I would have to get the postcode using the co-ordinates returned, which means they then need to be indexed too. Im going to investigate this solution more though, as it does narrow down the data very quickly. – JohnHenry Oct 17 '12 at 13:21
  • @JohnHenry sorry my mistake, HAVING should be used instead of WHERE, made the update as needed in the answer – gts Oct 17 '12 at 13:25
  • the difference between the two is that although similar, HAVING will return records based on the aggregate functions, whereas WHERE will not take them into account. Also if you add GROUP BY postcode and a LIMIT you should be able to get a list of closest postcodes – gts Oct 17 '12 at 13:29
  • 1
    Hmm, I can't seem to get this method to work at all, it seems to be returning an awful lot of unwanted and inaccurate results. – JohnHenry Oct 17 '12 at 13:50
  • mmm, I don't have a mysql instance to test at the moment but my last idea would be to add the WHERE condition(latitude – gts Oct 17 '12 at 14:01
  • 1
    Alas that does not appear to work either. I shall see what else I can come up with. Thank you ever so much for all of your help, it's greatly appreciated!!! – JohnHenry Oct 17 '12 at 14:08
0

I've written a tutorial on pretty much exactly what you're after.

Basically, you're on the right lines. In order to improve the efficiency of the search, you'll need to reduce the number of GLength() calculations made by making use of a spatial index on your LatLong field. If you restrict the search to a refined area, such as polygon 10 miles around the point you're comparing the postcodes to, you'll find the query is much quicker.