4

My trival query take 3 seconds to return and requires a ton of reads according to SQL Profiler. Why?

I have a table filled with 5,000,000 accounts that are all geocoded points. All of the accounts are clustered within a 20 mile radius of a city. My index looks like so.

CREATE SPATIAL INDEX [IX_CI_Geocode] ON [dbo].[CustomerInformation] 
(
    [Geocode]
)USING  GEOGRAPHY_GRID 
WITH (
GRIDS =(LEVEL_1 = HIGH,LEVEL_2 = HIGH,LEVEL_3 = HIGH,LEVEL_4 = LOW), 
CELLS_PER_OBJECT = 128, PAD_INDEX  = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

When I run a query as simple as the following:

DECLARE @g geography = geography::Point(41.848039, -87.96361, 4326);
DECLARE @region geography = @g.STBuffer(5000);

select count(0) from CustomerInformation ci WITH(INDEX(IX_CI_Geocode))
where ci.Geocode.STIntersects(@region) = 1

It takes 3 seconds to return and according to SQL Server Profiler it required CPU of 12,203 and Reads of 1,218,873. Those seem like huge numbers for using an index.

Why is this so slow? Why does this require reading from the hard drive so much? What can I do to improve the performance of this?

Looking at the query plan the Filter operator in the screenshot below is 34% of the cost of the query.

enter image description here

The "Clustered Index Seek" operator is 63% of the query.

enter image description here

Paul Mendoza
  • 5,709
  • 12
  • 53
  • 82
  • I don't have any experience with spatial indexes, but something that's jumping out at me is your estimated vs actual row counts. Normally this is a sign of out-of-date statistics. Have you tried updating stats on the table and indexes? – JNK Aug 04 '11 at 19:14
  • Good idea but that didn't end up helping at all. I ended up using another approach. – Paul Mendoza Aug 04 '11 at 20:15

2 Answers2

1

My eventual solution ended up being to use Filter instead. It gives back a lot of false positives but it was turning out to be 3X faster in terms of performance. After I get the result set then I apply a distance function to remove the ones I don't care about and that seems to be fast.

The first select query takes 1 second on 5 million accounts. The second takes 3 seconds.

DECLARE @g geography = geography::Point(41.848039, -87.96361, 4326);
DECLARE @region geography = @g.STBuffer(5000);

select count(0) from CustomerInformation ci WITH(INDEX(IX_CI_Geocode))
where ci.Geocode.Filter(@region) = 1


select count(0) from CustomerInformation ci WITH(INDEX(IX_CI_Geocode))
where ci.Geocode.STIntersects(@region) = 1
Kekoa
  • 27,892
  • 14
  • 72
  • 91
Paul Mendoza
  • 5,709
  • 12
  • 53
  • 82
0

Based on my math skills it feels like extra work if you doing STBuffer() first if you are not really interested in it.

Could you try the following for me and report results?

DECLARE @g geography = geography::Point(41.848039, -87.96361, 4326);

select count(0) from CustomerInformation ci WITH(INDEX(IX_CI_Geocode))
where ci.Geocode.STDistance(@g) <= 5000

On the other hand, is there a way you could provide the database, so I can test it for myself?

fabsenet
  • 372
  • 2
  • 15