0

The following query performs very well at higher zoom levels (bounding box smaller than 2 degrees longitude by 0.5 degrees latitude, but degrades significantly as the bounding box gets larger. The table contains 7~8 million rows of text and location data stored as points in a geometry column.

I have tried different grid configurations, but the performance always degrades as the bounding box gets larger than something like @north=41.123029000000002, @east=-72.935406, @south=40.296503999999999, @west=-75.077740000000006

Any ideas? Thanks ~ Matt

I have included a table of the performance at different zoom levels.

declare @filter geometry
select @filter = GEOMETRY::STGeomFromText(
                            'LINESTRING(' + CONVERT(varchar,@west) + ' ' + CONVERT(varchar,@south) + ',' + CONVERT(varchar,@east) + ' ' + CONVERT(varchar,@north) + ')'
                            ,4326
                            ).STEnvelope();

select
    x.Id
    ,x.[Timestamp]
    ,x.Location
    ,x.[Text]
from (
    select top(1000)
        t.Id
        ,t.[Timestamp]
        ,t.Location
        ,t.[Text]
    from dbo.Table1 AS t with (nolock, index([SPIX_Table1_Location_HIGH]))
    inner join containstable(
                dbo.Table1
                ,[text]
                , N'FORMSOF(INFLECTIONAL, "word1") | FORMSOF(INFLECTIONAL, "word2") | FORMSOF(INFLECTIONAL, "wordN")'
        ) as r on t.Id = r.[KEY]
    where t.Location.Filter(@filter) = 1
        AND t.[Hour] >= @HourId
    order by r.[RANK] desc
) as x
order by x.[Timestamp] desc
option (maxdop 1)

Zoom Level 2 = @north=74.542308000000006,@east=94.21875,@south=-24.370607,@west=-180 Zoom Level 20 = @north=40.711250999999997,@east=-74.006050000000002,@south=40.710847000000001,@west=-74.007096000000004

This is the spatial index:

CREATE SPATIAL INDEX [SPIX_Table1_location_HIGH] ON [dbo].[Table1] 
(
    [location]
)USING  GEOMETRY_GRID 
WITH (
BOUNDING_BOX =(-180, -90, 180, 90), GRIDS =(LEVEL_1 = HIGH,LEVEL_2 = HIGH,LEVEL_3 = HIGH,LEVEL_4 = HIGH), 
CELLS_PER_OBJECT = 16, FILLFACTOR = 70) ON [PRIMARY]
GO

Request Duration by Zoom Level (1000 random tests):

Level   Duration0to5    Duration5to10   Duration10to15  Duration15to20  DurationGreaterThan20
2   0   0   0   0   26
3   0   0   0   0   42
4   0   0   0   0   57
5   0   0   0   0   60
6   0   0   0   0   54
7   0   0   0   1   65
8   0   2   5   6   34
9   0   3   7   10  6
10  5   23  25  14  1
11  13  26  18  3   0
12  17  31  7   0   0
13  48  11  0   0   0
14  48  6   0   0   0
15  47  1   0   0   0
16  57  0   0   0   0
17  48  8   0   0   0
18  44  3   0   0   0
19  63  5   0   0   0
20  47  3   0   0   0
ALL 437 122 62  34  345

Row counts and exec times for a typical query, w/out top(1000), forcing spatial index:

if (@level=2)
select @north=74.542308000000006,@east=94.21875,@south=-24.370607,@west=-180 --70,404 rows, 2 minutes w/ spatial
if (@level=3)
select @north=61.978465999999997,@east=-5.451886,@south=9.7052770000000006,@west=-142.56126 -- 57,911 rows, 1m22s w/ spatial
if (@level=4)
select @north=52.614061999999997,@east=-39.729228999999997,@south=26.230861999999998,@west=-108.283917 -- 45,636 rows, 1m23s w/ spatial
if (@level=5)
select @north=46.992624999999997,@east=-56.867901000000003,@south=33.775959999999998,@west=-91.145245000000003 -- 32,386 rows, 26s w/ spatial
if (@level=6)
select @north=43.934699999999999,@east=-65.437236999999996,@south=37.323439,@west=-82.575908999999996 -- 19,998 rows, 13s w/ spatial
if (@level=7)
select @north=42.343530999999999,@east=-69.721905000000007,@south=39.037540999999997,@west=-78.291240999999999 -- 11,256 rows, 13s w/ spatial
if (@level=8)
select @north=41.532438999999997,@east=-71.864238999999998,@south=39.879399999999997,@west=-76.148906999999994 -- 6,147 rows, 4s w/ spatial
if (@level=9)
select @north=41.123029000000002,@east=-72.935406,@south=40.296503999999999,@west=-75.077740000000006 -- 3,667 rows, 3s w/ spatial
Matt Morgan
  • 360
  • 3
  • 9
  • Is there a particular reason for using Geometry over Geography when you're clearly storing lat/long coordinates? Also have you analysed the Index to see how efficient it is (primary and secondary filters)? – Jon Bellamy Feb 19 '14 at 15:33
  • Thanks for the response. I am using geometry out of ignorance apparently. I ran sp_help_spatial_geometry_index on a bounding box like: @north=42.343530999999999, @east=-69.721905000000007, @south=39.037540999999997, @west=-78.291240999999999 which corresponds to zoom level 7 from the table above. Unfortunately, I don't have the output data, but the efficiency values seemed good to me (high 80s or low 90s), but I didn't really know what to look at it. I can run it again and post the values. – Matt Morgan Feb 21 '14 at 14:00
  • OK, efficiencies on the index sound good if they're that high. Are the times you've posted above in seconds OR milliseconds? – Jon Bellamy Feb 21 '14 at 14:19
  • Also, two more things. Firstly, have you confirmed the speed issue is the Geometry selection and not your FULL TEXT search? Secondly, try replacing `where t.Location.Filter(@filter) = 1` with `where t.Location.STWithin(@filter) = 1`. – Jon Bellamy Feb 21 '14 at 14:23
  • The times above (Duration0to5 for example) are in seconds. The performance of the full-text search is always very fast (less than 1 second to return up to 130K rows). – Matt Morgan Feb 21 '14 at 15:29
  • Well at least we can rule that out! :-) What about performance with STWithin rather than Filter. Also, I understand the zoom level part, but what is Duration0to5, Duration5to10 etc. in your table? Can I also clarify that the times in second are the average over 1,000 tests, or the total in seconds for 1,000 tests? – Jon Bellamy Feb 21 '14 at 15:51
  • I haven't tested STWithin yet, I am just now testing geography instead of geometry. The values in the table above are counts grouped by zoom level where execution time falls within the given range. So after I run the tests I sum them like this: select [Level], count(1) from dbo.Results where [DurationSeconds] between 0 and 5 ... and so on for each of the range columns – Matt Morgan Feb 22 '14 at 15:20
  • You may find Geography performs slightly slower. Understand perfectly on the timing now thanks - over 20 seconds is a lot of time to wait - however, typically at zoom levels 2 through 6, how many rows do you get when there are over 7m rows in the database (assuming you didn't select the TOP 1000)? Without any other type of filter, your bounding box specified at zoom level 2 covers almost 6% of the entire globe. – Jon Bellamy Feb 23 '14 at 15:33
  • Hi Jon... I posted row counts in an edit to the question. Thanks for the suggestions. – Matt Morgan Feb 26 '14 at 15:18
  • Makes a lot of sense in some ways - but not in others. I'm not entirely sure what it is yet, but something isn't right. I have a db with 2m points AND 300k complex polygons - it takes me 23s to return 1,257,105 rows and 0.7s to return 23,010 rows using an STWithin filter. I've also tried with Filter and the timings are the same. If your FULLTEXT search runs as fast as you say, then it leaves the following possible causes; 1. Your Spatial Index, 2. your join with containstable, 3. Your where clauses OR 4. Your orderby clauses. – Jon Bellamy Feb 26 '14 at 16:02
  • with regards to 1. Have you played around with your Spatial Index settings - for example, try High, Low, Low, Low - I've often found that good with points. Your spatial index could be too large to be as efficient as you need it. for 2. Remove the full text search and join (just to see what the difference is). For 3, Make sure table1 has an index on the Hour column. For 4, put a index on the TimeStamp column (go descending order here). Lastly, have you tried times without `OPTION (MAXDOP 1)`. Try each of the above, in turn, recording the time difference - you can keep top 1000 on – Jon Bellamy Feb 26 '14 at 16:07
  • Also, if you can post up an execution plan, please do so. – Jon Bellamy Feb 26 '14 at 16:08

0 Answers0