0

I'm trying to generate reports that involve overlapping geographical boundaries.

As a simplified example, zip codes inside a DMA (designated marketing area).

I have SQL geography data describing the polygon shape of the DMA, and SQL geography data describing the polygon shapes of US zip codes.

I'd like to find out:

  1. Which zip codes are fully contained inside the DMA
  2. Which zip codes partially overlap the DMA.
  3. What percentage of the area of the overlapping zip code is inside the DMA

I'm able to obtain this information using the STWithin and STIntersection methods, comparing the areas. However, I'm running into some heavy performance bottlenecks.

The first thing I've done is calculate and record the outer bounding boxes of the DMAs and zip code shapes, saved them in the table and indexed on that data. This has allowed me to exclude every zip code that cannot possibly overlap with the DMA.

Example:

SELECT  ...
FROM Locations d
INNER JOIN Locations z ON z.Type = 5 -- Zip Code
WHERE d.ID = @DMAID
AND NOT (
   z.MaxLat < d.MinLat
OR z.MinLat > d.MaxLat
OR z.MaxLng < d.MinLng
OR z.MinLng > d.MaxLng
)

However, depending on the quantity of the resulting zip codes to compare, the query can still take far too long to be usable in a production environment.

I'm hoping there may be some way to calculate the "inner bounding box" (for lack of a better word) similar to how STEnvelope calculates the "outer bounding box"

The idea being, what is the largest rectangle that will fit inside the supplied shape? This would allow a quick query of zip code bounding boxes that fully fit inside it, leaving a (hopefully) much smaller set of zip codes that would require the more precise STIntersection.

Edit: Added 5/25/2018

After a discussion below with Rick, I'm including the execution plan details for two queries, one using an AND NOT ( > OR pattern, and one reversed with AND ( <= AND.

SELECT ID
FROM Locations
WHERE LocationType = 5
AND CountryCode = 'US'
AND NOT (
   MaxLat < 32.750
OR MinLat > 34.823
OR MaxLng < -118.951
OR MinLng > -117.646
)

SELECT ID
FROM Locations
WHERE LocationType = 5
AND CountryCode = 'US'
AND MaxLat >= 32.750
AND MinLat <= 34.823
AND MaxLng >= -118.951
AND MinLng <= -117.646

Here is the XML view of the execution plans:

<ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.5" Build="13.0.4001.0">
  <BatchSequence>
    <Batch>
      <Statements>
        <StmtSimple StatementText="SELECT ID&#xd;&#xa;FROM Locations&#xd;&#xa;WHERE LocationType = 5&#xd;&#xa;AND CountryCode = &apos;US&apos;&#xd;&#xa;AND NOT (&#xd;&#xa;   MaxLat &lt; 32.750&#xd;&#xa;OR MinLat &gt; 34.823&#xd;&#xa;OR MaxLng &lt; -118.951&#xd;&#xa;OR MinLng &gt; -117.646&#xd;&#xa;)" StatementId="1" StatementCompId="1" StatementType="SELECT" RetrievedFromCache="false" StatementSubTreeCost="0.0226731" StatementEstRows="88.5595" SecurityPolicyApplied="false" StatementOptmLevel="TRIVIAL" QueryHash="0xA63252C178F68E1" QueryPlanHash="0x48CBE2D413E8362B" CardinalityEstimationModelVersion="70">
          <StatementSetOptions QUOTED_IDENTIFIER="true" ARITHABORT="true" CONCAT_NULL_YIELDS_NULL="true" ANSI_NULLS="true" ANSI_PADDING="true" ANSI_WARNINGS="true" NUMERIC_ROUNDABORT="false"></StatementSetOptions>
          <QueryPlan CachedPlanSize="24" CompileTime="3" CompileCPU="3" CompileMemory="344">
            <MemoryGrantInfo SerialRequiredMemory="0" SerialDesiredMemory="0"></MemoryGrantInfo>
            <OptimizerHardwareDependentProperties EstimatedAvailableMemoryGrant="2949120" EstimatedPagesCached="5898240" EstimatedAvailableDegreeOfParallelism="8" MaxCompileMemory="635169608"></OptimizerHardwareDependentProperties>
            <RelOp NodeId="0" PhysicalOp="Nested Loops" LogicalOp="Inner Join" EstimateRows="88.5595" EstimatedRowsRead="1466.67" EstimateIO="0.0209028" EstimateCPU="0.00177034" AvgRowSize="50" EstimatedTotalSubtreeCost="0.0226731" TableCardinality="86546" Parallel="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row">
              <OutputList>
                <ColumnReference Database="****" Schema="[dbo]" Table="[Locations]" Column="ID"></ColumnReference>
              </OutputList>
              <NestedLoops Optimized="0">
                <OuterReferences>
                  <ColumnReference Column="Expr1003"></ColumnReference>
                  <ColumnReference Column="Expr1004"></ColumnReference>
                  <ColumnReference Column="Expr1002"></ColumnReference>
                </OuterReferences>
                <RelOp NodeId="1" PhysicalOp="Compute Scalar" LogicalOp="Compute Scalar" EstimateRows="1" EstimateIO="0" EstimateCPU="0" AvgRowSize="37" EstimatedTotalSubtreeCost="0" Parallel="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row">
                  <OutputList>
                    <ColumnReference Column="Expr1003"></ColumnReference>
                    <ColumnReference Column="Expr1004"></ColumnReference>
                    <ColumnReference Column="Expr1002"></ColumnReference>
                  </OutputList>
                  <ComputeScalar>
                    <DefinedValues>
                      <DefinedValue>
                        <ValueVector>
                          <ColumnReference Column="Expr1003"></ColumnReference>
                          <ColumnReference Column="Expr1004"></ColumnReference>
                          <ColumnReference Column="Expr1002"></ColumnReference>
                        </ValueVector>
                        <ScalarOperator ScalarString="GetRangeWithMismatchedTypes(NULL,(34.823),(42))">
                          <Intrinsic FunctionName="GetRangeWithMismatchedTypes">
                            <ScalarOperator>
                              <Const ConstValue="NULL"></Const>
                            </ScalarOperator>
                            <ScalarOperator>
                              <Const ConstValue="(34.823)"></Const>
                            </ScalarOperator>
                            <ScalarOperator>
                              <Const ConstValue="(42)"></Const>
                            </ScalarOperator>
                          </Intrinsic>
                        </ScalarOperator>
                      </DefinedValue>
                    </DefinedValues>
                    <RelOp NodeId="2" PhysicalOp="Constant Scan" LogicalOp="Constant Scan" EstimateRows="1" EstimateIO="0" EstimateCPU="0" AvgRowSize="0" EstimatedTotalSubtreeCost="0" Parallel="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row">
                      <OutputList></OutputList>
                      <ConstantScan></ConstantScan>
                    </RelOp>
                  </ComputeScalar>
                </RelOp>
                <RelOp NodeId="3" PhysicalOp="Index Seek" LogicalOp="Index Seek" EstimateRows="88.5595" EstimatedRowsRead="1466.67" EstimateIO="0.0209028" EstimateCPU="0.00177034" AvgRowSize="50" EstimatedTotalSubtreeCost="0.0226731" TableCardinality="86546" Parallel="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row">
                  <OutputList>
                    <ColumnReference Database="****" Schema="[dbo]" Table="[Locations]" Column="ID"></ColumnReference>
                  </OutputList>
                  <IndexScan Ordered="1" ScanDirection="FORWARD" ForcedIndex="0" ForceSeek="0" ForceScan="0" NoExpandHint="0" Storage="RowStore">
                    <DefinedValues>
                      <DefinedValue>
                        <ColumnReference Database="****" Schema="[dbo]" Table="[Locations]" Column="ID"></ColumnReference>
                      </DefinedValue>
                    </DefinedValues>
                    <Object Database="****" Schema="[dbo]" Table="[Locations]" Index="[IX_MinMax]" IndexKind="NonClustered" Storage="RowStore"></Object>
                    <SeekPredicates>
                      <SeekPredicateNew>
                        <SeekKeys>
                          <Prefix ScanType="EQ">
                            <RangeColumns>
                              <ColumnReference Database="****" Schema="[dbo]" Table="[Locations]" Column="LocationType"></ColumnReference>
                              <ColumnReference Database="****" Schema="[dbo]" Table="[Locations]" Column="CountryCode"></ColumnReference>
                            </RangeColumns>
                            <RangeExpressions>
                              <ScalarOperator ScalarString="(5)">
                                <Const ConstValue="(5)"></Const>
                              </ScalarOperator>
                              <ScalarOperator ScalarString="N&apos;US&apos;">
                                <Const ConstValue="N&apos;US&apos;"></Const>
                              </ScalarOperator>
                            </RangeExpressions>
                          </Prefix>
                          <StartRange ScanType="GT">
                            <RangeColumns>
                              <ColumnReference Database="****" Schema="[dbo]" Table="[Locations]" Column="MinLat"></ColumnReference>
                            </RangeColumns>
                            <RangeExpressions>
                              <ScalarOperator ScalarString="[Expr1003]">
                                <Identifier>
                                  <ColumnReference Column="Expr1003"></ColumnReference>
                                </Identifier>
                              </ScalarOperator>
                            </RangeExpressions>
                          </StartRange>
                          <EndRange ScanType="LT">
                            <RangeColumns>
                              <ColumnReference Database="****" Schema="[dbo]" Table="[Locations]" Column="MinLat"></ColumnReference>
                            </RangeColumns>
                            <RangeExpressions>
                              <ScalarOperator ScalarString="[Expr1004]">
                                <Identifier>
                                  <ColumnReference Column="Expr1004"></ColumnReference>
                                </Identifier>
                              </ScalarOperator>
                            </RangeExpressions>
                          </EndRange>
                        </SeekKeys>
                      </SeekPredicateNew>
                    </SeekPredicates>
                    <Predicate>
                      <ScalarOperator ScalarString="****.[dbo].[Locations].[MaxLat]&gt;=(32.750) AND ****.[dbo].[Locations].[MaxLng]&gt;=(-118.951) AND ****.[dbo].[Locations].[MinLng]&lt;=(-117.646)">
                        <Logical Operation="AND">
                          <ScalarOperator>
                            <Compare CompareOp="GE">
                              <ScalarOperator>
                                <Identifier>
                                  <ColumnReference Database="****" Schema="[dbo]" Table="[Locations]" Column="MaxLat"></ColumnReference>
                                </Identifier>
                              </ScalarOperator>
                              <ScalarOperator>
                                <Const ConstValue="(32.750)"></Const>
                              </ScalarOperator>
                            </Compare>
                          </ScalarOperator>
                          <ScalarOperator>
                            <Compare CompareOp="GE">
                              <ScalarOperator>
                                <Identifier>
                                  <ColumnReference Database="****" Schema="[dbo]" Table="[Locations]" Column="MaxLng"></ColumnReference>
                                </Identifier>
                              </ScalarOperator>
                              <ScalarOperator>
                                <Const ConstValue="(-118.951)"></Const>
                              </ScalarOperator>
                            </Compare>
                          </ScalarOperator>
                          <ScalarOperator>
                            <Compare CompareOp="LE">
                              <ScalarOperator>
                                <Identifier>
                                  <ColumnReference Database="****" Schema="[dbo]" Table="[Locations]" Column="MinLng"></ColumnReference>
                                </Identifier>
                              </ScalarOperator>
                              <ScalarOperator>
                                <Const ConstValue="(-117.646)"></Const>
                              </ScalarOperator>
                            </Compare>
                          </ScalarOperator>
                        </Logical>
                      </ScalarOperator>
                    </Predicate>
                  </IndexScan>
                </RelOp>
              </NestedLoops>
            </RelOp>
          </QueryPlan>
        </StmtSimple>
        <StmtSimple StatementText="&#xd;&#xa;&#xd;&#xa;SELECT ID&#xd;&#xa;FROM Locations&#xd;&#xa;WHERE LocationType = 5&#xd;&#xa;AND CountryCode = &apos;US&apos;&#xd;&#xa;AND MaxLat &gt;= 32.750&#xd;&#xa;AND MinLat &lt;= 34.823&#xd;&#xa;AND MaxLng &gt;= -118.951&#xd;&#xa;AND MinLng &lt;= -117.646" StatementId="2" StatementCompId="2" StatementType="SELECT" RetrievedFromCache="false" StatementSubTreeCost="0.0226731" StatementEstRows="88.5595" SecurityPolicyApplied="false" StatementOptmLevel="TRIVIAL" QueryHash="0xA63252C178F68E1" QueryPlanHash="0x48CBE2D413E8362B" CardinalityEstimationModelVersion="70" ParameterizedText="(@1 tinyint,@2 varchar(8000),@3 numeric(5,3),@4 numeric(5,3),@5 numeric(6,3),@6 numeric(6,3))SELECT [ID] FROM [Locations] WHERE [LocationType]=@1 AND [CountryCode]=@2 AND [MaxLat]&gt;=@3 AND [MinLat]&lt;=@4 AND [MaxLng]&gt;=@5 AND [MinLng]&lt;=@6">
          <StatementSetOptions QUOTED_IDENTIFIER="true" ARITHABORT="true" CONCAT_NULL_YIELDS_NULL="true" ANSI_NULLS="true" ANSI_PADDING="true" ANSI_WARNINGS="true" NUMERIC_ROUNDABORT="false"></StatementSetOptions>
          <QueryPlan CachedPlanSize="32" CompileTime="2" CompileCPU="2" CompileMemory="336">
            <MemoryGrantInfo SerialRequiredMemory="0" SerialDesiredMemory="0"></MemoryGrantInfo>
            <OptimizerHardwareDependentProperties EstimatedAvailableMemoryGrant="2949120" EstimatedPagesCached="5898240" EstimatedAvailableDegreeOfParallelism="8" MaxCompileMemory="635178856"></OptimizerHardwareDependentProperties>
            <RelOp NodeId="0" PhysicalOp="Nested Loops" LogicalOp="Inner Join" EstimateRows="88.5595" EstimatedRowsRead="1466.67" EstimateIO="0.0209028" EstimateCPU="0.00177034" AvgRowSize="50" EstimatedTotalSubtreeCost="0.0226731" TableCardinality="86546" Parallel="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row">
              <OutputList>
                <ColumnReference Database="****" Schema="[dbo]" Table="[Locations]" Column="ID"></ColumnReference>
              </OutputList>
              <NestedLoops Optimized="0">
                <OuterReferences>
                  <ColumnReference Column="Expr1005"></ColumnReference>
                  <ColumnReference Column="Expr1006"></ColumnReference>
                  <ColumnReference Column="Expr1004"></ColumnReference>
                </OuterReferences>
                <RelOp NodeId="1" PhysicalOp="Compute Scalar" LogicalOp="Compute Scalar" EstimateRows="1" EstimateIO="0" EstimateCPU="0" AvgRowSize="37" EstimatedTotalSubtreeCost="0" Parallel="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row">
                  <OutputList>
                    <ColumnReference Column="Expr1005"></ColumnReference>
                    <ColumnReference Column="Expr1006"></ColumnReference>
                    <ColumnReference Column="Expr1004"></ColumnReference>
                  </OutputList>
                  <ComputeScalar>
                    <DefinedValues>
                      <DefinedValue>
                        <ValueVector>
                          <ColumnReference Column="Expr1005"></ColumnReference>
                          <ColumnReference Column="Expr1006"></ColumnReference>
                          <ColumnReference Column="Expr1004"></ColumnReference>
                        </ValueVector>
                        <ScalarOperator ScalarString="GetRangeWithMismatchedTypes(NULL,[@4],(42))">
                          <Intrinsic FunctionName="GetRangeWithMismatchedTypes">
                            <ScalarOperator>
                              <Const ConstValue="NULL"></Const>
                            </ScalarOperator>
                            <ScalarOperator>
                              <Identifier>
                                <ColumnReference Column="@4"></ColumnReference>
                              </Identifier>
                            </ScalarOperator>
                            <ScalarOperator>
                              <Const ConstValue="(42)"></Const>
                            </ScalarOperator>
                          </Intrinsic>
                        </ScalarOperator>
                      </DefinedValue>
                    </DefinedValues>
                    <RelOp NodeId="2" PhysicalOp="Constant Scan" LogicalOp="Constant Scan" EstimateRows="1" EstimateIO="0" EstimateCPU="0" AvgRowSize="0" EstimatedTotalSubtreeCost="0" Parallel="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row">
                      <OutputList></OutputList>
                      <ConstantScan></ConstantScan>
                    </RelOp>
                  </ComputeScalar>
                </RelOp>
                <RelOp NodeId="3" PhysicalOp="Index Seek" LogicalOp="Index Seek" EstimateRows="88.5595" EstimatedRowsRead="1466.67" EstimateIO="0.0209028" EstimateCPU="0.00177034" AvgRowSize="50" EstimatedTotalSubtreeCost="0.0226731" TableCardinality="86546" Parallel="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row">
                  <OutputList>
                    <ColumnReference Database="****" Schema="[dbo]" Table="[Locations]" Column="ID"></ColumnReference>
                  </OutputList>
                  <IndexScan Ordered="1" ScanDirection="FORWARD" ForcedIndex="0" ForceSeek="0" ForceScan="0" NoExpandHint="0" Storage="RowStore">
                    <DefinedValues>
                      <DefinedValue>
                        <ColumnReference Database="****" Schema="[dbo]" Table="[Locations]" Column="ID"></ColumnReference>
                      </DefinedValue>
                    </DefinedValues>
                    <Object Database="****" Schema="[dbo]" Table="[Locations]" Index="[IX_MinMax]" IndexKind="NonClustered" Storage="RowStore"></Object>
                    <SeekPredicates>
                      <SeekPredicateNew>
                        <SeekKeys>
                          <Prefix ScanType="EQ">
                            <RangeColumns>
                              <ColumnReference Database="****" Schema="[dbo]" Table="[Locations]" Column="LocationType"></ColumnReference>
                              <ColumnReference Database="****" Schema="[dbo]" Table="[Locations]" Column="CountryCode"></ColumnReference>
                            </RangeColumns>
                            <RangeExpressions>
                              <ScalarOperator ScalarString="CONVERT_IMPLICIT(int,[@1],0)">
                                <Identifier>
                                  <ColumnReference Column="ConstExpr1002">
                                    <ScalarOperator>
                                      <Convert DataType="int" Style="0" Implicit="1">
                                        <ScalarOperator>
                                          <Identifier>
                                            <ColumnReference Column="@1"></ColumnReference>
                                          </Identifier>
                                        </ScalarOperator>
                                      </Convert>
                                    </ScalarOperator>
                                  </ColumnReference>
                                </Identifier>
                              </ScalarOperator>
                              <ScalarOperator ScalarString="CONVERT_IMPLICIT(nvarchar(4000),[@2],0)">
                                <Identifier>
                                  <ColumnReference Column="ConstExpr1003">
                                    <ScalarOperator>
                                      <Convert DataType="nvarchar" Length="8000" Style="0" Implicit="1">
                                        <ScalarOperator>
                                          <Identifier>
                                            <ColumnReference Column="@2"></ColumnReference>
                                          </Identifier>
                                        </ScalarOperator>
                                      </Convert>
                                    </ScalarOperator>
                                  </ColumnReference>
                                </Identifier>
                              </ScalarOperator>
                            </RangeExpressions>
                          </Prefix>
                          <StartRange ScanType="GT">
                            <RangeColumns>
                              <ColumnReference Database="****" Schema="[dbo]" Table="[Locations]" Column="MinLat"></ColumnReference>
                            </RangeColumns>
                            <RangeExpressions>
                              <ScalarOperator ScalarString="[Expr1005]">
                                <Identifier>
                                  <ColumnReference Column="Expr1005"></ColumnReference>
                                </Identifier>
                              </ScalarOperator>
                            </RangeExpressions>
                          </StartRange>
                          <EndRange ScanType="LT">
                            <RangeColumns>
                              <ColumnReference Database="****" Schema="[dbo]" Table="[Locations]" Column="MinLat"></ColumnReference>
                            </RangeColumns>
                            <RangeExpressions>
                              <ScalarOperator ScalarString="[Expr1006]">
                                <Identifier>
                                  <ColumnReference Column="Expr1006"></ColumnReference>
                                </Identifier>
                              </ScalarOperator>
                            </RangeExpressions>
                          </EndRange>
                        </SeekKeys>
                      </SeekPredicateNew>
                    </SeekPredicates>
                    <Predicate>
                      <ScalarOperator ScalarString="****.[dbo].[Locations].[MaxLat]&gt;=[@3] AND ****.[dbo].[Locations].[MaxLng]&gt;=[@5] AND ****.[dbo].[Locations].[MinLng]&lt;=[@6]">
                        <Logical Operation="AND">
                          <ScalarOperator>
                            <Compare CompareOp="GE">
                              <ScalarOperator>
                                <Identifier>
                                  <ColumnReference Database="****" Schema="[dbo]" Table="[Locations]" Column="MaxLat"></ColumnReference>
                                </Identifier>
                              </ScalarOperator>
                              <ScalarOperator>
                                <Identifier>
                                  <ColumnReference Column="@3"></ColumnReference>
                                </Identifier>
                              </ScalarOperator>
                            </Compare>
                          </ScalarOperator>
                          <ScalarOperator>
                            <Compare CompareOp="GE">
                              <ScalarOperator>
                                <Identifier>
                                  <ColumnReference Database="****" Schema="[dbo]" Table="[Locations]" Column="MaxLng"></ColumnReference>
                                </Identifier>
                              </ScalarOperator>
                              <ScalarOperator>
                                <Identifier>
                                  <ColumnReference Column="@5"></ColumnReference>
                                </Identifier>
                              </ScalarOperator>
                            </Compare>
                          </ScalarOperator>
                          <ScalarOperator>
                            <Compare CompareOp="LE">
                              <ScalarOperator>
                                <Identifier>
                                  <ColumnReference Database="****" Schema="[dbo]" Table="[Locations]" Column="MinLng"></ColumnReference>
                                </Identifier>
                              </ScalarOperator>
                              <ScalarOperator>
                                <Identifier>
                                  <ColumnReference Column="@6"></ColumnReference>
                                </Identifier>
                              </ScalarOperator>
                            </Compare>
                          </ScalarOperator>
                        </Logical>
                      </ScalarOperator>
                    </Predicate>
                  </IndexScan>
                </RelOp>
              </NestedLoops>
            </RelOp>
            <ParameterList>
              <ColumnReference Column="@6" ParameterDataType="numeric(6,3)" ParameterCompiledValue="(-117.646)"></ColumnReference>
              <ColumnReference Column="@5" ParameterDataType="numeric(6,3)" ParameterCompiledValue="(-118.951)"></ColumnReference>
              <ColumnReference Column="@4" ParameterDataType="numeric(5,3)" ParameterCompiledValue="(34.823)"></ColumnReference>
              <ColumnReference Column="@3" ParameterDataType="numeric(5,3)" ParameterCompiledValue="(32.750)"></ColumnReference>
              <ColumnReference Column="@2" ParameterDataType="varchar(8000)" ParameterCompiledValue="&apos;US&apos;"></ColumnReference>
              <ColumnReference Column="@1" ParameterDataType="tinyint" ParameterCompiledValue="(5)"></ColumnReference>
            </ParameterList>
          </QueryPlan>
        </StmtSimple>
      </Statements>
    </Batch>
  </BatchSequence>
</ShowPlanXML>
Rick James
  • 135,179
  • 13
  • 127
  • 222
Jake McGraw
  • 135
  • 4
  • 11

1 Answers1

0

(These comments apply to MySQL; they may or may not apply to other vendors. The Answer was written before knowing that MSSql was the targer.)

Use De Morgan's law to avoid NOT and OR:

AND z.MaxLat >= d.MinLat
AND z.MinLat <= d.MaxLat
AND z.MaxLng >= d.MinLng
AND z.MinLng <= d.MaxLng

Don't do the entire SELECT until you have identified the zips and/or dmas that you need. This may allow a subquery to efficiently run in the index.

SELECT ...
    FROM ( SELECT z.id
               FROM zips z  JOIN dmas d
               WHERE d.ID = @DMAID
                 AND z.Type = 5
                 AND z.MaxLat >= d.MinLat ...
         ) x
    JOIN zips z2 ON x.id = z2.id

with this index on zips:

INDEX(Type,  -- first
      MinLat, MaxLat, MinLng, MaxLng,  -- any order
      id)   -- last

On OR, NOT

A BTree index...

  • For a 'point query', that is finding a single row, given the key.
  • A 'range' or rows. This includes a > 5, a BETWEEN 5 AND 10, `c LIKE 'foo%', etc.
  • Other types of queries are less efficient.
  • NOT usually involves two ranges, often covering most of the index: "everything before X and everything after X".
  • OR (involving a single column) involves hopping through the table.

When Multiple tables (or multiple columns) are involved, things are more complex.

  • Two tables -- mostly hopeless.
  • Two columns -- a = 5 AND ... may benefit from a composite index starting with a because a was tested with =.

The 6 column index I recommended is 'composite' and 'covering':

  • 'Composite' because it has more than 1 column.
  • Starts with Type, which is tested with '=', so more column(s) may come into play.
  • The second column is (MinLat) is tested with a range, so the rest of the columns won't be used for filtering or sorting.
  • 'Covering' comes into play because all of the columns (of zips) in the subquery (a "derived" table) are included in the index. This lets the action occur only in the index's BTree, without touching the "data" BTree.
  • In this use case, we don't really know which column would be best to put second in the index; I picked one arbitrarily.
Rick James
  • 135,179
  • 13
  • 127
  • 222
  • Thank you for this. Do you have any article you can give that would expand upon why one should avoid using NOT and OR together? – Jake McGraw May 24 '18 at 17:32
  • @JakeMcGraw - See Wikipedia for understanding a BTree; then read the stuff I added to my Answer. – Rick James May 24 '18 at 18:04
  • Thanks, I appreciate your time on this. However, it appears that the SQL compilation engine does not make a distinction between the two forms of those queries "AND NOT ( ... < OR " vs "AND ( >= AND". I ran a test on each and looked over the execution plan and statistic details. Both were identical in every way, including number of rows read, operator cost, IO cost, subtree cost, etc. I suspect the compiler maps both versions of the logic path to an identical bit of code responsible for traversing the index. – Jake McGraw May 25 '18 at 02:52
  • @JakeMcGraw - Could you provide `EXPLAIN FORMAT=JSON SELECT ...` for each. – Rick James May 25 '18 at 04:05
  • I don't believe there is an MSSQL equivalent to EXPLAIN FORMAT=JSON, so I did an XML output of the execution plans and added to the original post. – Jake McGraw May 25 '18 at 15:05
  • I can't help you with MSSql; I added a tag so that you might get some relevant answers. For one thing, SQL Server might not be building BTrees. – Rick James May 25 '18 at 15:33