How can I efficiently query and index a view that involves a full outer join?

Question

We have a data processing application that has two separate paths that should eventually produce similar results. We also have a database-backed monitoring service that compares and utilizes the results of this processing. At any point in time, either of the two paths may or may not have produced results for the operation, but I want to be able to query a view that tells me about any results that have been produced.

Here's a simplified example of the schema I started with:

create table LeftResult ( 
    DateId int not null, 
    EntityId int not null,
    ProcessingValue int not null

    primary key ( DateId, EntityId ) )
go

create table RightResult ( 
    DateId int not null, 
    EntityId int not null,
    ProcessingValue int not null

    primary key ( DateId, EntityId ) )
go

create view CombinedResults 
as
select
    DateId = isnull( l.DateId, r.DateId ),
    EntityId = isnull( l.EntityId, r.EntityId ),
    LeftValue = l.ProcessingValue,
    RightValue = r.ProcessingValue,
    MaxValue = case 
        when isnull( l.ProcessingValue, 0 ) > isnull( r.ProcessingValue, 0 )
        then isnull( l.ProcessingValue, 0 )
        else isnull( r.ProcessingValue, 0 )
        end
from LeftResult l
full outer join RightResult r 
    on l.DateId = r.DateId
    and l.EntityId = r.EntityId
go

The problem with this is that Sql Server always chooses to scan the PK on LeftResult and RightResult rather than seek, even when queries to the view include DateId and EntityId as predicates. This seems to be due to the isnull() checks on the results. (I've even tried using index hints and forceseek, but without avail -- the query plan still shows a scan.)

However, I can't simply replace the isnull() results, since either the left or right side could be missing from the join (because the associated process hasn't populated the table yet).

I don't particularly want to duplicate the MaxValue logic across all of the consumers of the view (in reality, it's quite a bit more complex calculation, but the same idea applies.)

Is there a good strategy I can use to structure this view or queries against it so that the query plan will utilize a seek rather than a scan?

Can you show us the select statement that uses the view and its execution plan? — acfrancis, Nov 06 '13 at 23:53
Hmm - I found a workaround that lets me use a left join instead of an outer join (basically I ensure that one side always exists). That along with heavy-handed index hints got my code performing properly again. Unfortunately that means I don't have an example handy anymore -- I'll see if I can create one again (without breaking my application :) ) — Chris Phillips, Nov 07 '13 at 01:00

score 0 · Answer 1 · answered Nov 06 '14 at 21:20

try using left outer join for one of the tables, then union those results with the excluded rows from the other table.

like:

select (...)
from LeftResult l
left outer join RightResult r 
    on l.DateId = r.DateId
    and l.EntityId = r.EntityId
(...)
UNION ALL
select (...)
from RightResult r  
leftouter join LeftResult l
    on l.DateId = r.DateId
    and l.EntityId = r.EntityId
WHERE
 l.dateid is null

How can I efficiently query and index a view that involves a full outer join?

1 Answers1