I'm struggling to identify effective indexes (or rewrite the query) to improve a query with the following confounding predicates:
- JOIN on a date from one table being in range - between two date fields on second table (one is nullable, one is not nullable in PK).
- The date used is actually the value in date field (nullable) +1.
- WHERE clauses includes OR logic on multiple flag fields.
The simplified version of the query is:
select
d.dim_date_id
,f.dim_provider_id
,f.dim_event_id
,d.date
from DWH.dbo.tbl_fact_outcome f
join DWH.dbo.tbl_dim_date d on DATEADD(DAY,1,d.date) between f.known_from and f.known_to
where
f.known_from > getdate()-12
and (d.flag_latest_day = 'Y' or d.flag_end_of_month = 'Y' or (d.flag_end_of_week = 'Y' AND d.flag_latest_week = 'Y'))
and d.flag_future_day = 'N'
and f.deleted = 0
tbl_fact_outcome has these indexes:
- PK clustered index on input_form_id, known_from
- Non-unique Nonclustered index on deleted, known_from, known_to (INCLUDES the required _dim_id fields)
tbl_dim_date has these indexes:
- PK clustered index on dim_date_id
- Non-unique nonclustered index on flag_future_day, date (INCLUDES relevant flag fields)
At present, it estimates 853 rows but returns 16,784.
Here is the query plan: https://www.brentozar.com/pastetheplan/?id=rydKb_3AI
Statistics are up to date. I have tried re-ordering the covering indexes but no improvement.
I'm totally stumped as to what else to try with indexes or the code itself to improve performance, so any pointers appreciated.
EDIT 05/07/2020
Ruled out following suggestions here:
- Filtered index (on deleted) on tbl_fact_outcome - less than 1% of records would be filtered out, so not worthwhile
- Filtered index (using entire WHERE clause from query) on tbl_dim_date - not possible to use OR in index
- Index on tbl_dim_date with INCLUDEd fields as key fields - tried this, made no difference, not used by optimizer.