1

We have a huge table with millions of records. We want to fetch only first record based on some key in that table. When I tried using self-join it's resulting in temp db space issue. From the performance point of view, I was thinking of using an indexed view. But I can't use left join or CTE in an indexed view.

Is there any way available to fetch only non duplicate records using indexed views?

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Icarus
  • 415
  • 4
  • 12
  • 1
    It's been a while so i'm unsure if you can, but have you looked into using windowing functions within the indexed view? – Rich Benner Nov 12 '18 at 10:10
  • That's a very good point - try using a windowing function instead of a self join. Indexes views are usually no solution for anything. – Nick.Mc Nov 12 '18 at 10:59
  • We have a view which is used in more than 50 places so I was thinking of achieving the desired result with minimum impact. – Icarus Nov 12 '18 at 11:53
  • @RichBenner I tried rank function CTE but the indexed view is not allowing them. maybe I need to drop that views and create a table with the same name ;) underline table updated once daily. – Icarus Nov 12 '18 at 11:55
  • No, you won't be able to use a CTE but you should be able to use a window function using `ROW_NUMBER()` in your select. https://www.red-gate.com/simple-talk/sql/t-sql-programming/introduction-to-t-sql-window-functions/ – Rich Benner Nov 12 '18 at 12:12
  • What do you mean "first record"? Ordered by some column other than the "key" ones? – serge Nov 12 '18 at 12:17

1 Answers1

0

The scenario is that an indexed view contains distinct values of the "key" that are joined to the first row on the source table.

Prepare example data

SET NOCOUNT ON;
CREATE TABLE dbo.t (
  product_name nvarchar(20), 
  vendor_name nvarchar(20)
)
GO
INSERT INTO dbo.t (product_name, vendor_name) VALUES 
('SQL Server', 'Microsoft'),
('Oracle', 'Oracle'),
('DB2', 'IBM'),
('Oracle', 'Oracle'),
('Oracle', 'Oracle'),
('DB2', 'IBM'),
('DB2', 'IBM'),
('DB2', 'IBM');

Create indexed view

IF OBJECT_ID ('dbo.v_t', 'view') IS NOT NULL  
    DROP VIEW dbo.v_t
GO
CREATE VIEW dbo.v_t
WITH SCHEMABINDING
AS
    SELECT COUNT_BIG(*) prod_count, vendor_name
    FROM dbo.t
    GROUP BY vendor_name
GO
CREATE UNIQUE CLUSTERED INDEX IX1_v_t
    ON dbo.v_t (vendor_name);  
GO  

The query

SELECT t.* 
FROM dbo.v_t v
CROSS APPLY(SELECT TOP 1 * 
            FROM dbo.t t 
            WHERE v.vendor_name = t.vendor_name 
            ORDER BY t.product_name) t

Result

product_name         vendor_name
-------------------- --------------------
DB2                  IBM
SQL Server           Microsoft
Oracle               Oracle
serge
  • 992
  • 5
  • 8
  • if i change insert statement to INSERT INTO dbo.TBD_TEST (product_name, vendor_name) VALUES ('SQL Server', 'Microsoft'), ('Oracle', 'Oracle'), ('DB2', 'IBM1'), ('Oracle', 'Oracle'), ('Oracle', 'Oracle'), ('DB2', 'IBM2'), ('DB2', 'IBM3'), ('DB2', 'IBM4'); i need only IBM1 and discard other without changing the way i use view in stored procedures – Icarus Nov 12 '18 at 13:45
  • @Icarus it's a test data example to process and discard unused rows later – serge Nov 12 '18 at 14:13