Some of my tables are of type REPLICATE. I would like these tables to be actually replicated (not pending) before I start querying my data. This will help me avoid data movement.
I have a script, which I found online, that loops over all the tables set for replication and runs a SELECT TOP 1 against each, but sometimes the script runs for hours. It seems that the server sometimes won't trigger replication even if you do a SELECT TOP 1 from foo.
How can you force SQL Data Warehouse to complete replication?
The script looks something like this:
begin
    -- Build one SELECT TOP(1) statement per replicated table whose cache is not ready.
    CREATE TABLE #tbl
    WITH
    ( DISTRIBUTION = ROUND_ROBIN
    )
    AS
    SELECT
        ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS Sequence
      , CONCAT('SELECT TOP(1) * FROM ', s.name, '.', t.[name]) AS sql_code
    FROM sys.pdw_replicated_table_cache_state AS p
    JOIN sys.tables AS t
        ON t.object_id = p.object_id
    JOIN sys.schemas AS s
        ON t.schema_id = s.schema_id
    WHERE p.[state] = 'NotReady';

    DECLARE @nbr_statements INT =
            (
                SELECT COUNT(*)
                FROM #tbl
            )
          , @i INT = 1;

    -- Run each generated statement to touch the table.
    WHILE @i <= @nbr_statements
    BEGIN
        DECLARE @sql_code NVARCHAR(4000) = (SELECT sql_code
                                            FROM #tbl
                                            WHERE Sequence = @i);
        EXEC sp_executesql @sql_code;
        SET @i += 1;
    END;

    DROP TABLE #tbl;

    -- Loop until no replicated table reports 'NotReady',
    -- printing a progress message every 100 iterations.
    SET @i = 0;
    WHILE
    (
        SELECT TOP (1)
            p.[state]
        FROM sys.pdw_replicated_table_cache_state AS p
        JOIN sys.tables AS t
            ON t.object_id = p.object_id
        JOIN sys.schemas AS s
            ON t.schema_id = s.schema_id
        WHERE p.[state] = 'NotReady'
    ) = 'NotReady'
    BEGIN
        IF @i % 100 = 0
        BEGIN
            RAISERROR('Replication in progress', 0, 0) WITH NOWAIT;
        END;
        SET @i = @i + 1;
    END;
END