1

I want to write a dynamic script that removes duplicates. I want to try and avoid a CURSOR so I've been looking into writing strings instead that will have table in one column and corresponding table attributes in another. I have also tried dynamic SQL using WITH. But this is what I have so far. This I intend to use as parameters in dynamic SQL later on

STUFF example. However this results in repeating the same column names for every row:

select name as table_name,
    stuff(( select  ', ' +char(10)+ ac.[name] FROM DW.sys.columns ac 
            inner join DW.sys.tables t on ac.object_id=t.object_id
            where ac.name not in ('ModifiedOn','ValidFrom','ValidTo')
            FOR XML PATH('')
            ), 1, 1, '')
    from sys.tables 

What I want is this output:

TableName || ColumnName  
table1    || aa,ab,ac      
table2    || ba,bb,bc    
table3    || ca,cb,cc   

My idea is to use this to this effect or similair:

'WITH DELETEDUPLICATE AS (
    SELECT '+@ColumnName+',
    ROW_NUMBER() OVER(PARTITION BY '+@ColumnName+' ORDER BY '+@ColumnName+') AS Duplicate_Row_Count
    FROM '+@TableName+'
    )
DELETE
FROM DELETEDUPLICATE
WHERE Duplicate_Row_Count > 1 

Any ideas appreciated!

UPDATE: With satishcse's suggestion i get the table I wanted. I had problem with getting multiple rows in the dynamic WITH step so I just removed that part as a varaible (removed away 'SET @WITH =' ). But how to execute every row? what i get now is:

WITH DELETEDUPLICATE AS(....

For every table per row

CobraKaj
  • 37
  • 6
  • I don't want to be that guy... but why not just do this once and then apply a unique constraint so there is no more duplicates? Fix the problem at its root. – S3S Oct 19 '18 at 16:47
  • I know..If we could, we would. – CobraKaj Oct 19 '18 at 16:57
  • Why can’t you? You can delete from it but not create a unique constraint? What about a after insert trigger? – S3S Oct 19 '18 at 21:27

2 Answers2

1

In OpenQuery you have to run the query using execute() function. The answer can solve your problem, but I do not suggest you use OpenQuery.

declare @query as nvarchar(max)

set
    @query = 
    'WITH DELETEDUPLICATE AS (
        SELECT '+@ColumnName+',
        ROW_NUMBER() OVER(PARTITION BY '+@ColumnName+' ORDER BY '+@ColumnName+') AS Duplicate_Row_Count
        FROM '+@TableName+'
        )
    DELETE
    FROM DELETEDUPLICATE
    WHERE Duplicate_Row_Count > 1'

execute(@query)
mohabbati
  • 1,162
  • 1
  • 13
  • 31
  • Why dont you suggest OpenQuery? – CobraKaj Oct 19 '18 at 16:58
  • 1
    OpenQuery is not a good practice because, for example, **debugging problems** and you cannot use **execution plan** and OpenQuery does not accept **variables for its arguments** and the query may return multiple result-sets but OpenQuery **returns only the first one** and some reasons more. – mohabbati Oct 19 '18 at 17:14
1

try the following for the first part:

select name as table_name,
    stuff(( select  ', ' +char(10)+ ac.[name] FROM DW.sys.columns ac 
            inner join DW.sys.tables t on ac.object_id=t.object_id
            where ac.name not in ('ModifiedOn','ValidFrom','ValidTo')
            and st.name = t.name
            order by 1
            FOR XML PATH('')
            ), 1, 1, '')
    from sys.tables st
sacse
  • 3,634
  • 2
  • 15
  • 24
  • of course it was something as simple as connecting the tables st and t.. Thanks! Now Im on the problem of executing all the rows I get after the 'WITH DELETEDUPLICATE AS...' – CobraKaj Oct 23 '18 at 11:53
  • how are you assigning values to @ column_name and @ tablename once you have the table in the first step...? – sacse Oct 23 '18 at 17:40
  • I do an insert into an table variable. From there I just use the columns from that variable e.g. '+ColumnList+' bla bla FROM '+@TableVariable+', inside above dynamic WITH query. Se above block/yellow box I added in original post. Why I want to execute it from string rows (or whatever you call it) is because I want to avoid using cursor. – CobraKaj Oct 23 '18 at 18:24
  • ;WITH DELETEDUPLICATE AS ( SELECT ColumnName, ROW_NUMBER() OVER(PARTITION BY ColumnName ORDER BY ColumnName) AS Duplicate_Row_Count FROM @tablename (table which we populated) ) delete FROM DELETEDUPLICATE WHERE Duplicate_Row_Count > 1 will delete the entries from temporary table only....I am not sure if that's the objective. – sacse Oct 25 '18 at 17:28
  • No, the resulting code, per row will be: WITH DELETEDUPLICATE as ( column1, column2,.., ROW_NUMBER() OVER(PARTITION BY column1, column2,... ORDER BY AS Duplicate_Row_Count FROM tableA) DELETE FROM DELETEDUPLICATE WHERE Duplicate_Row_Count > 1 and then the same for another table and its sets of columns. Now, How to execute the rows, preferable without cursor.. – CobraKaj Oct 26 '18 at 09:10