1

I want to cleanse some columns in a table of a massive database with random data. I have created a stored procedure CreateRandomString to create random data (to replace client names etc so we can use the data as a demo). Now I want to create the stored procedure that will take a table name and a column name and replace all the data in that column with random values (each column with a different random value).

I have done mySQL and Oracle in the past, but am now working in SQL Server, and I am stuck with the updating etc. I will not know the key etc on the table, as this is passed in as a parameter.

What I have so far:

CREATE OR ALTER PROC SanitiseData
    (@DataColumnName NVARCHAR(MAX),
     @DataTableName NVARCHAR(MAX))
AS
BEGIN
    DECLARE @RandomString NVARCHAR(MAX)

    ;WITH r AS
    (
         EXEC('SELECT ' + @DataColumnName + ', rn = ROW_NUMBER() OVER (ORDER BY ' + @DataColumnName + ') - 1
               FROM ' + @DataTableName)
    )

And to get a random string I just do:

EXEC CreateRandomString @RandomString = @RandomString OUTPUT

However, I can't use an EXEC there, so how can I use variables for table and field names, and then how can I loop through and update each row with a different value?

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
MicWit
  • 655
  • 12
  • 31

2 Answers2

2

This should do what you are looking for. Using unique identifiers to splatter your data with meaningless values though you could modify that as you see fit.

CREATE OR ALTER PROC SanitiseData
        (@DataColumnName NVARCHAR(MAX),
         @DataTableName NVARCHAR(MAX))
    AS
    BEGIN
        DECLARE @sql nvarchar(1024) = 'UPDATE ' + @DataTableName + ' SET ' +  @DataColumnName + ' = CAST(NEWID() as varchar(36))';
        EXEC sp_executesql @sql
    END;
Vincent
  • 842
  • 7
  • 13
  • This is good, but I need to get the value from the stored procedure so I can tweak how the values are created. This means I need to generate a new value for each record from the stored procedure. – MicWit Jan 23 '19 at 23:03
  • This will generate a new value for each record. – Vincent Jan 23 '19 at 23:51
  • If you want to get the results of this call back out you could add an output clause ie output inserted.* or inserted.@DataColumnName – Vincent Jan 23 '19 at 23:53
0

Hi i think this query can be work :

CREATE OR ALTER PROC SanitiseData
    (@DataColumnName NVARCHAR(MAX),
     @DataTableName NVARCHAR(MAX))
AS
BEGIN
    DECLARE @Query NVARCHAR(MAX)
@Query = 'SELECT ' + @DataColumnName + ', rn = ROW_NUMBER() OVER (ORDER BY ' + @DataColumnName + ') - 1
               FROM ' + @DataTableName
    Exec(@Query)

Sory for bad understanding :

You cas performing a update query dynamically like the query perform select.

@Query='Update ' + @tablename +f'set' + @collname + ‘ =´ + @newvalue + ´where ‘ +@whereclause

And then exec @query

Exec(@query)

EDIT for user function on update example:

CREATE FUNCTION ufAleaRandomString (@InputValue     varchar(MAX))
RETURNS varchar(MAX)
AS
BEGIN
    DECLARE @RetourValue varchar(MAX)
    SELECT @RetourValue = @InputValue + 'SomeModificationValue'
    RETURN @RetourValue
END
GO

CREATE TABLE #MyTable  
(PrimaryKey   int PRIMARY KEY,  
   KeyTransacFull      varchar(50) 
  );  
GO  


INSERT INTO #MyTable 
SELECT 1, '|H000021|B1|'

INSERT INTO #MyTable 
SELECT 2, '|H000021|B1|'


SELECT * FROM #MyTable


UPDATE #MyTable SET KeyTransacFull = dbo.ufAleaRandomString(KeyTransacFull) 

SELECT * FROM #MyTable

DROP TABLE  #MyTable

RESULT :

PrimaryKey     KeyTransacFull
1             |H000021|B1|
2             |H000021|B1|


PrimaryKey       KeyTransacFull
1               |H000021|B1|SomeModificationValue
2               |H000021|B1|SomeModificationValue
Sanpas
  • 1,170
  • 10
  • 29
  • But this means every row will have the value of @newvalue. The point is I want to get a different value from the stored procedure for every record. – MicWit Jan 23 '19 at 22:20
  • You can change your stored procedure that provided the @newvalue to a function and return the new calculated value and then you call the function getNewValue(@param) on the Update clause – Sanpas Jan 24 '19 at 06:49
  • Change CreateRandomString to function that return varchar and call this on set clause off the update – Sanpas Jan 24 '19 at 06:51
  • @MicWit Hi, i have edit to use random function on update, you can use this way if i understand what you want. – Sanpas Jan 24 '19 at 08:04