3

I'm trying to set a column in one table to a random foreign key for testing purposes. I attempted using the below query

update table1 set table2Id = (select top 1 table2Id from table2 order by NEWID())

This will get one table2Id at random and assign it as the foreign key in table1 for each row. It's almost what I want, but I want each row to get a different table2Id value.

I could do this by looping through the rows in table1, but I know there's a more concise way of doing it.

Erix
  • 7,059
  • 2
  • 35
  • 61

4 Answers4

8

On some test table my end your original plan looks as follows.

Original Plan

It just calculates the result once and caches it in a sppol then replays that result. You could try the following so that SQL Server sees the subquery as correlated and in need of re-evaluating for each outer row.

UPDATE table1
SET    table2Id = (SELECT TOP 1 table2Id
                   FROM   table2
                   ORDER  BY Newid(),
                             table1.table1Id)

For me that gives this plan without the spool.

New Plan

It is important to correlate on a unique field from table1 however so that even if a spool is added it must always be rebound rather than rewound (replaying the last result) as the correlation value will be different for each row.

If the tables are large this will be slow as work required is a product of the two table's rows (for each row in table1 it needs to do a full scan of table2)

Martin Smith
  • 438,706
  • 87
  • 741
  • 845
  • That actually did work. Thanks. I didn't know SQL Server was trying to sheck if the value was correlated on the fly. Interesting. It was about 4 times as fast as my looping solution. – Erix Oct 16 '12 at 20:39
2

I'm having another go at answering this, since my first answer was incomplete.

As there is no other way to join the two tables until you assign the table2_id you can use row_number to give a temporary key to both table1 and table2.

with
 t1 as (
  select row_number() over (order by table1_id) as row, table1_id
  from table1 )
,
t2 as (
  select row_number() over (order by NEWID()) as row, table2_id 
  from table2 )

update table1
set table2_id = t2.table2_id
from t1 inner join t2
on t1.row = t2.row

select * from table1

SQL Fiddle to test it out: http://sqlfiddle.com/#!6/bf414/12

Tony
  • 9,672
  • 3
  • 47
  • 75
  • This approach should be faster than mine (I was thinking of offering a similar alternative in my answer) but needs adjusting to cope with additional rows in `table1` than `table2`. It will also have different characteristics in that each number is allocated randomly excluding those already allocated as opposed to randomly from the full set. – Martin Smith Oct 16 '12 at 20:43
  • @MartinSmith - Thanks, I'd not thought about the situation where two rows would be assigned the same `id`. If it's for testing it might not be an issue but worth bearing in mind. – Tony Oct 16 '12 at 20:48
0

Broke down and used a loop for it. This worked, although it was very slow.

Select *
Into   #Temp
From   table1

Declare @Id int

While (Select Count(*) From #Temp) > 0
Begin

    Select Top 1 @Id = table1Id From #Temp

    update table1 set table2Id = (select top 1 table2Id from table2 order by NEWID()) where table1Id = @Id

    Delete #Temp Where table1Id = @Id

End
drop table #Temp
Erix
  • 7,059
  • 2
  • 35
  • 61
-1

I'm going to assume MS SQL based on top 1:

update table1 
set table2Id = 
  (select top 1 table2Id from table2 tablesample(1 percent))

(sorry, not tested)

MK.
  • 33,605
  • 18
  • 74
  • 111
  • I'm not sure what this is supposed to do, but the select part always returns null. – Erix Oct 16 '12 at 19:58
  • @Erix http://msdn.microsoft.com/en-us/library/ms189108(v=sql.105).aspx it is supposed to return a random sample of size 1 row from the table table2. If it returns null, perhaps table2 is empty. – MK. Oct 16 '12 at 20:00
  • strange, for certain values in tablesample (such as 1 in your snippet), it returns nothing. For others, it returns a sample. For example, 21 returns nothing, while 25 returns a sample size of 25 as expected – Erix Oct 16 '12 at 20:05
  • See here if you're curious: http://www.mssqltips.com/sqlservertip/1308/retrieving-random-data-from-sql-server-with-tablesample/ – Erix Oct 16 '12 at 20:06
  • 3
    http://stackoverflow.com/questions/10725839/tablesample-returns-wrong-number-of-rows – Aaron Bertrand Oct 16 '12 at 20:14