One more idea: Use the function just to do the calculation for your business logic and hand in the non-deterministic part. In your case you seem to pick a random number between zero and the highest employeeID (what about missing IDs?)
As told before RAND()
is difficult to use. It will come back with the same value in multiple calls. Therefore I use NEWID
, cast it to VARBINARY(8)
and cast this to BIGINT
.
Have a look at this:
This function will take a GUID
and scale it between the given borders:
CREATE FUNCTION dbo.GetRandomNumber(@lowerLimit BIGINT, @upperLimit BIGINT, @GuidValue UNIQUEIDENTIFIER)
RETURNS BIGINT
AS
BEGIN
RETURN
(
SELECT ABS(CAST(CAST(@GuidValue AS VARBINARY(8)) AS BIGINT)) % (@upperLimit-@lowerLimit)+@lowerLimit
)
END
GO
--This table will be filled with random values
CREATE TABLE testTable(RndNumber BIGINT,Tile INT);
--The CTE creates more than 6 mio dummy rows
WITH manyRows AS
(
SELECT 1 AS nr FROM master..spt_values CROSS JOIN master..spt_values AS x
)
INSERT INTO testTable(RndNumber)
SELECT dbo.GetRandomNumber(-300,700,NEWID()) --<-- Here I pass in the non-deterministic part
FROM manyRows;
--Now the table is tiled in 10 equal fragments
WITH UpdateableCTE AS
(
SELECT Tile
,NTILE(10) OVER(ORDER BY RndNumber) AS tileValue
FROM testTable
)
UPDATE UpdateableCTE SET Tile=tileValue;
--check the random result
SELECT * FROM testTable
ORDER BY Tile,RndNumber;
--This shows clearly, that the covered range per tile is almost the same which is a proof for a fairly good random spread
SELECT Tile
,COUNT(*) CountOfNumbers
,MAX(RndNumber)-MIN(RndNumber) CoveredRange
FROM testTable
GROUP BY Tile
ORDER BY Tile;
GO
--clean up
DROP TABLE dbo.testTable;
DROP FUNCTION dbo.GetRandomNumber;
The result
T Counts min max CoveredRange
1 636553 -300 -201 99
2 636553 -201 -101 100
3 636553 -101 0 101
4 636553 0 99 99
5 636553 99 199 100
6 636553 199 299 100
7 636553 299 399 100
8 636553 399 499 100
9 636553 499 599 100
10 636552 599 699 100
You can see, that each tile covers roughly the same count of elements. The elements inside are covering almost the same range. This shows, that the numbers are evenly spread within the table.