3

I am trying to create random data and insert it into a table. Right now just thinking what'd be the efficient approaches to get this done. e.g.

Create  volatile table mytb , no fallback, no journal
( C1 integer not null
  C2 Varchar (50) Not null , 
  C3 D1 Date Not null, 
  C4 D2 date not null 
) with data primary index ( c1) on commit preserve rows; 

What I want is to insert value randomly for X iterations for a specific List or range of each column value . e.g. C1 range is between 30 and 3000000 C2 is a list with ( 'approved','pending','unknown','disputed','wip','processed','pre-processed','denied' ) etc C3 is a date between 01-01-1999 to 12-31-2015 etc Then for say 1 million iterations I'd like to insert random values for these columns and CREATE a SKEW for certain values- that is there should be abundance of these values vs the rest. Has someone had a dig at this before . What the best way to do it - recursive Q logic ?

dnoeth
  • 59,503
  • 4
  • 39
  • 56
user1874594
  • 2,277
  • 1
  • 25
  • 49

1 Answers1

7

I use RANDOM to produce test data:

SELECT
   RANDOM(30,3000000) AS c1,
   CASE RANDOM(1,8) 
      WHEN 1 THEN 'approved'
      WHEN 2 THEN 'pending'
      WHEN 3 THEN 'unknown'
      WHEN 4 THEN 'disputed'
      WHEN 5 THEN 'wip'
      WHEN 6 THEN 'processed'
      WHEN 7 THEN 'pre-processed'
      WHEN 8 THEN 'denied'
   END,
   DATE '1999-01-01' + RANDOM(0,6208) -- up to 2015-12-31
FROM sys_calendar.CALENDAR -- any table with a large number of rows

This results in evenly distributed data, if you want skew you can run different insert/selects or play around with multiple RANDOMs:

RANDOM(1,50) + RANDOM(0,50)
(RANDOM(1, 50) * RANDOM(1,200) + RANDOM(0,100)) / 100.00
dnoeth
  • 59,503
  • 4
  • 39
  • 56
  • TY . How do you run this ( recursively ) for X iterations. – user1874594 Jan 07 '16 at 19:01
  • Never mind - I did nt notice the sys calender giving the iterations. I was aware of the random part- just wondered how'd'ya get those iterations and whalla... there it is ...no need for all the complicated logic I imagined. – user1874594 Jan 07 '16 at 19:07
  • 1
    This just might come in handy! Thanks! – Rob Paller Jan 07 '16 at 22:23
  • select top N RANDOM(30,3000000) AS c1, CASE RANDOM(1,8) WHEN 1 THEN 'approved' WHEN 2 THEN 'pending' WHEN 3 THEN 'unknown' WHEN 4 THEN 'disputed' WHEN 5 THEN 'wip' WHEN 6 THEN 'processed' WHEN 7 THEN 'pre-processed' WHEN 8 THEN 'denied' END, DATE '1999-01-01' + RANDOM(0,6208) FROM sys_calendar.CALENDAR would get you N rows – louigi600 Jan 18 '21 at 16:04