0

I have the following in an Azure notebook (databricks sql):

CREATE TABLE my_new_big_table AS
  SELECT t.*
  FROM my_table t
  CROSS JOIN VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10) v(i);

This duplicates the my_table, 10 times, however how can I use a loop (SEQUENCE, EXPLORE) to create iterations of 100, 1000 etc?

pymat
  • 1,090
  • 1
  • 23
  • 45
  • You don't use a loop. You cross join with an ad-hoc table. Right now you create the rows of the ad-hoc table with the `VALUES` clause. Instead you can use `SEQUENCE` to create an array and `EXPLODE` to convert this array into rows. Please try it yourself first, before coming here to ask for help. – Thorsten Kettner Feb 17 '23 at 10:33
  • Does [`sequence`](https://learn.microsoft.com/en-us/azure/databricks/sql/language-manual/functions/sequence) function help? `sequence(1, 100)`? See if you can join the table with array using `explode`? i.e. `explode(sequence(1, 100))` – shahkalpesh Feb 17 '23 at 10:46
  • Yes I tried already CROSS JOIN SEQUENCE(0, 100); – pymat Feb 17 '23 at 10:51
  • If only SQL,use CROSS JOIN VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10) v(i) CROSS JOIN VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10) v2(i) CROSS JOIN VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10) v3(i) - create 1000 iterations. – ValNik Feb 17 '23 at 11:35
  • Depending of the version of SQL Server you have, you can use GENERATE_SERIES (n, m) to generate a serie of numbers between n and m boudary values of the interval. – SQLpro Feb 17 '23 at 12:26

1 Answers1

2

This is my sample data:

enter image description here

With your code, I got the output like below which have every row repeated 10 times.

enter image description here

To do the above dynamically, you can use range(start,end) like below.

%sql
CREATE TABLE result2 AS
  SELECT t.* 
  FROM sample1 as t
  CROSS JOIN (select * from range(0,10)) v(i);
select * from result2;

My Result:

enter image description here

(Or)

Use sequence(start,end) with explode() as discussed by community in comments.

create table result3 as
SELECT t.* 
FROM sample1 as t 
CROSS JOIN (select explode(sequence(1,10))) v(i);
select * from result3;

enter image description here

Rakesh Govindula
  • 5,257
  • 1
  • 2
  • 11
  • thank you, this is more or less what I wanted. I would have liked however that ther "id" doesnt contain duplicates, so maybe replace all entries in the "id" with something unique, like the row number. – pymat Feb 22 '23 at 07:39