1

I have a table where there are 4 columns namely :Book_no,Prev_Book_no(previous edition book_no),Edition_no(starts with 0 upto latest edition no(increments by 1)), Tree(Initially null).

What I intend to do is to create a tree of Book_no starting from highest edition to edition no 0.

Ex: (Book_no,Prev_Book_no,Edition_no) Values: (bbb,null,0), (ccc,bbb,1), (ddd,ccc,2), (eee,ddd,3), (fff,eee,4)

Then tree for every row should be:(bbb),(ccc,bbb),(ddd,ccc,bbb),(eee,ddd,ccc,bbb) and (fff,eee,ddd,ccc,bbb).

For this I tried using Recursive CTE with code as below:

with cte(book_no,prev_book_no,tree,edition_no) as

(select

book_no,prev_book_no,tree,edition_no

from books

where edition_no>=0

union all

select e.book_no,e.prev_book_no,concat(nvl(e.tree,''),','+e.prev_book_no),e.edition_no

from

cte e inner join books f

on e.prev_book_no=f.book_no

)

select distinct * from cte

order by edition_no;

But Redshift does not allow using cte table in from clause and gives the error. 'Table cte does not exist'.

P.S There are multiple different books in table.For example I have mentioned only one.

Is there any alternative to this in Redshift?

Edit:

Sample Data:

Sample Data

Expected Output:

Expected Output

Logic: I want to get a hierarchy of book no's and am using concat operation for the same.(Current code might be wrong) Thanks!

NewCoder
  • 31
  • 1
  • 5

3 Answers3

1

Recursive CTEs are now supported in Redshift starting April 29th, 2021 using the WITH RECURSIVE syntax:

WITH RECURSIVE cte(book_no, prev_book_no, tree, edition_no) AS (
    SELECT book_no, prev_book_no, tree, edition_no
    FROM books
    WHERE edition_no >= 0
    UNION ALL
    SELECT e.book_no, e.prev_book_no, NVL(e.tree,'')||','||e.prev_book_no), e.edition_no
    FROM cte         e
    INNER JOIN books f ON e.prev_book_no = f.book_no)
SELECT DISTINCT *
FROM cte
ORDER BY
    edition_no;

https://aws.amazon.com/about-aws/whats-new/2021/04/amazon-redshift-announces-support-for-heirarchical-data-queries-with-recursive-cte/ https://docs.aws.amazon.com/redshift/latest/dg/r_WITH_clause.html#r_WITH_clause-recursive-cte

Joe Harris
  • 13,671
  • 4
  • 47
  • 54
0

You correctly choosed recursive CTE. Unfortunately, Redshift does not support them. See also this similar question. I am not aware of any SQL feature which could be considered as full value workaround. Perhaps if number of levels is practically limited, you could use finite number of left self joins (ugly, I know).

(To your CTE - although little bit offtopic - it seems condition where edition_no=0 would suffice as a seed in first part of union, also connect condition in second query should be turned into on f.prev_book_no=e.book_no.)

Tomáš Záluský
  • 10,735
  • 2
  • 36
  • 64
0

Recursive queries are not supported by Amazon Redshift database. https://docs.aws.amazon.com/redshift/latest/dg/c_unsupported-postgresql-features.html

I'm not good in this dialect. Try this code, probably it works and solves your problem:

select book_no,prev_book_no,edition_no,
(
select listagg(innr.book_no,',') within group (order by innr.edition_no asc) over() 
from books innr 
where innr.book_no=outr.book_no and innr.edition_no<=outr.edition_no
) tree
from books outr
order by book_no

Variant with a loop. Try to execute it:

create local temporary table IF NOT EXISTS 
table1 (book_id bigint, book_no varchar(10), edition_no varchar(10));

insert into table1 (book_id, book_no, edition_no) 
select row_number () over (order by book_no) as book_id, book_no, edition_no 
from books where edition_no=0;

FOR iter IN SELECT distinct edition_no 
FROM books where edition_no!=0 ORDER BY edition_no LOOP
  insert into table1 (book_id, book_no, edition_no) 
  select t.book_id, b.book_no, b.edition_no 
  from table1 t join books b on t.book_id=b.prev_book_no
  where b.edition_no = iter
  ;
  commit;
  END LOOP;

select book_no,prev_book_no,edition_no,
(
select listagg(innr.book_no,',') within group (order by innr.edition_no asc) over() 
from books innr inner join table1 tbl_innr on innr.book_no=tbl_innr.book_no
where tbl_innr.book_id=tbl_outr.book_id and innr.edition_no<=outr.edition_no
) tree
from books outr join table1 tbl_outr on outr.book_no=tbl_outr.book_no
order by book_no
;
drop table table1;
Vad1m
  • 389
  • 1
  • 14
  • Thanks for the solution. But it seems to work for only one such tree, when I try it on entire dataset, it gives the error : Result size exceeds LISTAGG limit. (Probably it is grouping all books in a particular edition number). Could you include this in your code as well? – NewCoder May 18 '20 at 14:58
  • @NewCoder Do you have any ID for entire book (same for all editions)? – Vad1m May 18 '20 at 15:07
  • No, there is no such ID column – NewCoder May 18 '20 at 16:10
  • @NewCoder I've just added new version. Try it – Vad1m May 18 '20 at 18:18