0

Using Oracle, I'm looking to do the following query, but I'd like to know if there is a more "intelligent" way to do it.

Select * from Sales Sales1
left join Sales Sales2 on Sales2.val = Sales1.val
left join Sales Sales3 on Sales3.val = Sales2.val
left join Sales Sales4 on Sales4.val = Sales3.val
left join Sales Sales5 on Sales5.val = Sales4.val
...

Here's what my sample data might look like

customer number | acct | start balance | open date | prev account 
       a            1         100         01-01-15       b-1     
       b            1          80         03-04-14       
       c            2         200         04-11-14       c-1
       c            1         150         06-12-15        
       d            1         600         08-16-15         
       e            3         400         12-19-15       e-2
       e            2         150         10-21-14       e-1
       e            1         100         01-18-13       

And a result set would look like this:

   Customer | start | open    | prevStart_01 | prevOpen_01 | prevStart_02 | prevOpen_02    
        a-1 |    100| 01-01-15|            80 |   03-04-14 |             | 
        c-2 |    200| 04-11-14|           150 |   06-11-14 |             | 
        e-3 |    400| 12-19-15|           150 |   10-21-14 |          100|   01-18-13 

As you can see, I need to keep joining another record of sales based upon the result, and I need to keep doing so until I return an empty result set. My current scenario is running the query and seeing whether there are values in sales5, sales6, sales7, and so on.

Patrick Schomburg
  • 2,494
  • 1
  • 18
  • 46
  • @xQbert I need a few columns from each join. – Patrick Schomburg Dec 15 '15 at 14:07
  • After looking a hierarchical query will likely not help. unpivoting the data may however. I would need to know table structure of relevant fields and sample data and expected results to know however. Here's a [example unpivot](https://oracle-base.com/articles/11g/pivot-and-unpivot-operators-11gr1) it may or may not help you as the question is still a bit vague on expected results. The issue I see is that the data isn't normalized which is making querying require additional joins. But why do you need the joins in the first place? Provide sample data expected results. – xQbert Dec 15 '15 at 14:09
  • @xQbert I added some sample data and what I'm looking for, but I think your answer is close to what I'm looking for. – Patrick Schomburg Dec 15 '15 at 14:21
  • ah then you are after a dynamic pivot, not a unpivot. you want to pivot data on prevstart and prevOpen. Here's an example: http://stackoverflow.com/questions/11987067/pivoting-of-data-using-two-columns – xQbert Dec 15 '15 at 14:40
  • @xQbert what would the IN clause look like here? The matching column is a concatenation of the customer number and account. – Patrick Schomburg Dec 15 '15 at 15:37

1 Answers1

2

Whenever you have to self-join an unknown number of times, you should be thinking CONNECT BY. Your particular need here isn't so straightforward, but CONNECT BY is still the key element of the solution.

In the SQL below, the mockup_data subfactor is just to give me some data. You'd use your actual table.

The idea is that you search your data for "root" -- records that are not a prev_account of any other record. Then, you start with those and CONNECT BY to get all their previous accounts, as many as there are. Then you PIVOT to get them all into columns.

One thing -- an Oracle SQL statement cannot have an arbitrary (data-driven) number of columns. The number must be known when the SQL is parsed. Therefore, in your PIVOT clause, you need to specify the maximum number of "levels" you'll support, so that Oracle knows how many columns the result set could have.

Here's the SQL.

WITH 
mockup_data as (
SELECT   
'a' customer_Number,           1 acct,         100 start_balance,        to_date('01-01-15','MM-DD-YY') open_date,       'b-1' prev_account from dual union all      
SELECT 'b'            ,1,          80,         to_date('03-04-14','MM-DD-YY'), null       from dual union all 
SELECT 'c'            ,2,         200,         to_date('04-11-14','MM-DD-YY'),       'c-1' from dual union all 
SELECT 'c'            ,1,         150,         to_date('06-12-15','MM-DD-YY'),        null from dual union all 
SELECT 'd'            ,1,         600,         to_date('08-16-15','MM-DD-YY'),        null from dual union all 
SELECT 'e'            ,3,         400,         to_date('12-19-15','MM-DD-YY'),       'e-2' from dual union all 
SELECT 'e'            ,2,         150,         to_date('10-21-14','MM-DD-YY'),     'e-1' from dual union all 
SELECT 'e'            ,1,         100,         to_date('01-18-13','MM-DD-YY'),     null  from dual ),
data_with_roots AS
       (SELECT d.*,
               CASE
                 WHEN (SELECT COUNT (*)
                       FROM   mockup_data d2
                       WHERE  d2.prev_account = d.customer_number || '-' || d.acct) = 0 THEN
                   'Y'
                 ELSE
                   'N'
               END
                 is_root
        FROM   mockup_data d),
     hierarchy AS
       (SELECT CONNECT_BY_ROOT (customer_number) customer_number,
               CONNECT_BY_ROOT (acct) acct,
               CONNECT_BY_ROOT (start_balance) start_balance,
               CONNECT_BY_ROOT (open_date) open_date,
               start_balance prev_start_balance,
               open_date prev_open_date,
               LEVEL - 1 lvl
        FROM   data_with_roots d
        CONNECT BY customer_number || '-' || acct = PRIOR prev_account
        START WITH is_root = 'Y'),
     previous_only AS
       (SELECT *
        FROM   hierarchy
        WHERE  lvl >= 1)
SELECT *
FROM   previous_only PIVOT (MAX (prev_start_balance) AS prev_start, MAX (prev_open_date) AS prev_open
                     FOR lvl
                     IN (1 AS "01", 2 AS "02", 3 AS "03", 4 AS "04", 5 AS "05" -- etc... as many levels as you need to support
                                                                              ));
Matthew McPeak
  • 17,705
  • 2
  • 27
  • 59
  • 1
    One idea -- it may be better to connect by in the other direction (i.e., `START WITH prev_account IS NULL`. That would allow you do use `CONNECT_BY_ISLEAF` to determine what I am calling "roots" above. It'd probably be more efficient. Something for you to play around with... – Matthew McPeak Dec 15 '15 at 15:57
  • Thank you. This is really thorough. Is it possible to, in the pivot clause, use MAX(acct) to drive the number of resulting columns, or is it essential to list them out ? – Patrick Schomburg Dec 15 '15 at 16:02
  • 1
    If you want to do this in a single query and not use dynamic SQL, you need to list them out. That's usually important for your client application as well -- to know how many columns its going to get and what they're called. If you really don't like it, you could skip the `PIVOT` and use `LISTAGG` to lump everything into one big column (delimited by pipes or whatever) and have your client application parse that. – Matthew McPeak Dec 15 '15 at 16:10