3

I'm running into a roadblock that seems like a simple fix, but I can't seem to figure out a solution. I'm hoping some folks on here with a bit more knowledge in Oracle SQL can help.

Essentially, I am building a list of jobs ranked by complete date, which is simple enough. The common link between them is that they are all for the same customer (distinct number) and at the same location (distinct number). My issue is that I need to find a way to interrupt and restart that rank/count from 1 every time the duration between two jobs is above 30 days. Here is a simplified version of what I'm looking at:

JOBCOUNT  ACCTNUM   LOCNUM      COMPDATE           DURATION  
--------  -------   ------      --------           -------- 
  2        001       003      8/21/2015 16:47     15.48763889        
  3        001       003      10/5/2015 11:31     41.98304398        
  4        001       003      10/19/2015 9:59     13.21804398        
  5        001       003      11/13/2015 15:23    24.43752315         
  6        001       003      11/30/2015 19:19    11.35537037        

How I would like to see it, due to the duration of over 30 on the second line, is as follows:

JOBCOUNT  ACCTNUM   LOCNUM      COMPDATE           DURATION  
--------  -------   ------      --------           -------- 
  2        001       003      8/21/2015 16:47     15.48763889        
  1        001       003      10/5/2015 11:31     41.98304398        
  2        001       003      10/19/2015 9:59     13.21804398        
  3        001       003      11/13/2015 15:23    24.43752315         
  4        001       003      11/30/2015 19:19    11.35537037 

Because the second job falls outside of the 30-day window, the chain should start over at one with the next job. My issue is that I cannot find a way to partition the data so that this criteria is recognized and also starts over the count. There is not another column that would allow me to partition the row_number or dense_rank in such a way (for instance, a common order key or systemic job sequence for each 30-day chain).

I have tried about 20 different ways to parse this data from threads I've read on here to no avail, so any help or ideas on how this can be implemented would be appreciated in a huge way. I have about 50,000 lines of data to which this sequencing needs to be applied. I've been teaching myself SQL for about a year now on the fly and on the job. I've hit my knowledge limit on this one.

Danny P.
  • 33
  • 1
  • 3

1 Answers1

2

Basically, you need an additional grouping. You can calculate the grouping by doing a cumulative sum on a flag calculated by the difference being greater then 30. Then the rest seems like row_number():

select t.*,
       row_number() over (partition by acctnum, locnum, grp order by compdate) as jobcount  
from (select t.*,
             sum(case when duration > 30 then 1 else 0 end) over
                 (partition by acctnum, locnum order by compdate) as grp
      from t
     ) t;

However, your sample data starts at 2 and not 1, which I don't fully understand.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Thanks, I will implement and update. The sample starts with 2 because the original job is being excluded earlier in the query due to it not being necessary in the output to be distributed (I have some columns joined to the end of my original data pull that add in the previous job info for reference, so the first job would be redundant). It can be re-included and excluded later in the process, if necessary. – Danny P. Feb 03 '16 at 22:43
  • Worked a charm. Thank you, Gordon. – Danny P. Feb 03 '16 at 23:09
  • 1
    years later and still helped me out big time. thank you – Jon Ekiz Jan 12 '21 at 01:15