4

I'm trying to find the min and max date within a subgroup of another group. Here's example 'data'

ID   Type  Date
1     A    7/1/2015
1     B    1/1/2015
1     A    8/5/2014
22    B    3/1/2015
22    B    9/1/2014
333   A    8/1/2015
333   B    4/1/2015
333   B    3/29/2014
333   B    2/28/2013
333   C    1/1/2013

What I'd like to identify is - within an ID, what is the min/max Date for each block of similar Type? So for ID # 333 I want the below info:

A: min & max = 8/1/2015
B: min = 2/28/2013
   max = 4/1/2015
C: min & max = 1/1/2013

I'm having trouble figuring out how to identify only uninterrupted groupings of Type within a grouping of ID. For ID #1, I need to keep the two 'A' Types with separate min/max dates because they were split by a Type 'B', so I can't just pull the min date of all Type A's for ID #1, it has to be two separate instances.

What I've tried is something like the below two lines, but neither of these accurately captures the case mentioned above for ID #1 where Type B interrupts Type A.

Max(Date) OVER (Partition By ID, Type)

or this:
Row_Number() OVER (Partition By ID, Type ORDER BY Date DESC)
  ,then selecting Row #1 for max date, and date ASC w/ row #1 for min date

Thank you for any insight you can provide!

Megan
  • 43
  • 4
  • Anyone able to provide any suggestions for a potential solution? Even if you don't have the full answer, any advice on how to think about this would be really helpful! – Megan Aug 20 '15 at 22:50

1 Answers1

2

If I understand right, you want the min/max values for an id/type grouped using a descending date sort, but the catch is that you want them based on clusters within the id by time.

What you can do is use CONDITIONAL_CHANGE_EVENT to tag the rows on change of type, then use that in your GROUP BY on a standard min/max aggregation.

This would be the intermediate step towards getting to what you want:

select ID, Type, Date,
  CONDITIONAL_CHANGE_EVENT(Type) OVER( PARTITION BY ID ORDER BY Date desc) cce
from mytable
group by ID, Type, Date
order by ID, Date desc, Type

ID  Type Date                   cce
1   A    2015-07-01 00:00:00    0
1   B    2015-01-01 00:00:00    1
1   A    2014-08-05 00:00:00    2
22  B    2015-03-01 00:00:00    0
22  B    2014-09-01 00:00:00    0
333 A    2015-08-01 00:00:00    0
333 B    2015-04-01 00:00:00    1
333 B    2014-03-29 00:00:00    1
333 B    2013-02-28 00:00:00    1
333 C    2013-01-01 00:00:00    2

Once you have them grouped using CCE, you can do an aggregate on this to get the min/max you are looking for grouping on cce. You can play with the order by at the bottom, this ordering seem to make the most sense to me.

select id, type, min(date), max(date)
from (
select ID, Type, Date,
  CONDITIONAL_CHANGE_EVENT(Type) OVER( PARTITION BY ID ORDER BY Date desc) cce
from mytable
group by ID, Type, Date
) x
group by id, type, cce
order by id, 3 desc, 4 desc;

id  type min                    max
1   A    2015-07-01 00:00:00    2015-07-01 00:00:00
1   B    2015-01-01 00:00:00    2015-01-01 00:00:00
1   A    2014-08-05 00:00:00    2014-08-05 00:00:00
22  B    2014-09-01 00:00:00    2015-03-01 00:00:00
333 A    2015-08-01 00:00:00    2015-08-01 00:00:00
333 B    2013-02-28 00:00:00    2015-04-01 00:00:00
333 C    2013-01-01 00:00:00    2013-01-01 00:00:00
woot
  • 7,406
  • 2
  • 36
  • 55
  • Thank you SO MUCH. I'm sorry it took me quite a while to get back to this, but yes your solution worked and I really appreciate the help. I didn't know about conditional_change_event and that has been very helpful. Thanks again! – Megan Sep 01 '15 at 01:19