I have a dataset like this:
It is necessary to getting the minimum and maximum date, the person and his number from each frame:
The code sample below: enter link description here
Thanks a lot!
I have a dataset like this:
It is necessary to getting the minimum and maximum date, the person and his number from each frame:
The code sample below: enter link description here
Thanks a lot!
This is an example of a gaps-and-islands problem. I would recommend the difference of row numbers for this:
select person, number_one, count(*) as cnt,
min(person_date), max(person_date)
from (select t.*,
row_number() over (partition by person order by person_date) as seqnum,
row_number() over (partition by person, number_one order by person_date) as seqnum_2
from t
) t
group by person, (seqnum - seqnum_2)
This looks like a gaps-and-islad problem. Here is one way to solve it using window functions:
select
min(person_date) person_date_start,
max(person_date) person_date_end,
person,
number_one
from (
select
t.*,
row_number() over(order by person_date) rn1,
row_number() over(partition by person, number_one order by person_date) rn2
from mytable t
) t
group by person, number_one, rn1 - rn2
It is quite unclear what is the logic to generate the id
in the outer query. If you want to renumber the records, then you can use row_number()
:
select
row_number() over(order by min(person_date)) id,
min(person_date) person_date_start,
max(person_date) person_date_end,
person,
number_one