I have a table here with hire_date and effectivity_date of each employee which will be updated every time. I would like to get the previous row effectivity date as the next row start_date, and the end_date of the previous date will be minus 1 to avoid overlapping. Also, I want the last record's end_date to be '2030-12-31' as the default value and the Hire Date as the start_date of the first record. Please help me with this one.
-
1By the tags I see you already identified the right window functions. Can you share with us what you've tried so far? – Jim Jones Jul 22 '21 at 08:30
1 Answers
I found some duplicates in your original table, like Ross
and Chandler
have 2 entries with exactly the same dataset. If that is a copy/paste error, then I reproduced your case with:
create table testdt(name varchar, hire_dt date, eff_dt date);
insert into testdt values('Rachel','2021-03-29','2021-03-29');
insert into testdt values('Rachel','2021-03-29','2021-05-14');
insert into testdt values('Ross','2021-06-18','2021-06-18');
insert into testdt values('Monica','2021-07-21','2021-07-21');
insert into testdt values('Chandler','2021-04-12','2021-04-12');
insert into testdt values('Judy','2021-04-26','2021-04-26');
insert into testdt values('Judy','2021-04-26','2021-05-10');
insert into testdt values('Judy','2021-04-26','2021-07-01');
Now, correctly as your tag mention, you need to use the LAG
and LEAD
function which will get columns from the previous and next row accordingly.
If you run on the above testdt
table the following SQL, you'll get the previous and next eff_dt
. Please note I use the ctid
as ordering column
select *,
ctid,
lead(eff_dt) over (partition by name order by ctid) as lead_eff_dt,
lag(eff_dt) over (partition by name order by ctid) as lag_eff_dt
from testdt tst order by ctid
The result is the following
name | hire_dt | eff_dt | ctid | lead_eff_dt | lag_eff_dt
----------+------------+------------+-------+-------------+------------
Rachel | 2021-03-29 | 2021-03-29 | (0,1) | 2021-05-14 |
Rachel | 2021-03-29 | 2021-05-14 | (0,2) | | 2021-03-29
Ross | 2021-06-18 | 2021-06-18 | (0,3) | |
Monica | 2021-07-21 | 2021-07-21 | (0,4) | |
Chandler | 2021-04-12 | 2021-04-12 | (0,5) | |
Judy | 2021-04-26 | 2021-04-26 | (0,6) | 2021-05-10 |
Judy | 2021-04-26 | 2021-05-10 | (0,7) | 2021-07-01 | 2021-04-26
Judy | 2021-04-26 | 2021-07-01 | (0,8) | | 2021-05-10
Now we can apply your logic:
For the START_DATE
if it's the first row (lag_eff_dt
is null
) then we use the hire_dt
, otherwise, if there is a following row (lead_eff_dt
is not null
) then we take the the following eff_dt
. If it's the last row, we take eff_dt
.
For the END_DATE
We just need to take 2021-12-31
if there is no following record (lead_eff_dt
is null
), we take one day off from the next eff_dt
otherwise
The following query should do
with first_sel as(
select *,
ctid,
lead(eff_dt) over (partition by name order by ctid) as lead_eff_dt,
lag(eff_dt) over (partition by name order by ctid) as lag_eff_dt
from testdt tst order by ctid)
select
name,
case when lag_eff_dt is null
then hire_dt
else
case when lead_eff_dt is not null
then (lead_eff_dt)::date
else eff_dt
end
end as start,
coalesce((lead_eff_dt - INTERVAL '1 DAY')::date, '2021-12-31'::date) as end
from first_sel;
With the result being
name | start | end
----------+------------+------------
Rachel | 2021-03-29 | 2021-05-13
Rachel | 2021-05-14 | 2021-12-31
Ross | 2021-06-18 | 2021-12-31
Monica | 2021-07-21 | 2021-12-31
Chandler | 2021-04-12 | 2021-12-31
Judy | 2021-04-26 | 2021-05-09
Judy | 2021-07-01 | 2021-06-30
Judy | 2021-07-01 | 2021-12-31
(8 rows)

- 1,808
- 1
- 7
- 13
-
Wow! This is great. Another thing, I would like to new column **supervisor**, how could I implement that? I added supervisor in Select but it gets the hire_date as the startDate when assigned to a new supervisor when it should be the latest effectivity date. – JamesBowery Jul 22 '21 at 14:48
-
You need to provide a concrete example for this, otherwise we can't help – Ftisiot Jul 22 '21 at 14:54
-
Here: https://www.db-fiddle.com/f/4D8PbXMJR9tjFUiAVPZhDc/0 Judy's second record -- start date is incorrect, should be 2021-05-10 – JamesBowery Jul 22 '21 at 22:41