I have been going through stack overflow to try and work this out over the last week and I still can't work out a viable solution so was wondering if anyone could offer me some help/advice?
Explanation of the data structures
I have the following tables:
Position table (zz_position
) which is used to hold the details of the
position (Job ID) include the date range that it is valid for.
PosNo Description Date_From Date_To
---------------------------------------------------------
10001 System Administrator 20170101 20231231
Resource table (zz_resource
) which is used to hold the details of a resource (employee) including the date that they joined the company and left it
resID description date_from date_to
------------------------------------------
100 Sam 20160101 20991231
101 Joe 20150101 20991231
Employment table (zz_employment
) which is used to link position to resources within a date from and to range
PosNo resID Date_From Date_To seqNo
---------------------------------------------------
10001 100 20180101 20180401 1
10001 101 20180601 20191231 2
10001 100 20200101 20991231 3
Problem
Now due to people changing positions, a post might not be filled for a period of time and what I am trying to do is produce a report that I can use to give me a breakdown of the status of a post at any point in time.
I know that I can produce one which fully maps each day using a calendar table however what I want is a report which produces the data in the following aggregated format:
PosNo resID Date_From Date_To seqNo
-------------------------------------------------
10001 NULL 20170101 20171231 0
10001 100 20180101 20180401 1
10001 NULL 20180402 20180530 0
10001 101 20180601 20191231 2
10001 100 20200101 20231231 3
insert into zz_employment
values ('10001', '100', '2018-01-01 00:00:00.000', '2018-04-01 00:00:00.000', 1),
('10001', '101', '2018-06-01 00:00:00.000', '2019-12-31 00:00:00.000', 2),
('10001', '100', '2020-01-01 00:00:00.000', '2099-12-31 00:00:00.000', 3)
(note how the report has taken the two lines in the table and produced a fully speced out life of the employment where the first null line date from is pulled from the position start date and the last line date to is pulled from the position end date.
Ideally I would like this as a view/function however due to the complexity I would be more than happy to have a series of T SQL statements that I can run each night as part of a data warehouse routine.
Rules
- all dates are truncated to datetime so that an date_to is referencing the date that it ends not the date and time that it ends
- if the post/employment/resource has no end date then it will be denoted as 20991231
- if the employment itself is open ended then the date to in the employment table is denoted as 20991231 even through the position itself might end in 20231231. Ideally I would like the result to respect the position end date.
SQL code:
CREATE TABLE zz_position
(
posNo varchar(25) NOT NULL,
description varchar(25) NOT NULL,
date_from datetime NULL,
date_to datetime NULL
)
insert into zz_position
values ('10001', 'System Administrator', '2017-01-01 00:00:00.000', '2020-12-31 00:00:00.000')
go
CREATE TABLE zz_resource
(
resID varchar(25) NOT NULL,
description varchar(25) NOT NULL,
date_from datetime NULL,
date_to datetime NULL
)
insert into zz_resource
values ('100', 'Sam', '2016-01-01 00:00:00.000', '2099-12-31 00:00:00.000'),
('101', 'Joe', '2015-01-01 00:00:00.000', '2099-12-31 00:00:00.000')
go
CREATE TABLE zz_employment
(
posNo varchar(25) NOT NULL,
resID varchar(25) NOT NULL,
date_from datetime NULL,
date_to datetime NULL,
seqNo int NULL
)
insert into zz_employment
values ('10001', '100', '2018-01-01 00:00:00.000', '2018-04-01 00:00:00.000', 1),
('10001', '101', '2018-06-01 00:00:00.000', '2019-12-31 00:00:00.000', 2),
('10001', '100', '2020-01-01 00:00:00.000', '2099-12-31 00:00:00.000', 3)