0

I am in a Microsoft SQL Server database. I would like to consolidate start and end periods for employees. For example, given the following simple table:

create table dbo.foo (
employee_key bigint,
effective_start_date date,
effective_end_date date,
account_name varchar(100));


insert into foo (employee_key,
effective_start_date,
effective_end_date,
account_name)
values (1
    ,'2017-01-01'
    ,'2017-01-31'
    ,'Google')
,(1
    ,'2017-02-01'
    ,'2017-02-28'
    ,'Apple')
,(1
    ,'2017-03-01'
    ,'2017-03-31'
    ,'Google')
,(1
    ,'2017-04-01'
    ,'9999-12-31'
    ,'Google')

The employee has moved accounts a couple times. I would like to maintain the change in February and back again. However, I do not want to see 2 records at the end. Instead, I want the tenure at Google to run from 2017-03-01 to 9999-12-31.

By the way, this is one example. That is, the "broken" records aren't always at the end of the employee's tenure.

Expected results:

employee_key | effective_start_date | effective_end_date | account_name
1            | 2017-01-01           | 2017-01-31         | Google
1            | 2017-02-01           | 2017-02-28         | Apple
1            | 2017-03-01           | 9999-12-31         | Google
Pablo Boswell
  • 805
  • 3
  • 13
  • 30
  • Can you post your expected results? Don't explain them - actually show us the columns and rows that you want returned. – Stan Shaw Jul 24 '17 at 17:28
  • Thank you for clarifying. A bigger concern is how this data is being inserted into your table. Why is a new record created, instead of updating the `effective_end_date` of the appropriate record? – Stan Shaw Jul 24 '17 at 17:52
  • I think the simplest answer is: dirty data. Or that another minor attribute of the fact (that I am ignoring) changed. But if I don't care about this minor attribute, I want to be able to regroup my data. I thought this might be an insurmountable problem, and if so, that's fine. I just need to tell my boss. – Pablo Boswell Jul 24 '17 at 17:54
  • It is not insurmountable. You could easily solve this with a `CURSOR` however, I am trying to come up with a better solution using CTEs. Is it possible in your data to have THREE consecutive employee key/account name combinations? In your sample data, it only occurs with TWO consecutive records... – Stan Shaw Jul 24 '17 at 17:55

2 Answers2

2

You can use row_number and get this results... I used two subqueries but it can be solved without that as well:

;With Cte as (
select *, Bucket = sum(accn) over(partition by employee_key order by effective_start_date) 
    from (
        Select *,case when account_name <> lag(account_name) over(partition by employee_key order by effective_start_date) 
            then 1 else 0 end as AccN
            from foo ) A
    )
    Select top (1) with ties employee_key, Min(effective_start_date) over(partition by employee_key,Bucket) AS [effective_start_date],      
                Max(effective_end_date) over(partition by employee_key,Bucket) AS [effective_end_date], account_name
    from cte
    order by row_number() over(partition by employee_key, bucket order by effective_start_date)

Output as below:

+--------------+----------------------+--------------------+--------------+
| employee_key | Effective_start_Date | Effective_End_date | account_name |
+--------------+----------------------+--------------------+--------------+
|            1 | 2017-01-01           | 2017-01-31         | Google       |
|            1 | 2017-02-01           | 2017-02-28         | Apple        |
|            1 | 2017-03-01           | 9999-12-31         | Google       |
+--------------+----------------------+--------------------+--------------+

Demo

Kannan Kandasamy
  • 13,405
  • 3
  • 25
  • 38
0

You can try to use this SQL

-- Table to group by
SELECT Row_number() 
         OVER ( 
           ORDER BY employee_key, effective_start_date, effective_end_date) RN, 
       *, 
       Cast(NULL AS BIGINT) 
       GroupBy 
INTO   #tmp 
FROM   #foo 

-- Variables 
DECLARE @GroupBy BIGINT = 0 
DECLARE @lastEmployee_key BIGINT 
DECLARE @lastAccount_Name VARCHAR(100) 


--- Creating the data group 
UPDATE #tmp 
SET    @GroupBy = groupby = CASE 
                              WHEN employee_key = Isnull(@lastEmployee_key, -1) 
                                   AND account_name = Isnull(@lastAccount_Name, 
                                                      '') 
                                   THEN @GroupBy 
                              ELSE @GroupBy + 1 
                            END, 
       @lastEmployee_key = employee_key = employee_key, 
       @lastAccount_Name = account_name = account_name 


--- Output 
SELECT employee_key, 
       Min(effective_start_date) effective_start_date, 
       Max(effective_end_date)   effective_end_date, 
       account_name 
FROM   #tmp 
GROUP  BY groupby, 
          employee_key, 
          account_name 

The result is

employee_key | effective_start_date|effective_end_date | account_name
-------------| --------------------|------------------ | -------------
1            | 2017-01-01          |2017-01-31         | Google
1            | 2017-02-01          |2017-02-28         | Apple
1            | 2017-03-01          |9999-12-31         | Google
Maurício Pontalti Neri
  • 1,060
  • 1
  • 10
  • 17
  • 1
    This is an incomplete solution. For instance: Your example implies employee 1 worked on the Google account on 2/2/2017. But row two of the original data clearly shows she worked on Apple. – Ryan B. Jul 24 '17 at 17:38
  • Write here the expect result do you want. if you want some think like it: employee_key | account_name | effective_start_date|effective_end_date 1 | Apple | 2017-02-01 |2017-02-28 1 | Google | 2017-01-01 |9999-12-31 the query is select employee_key , account_name , min(effective_start_date) effective_start_date , max(effective_end_date) effective_end_date from #foo group by account_name, employee_key – Maurício Pontalti Neri Jul 24 '17 at 17:41
  • This is incorrect because the span includes time when the employee's account name was Apple (February, 2017). – Stan Shaw Jul 24 '17 at 17:41
  • Please, write here the expected result – Maurício Pontalti Neri Jul 24 '17 at 17:55
  • employee_key Start Date End Date account_name -------------------- ---------- ---------- ---------------------------------------------------------------------------------------------------- 1 2017-01-01 2017-01-31 Google 1 2017-02-01 2017-02-28 Apple 1 2017-03-01 9999-12-31 Google – Stan Shaw Jul 24 '17 at 18:11