1

I have two tables - let's call them week and contract, like so:

Week                                    Contract    
emp_id | starting   | data1 |  ...   emp_id | from_date  | data2 | ...
-------|------------|-------|--      -------|------------|-------|--
12     | 2019-01-08 | abcd  |        12     | 2018-08-01 | efgh  | 
12     | 2019-01-15 | abcd  |        13     | 2018-10-02 | efgh  | 
12     | 2019-01-22 | abcd  |        13     | 2019-01-15 | ijkl  | 
13     | 2019-01-08 | abcd  |        13     | 2019-03-19 | mnop  | 
13     | 2019-01-15 | abcd  |        14     | 2017-02-02 | efgh  | 
13     | 2019-01-22 | abcd  |        15     | 2018-01-19 | efgh  | 

The week.starting field is a date (datetime with time set to midnight) that's at fairly regular intervals. The specific combination of (emp_id, starting) is unique. The from_date field is also a date that records the starting period where the contract record applies. This can be in the future, so we can't just do a MAX(from) and get the right contract for each employee. Currently, (emp_id, from_date) is unique, but I don't want to rely on this. week.starting and contract.from_date can be identical.

I want a query that returns the entire week record, and for each, the record for whatever contract was active at the time - that is, the record where from_date is largest while still less than or equal to week.starting. Getting this contract if I have a specific week in mind is a pretty straightforward greatest n-per-group problem:

SELECT * FROM contract 
WHERE contract.emp_id = @emp_id AND contract.from_date <= @starting
ORDER BY contract.from_date DESC
LIMIT 1

But I can't work out how to do this as part of a query to get every record in week. My specific combination of obstacles means I haven't been able to find an answer, despite this being a common set of problems. I don't appear to be able to pass week.starting into a subquery, and I don't appear to be able to use LIMIT in a join either. My best attempt so far ended up joining all contracts that were less than a given week.

What query will return the result I'm looking for?

emp_id | starting   | from_date  | data1 | data2 | ...
-------|------------|------------|-------|-------|--
12     | 2019-01-08 | 2018-08-01 | abcd  | efgh  |
12     | 2019-01-15 | 2018-08-01 | abcd  | efgh  |
12     | 2019-01-22 | 2018-08-01 | abcd  | efgh  |
13     | 2019-01-08 | 2018-10-02 | abcd  | efgh  |
13     | 2019-01-15 | 2019-01-15 | abcd  | ijkl  |
13     | 2019-01-22 | 2019-01-15 | abcd  | ijkl  |
Merus
  • 8,796
  • 5
  • 28
  • 41
  • On what bases you are giving condition for eg: for emp_id = 13 13 --> 2019-01-08 ---> 2018-10-02 13 --> 2019-01-15 --> 2019-01-15 13 --> 2019-01-22 --> 2019-01-15 – Ajay May 29 '19 at 06:09
  • sorry, I made a typo in the example table. Some dates should have been 2019, not 2018. – Merus May 29 '19 at 06:14

2 Answers2

2

You should be able to use a window function to order the contracts in time after filtering future contracts and assign a rank. Then you can pick the latest with the top rank.

Have not tested this but should look something like:

select * from (
    select w.*, c.from_date, c.data2,
        row_number() over (partition by c.emp_id, w.starting order by c.from_date desc) as latest
    from week w
    join contract c on c.emp_id = w.emp_id and c.from_date <= w.starting
) as sub where latest = 1
systemjack
  • 2,815
  • 17
  • 26
  • You can use first_value(data2) instead of row_number() to eliminate the need for the outer select clause, see [official documentation](https://www.postgresql.org/docs/current/functions-window.html). – Marco Borchert May 29 '19 at 06:31
  • First_value applies that result to each row in the window frame. Then you'd still have to check where first_value=from_date. It does not do the filtering for you. That might introduce dupes as well since could match more than once. – systemjack May 29 '19 at 06:40
  • 1
    You're right. If you use `partition by c.emp_id, w.starting` you get the exact output OP was expecting (contract by employee for each week). – Marco Borchert May 29 '19 at 07:17
  • Wanted to point out, in some cases returning all the matching rows using first_value might be desirable. – systemjack May 29 '19 at 07:24
  • I've just tried this on our dev database and it's working beautifully. – Merus Jun 06 '19 at 00:55
1

In Postgres, you can use a lateral join:

select w.*, c.*
from weeks w left join lateral
     (select c.*
      from contract c
      where c.emp_id = w.emp_id and
            c.from_date <= w.starting
      order by c.from_date desc
      fetch first 1 row only
     ) c;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786