8

I need to create a row for every month (result should be first day of the month) between 2 dates for each person in my table. For example, if I have the following data in my source table:

rowID | person      | startdate   | enddate
1     | 12345       | 2014-04-01  | 2014-11-30
2     | 67890       | 2014-03-01  | 2014-05-01

I want the results in my destination table to be:

person | month
12345  | 2014-04-01
12345  | 2014-05-01
12345  | 2014-06-01
12345  | 2014-07-01
12345  | 2014-08-01
12345  | 2014-09-01
12345  | 2014-10-01
12345  | 2014-11-01
67890  | 2014-03-01
67890  | 2014-04-01
67890  | 2014-05-01

Thanks so much for the help.

Evan Carroll
  • 78,363
  • 46
  • 261
  • 468
BDad
  • 95
  • 1
  • 5

2 Answers2

11

No need for a CTE or a lateral join:

select
    person,
    generate_series(
        date_trunc('month', startdate), 
        enddate, '1 month'
    )::date as month
from rfem
order by 1, 2
;
 person |   month    
--------+------------
  12345 | 2014-04-01
  12345 | 2014-05-01
  12345 | 2014-06-01
  12345 | 2014-07-01
  12345 | 2014-08-01
  12345 | 2014-09-01
  12345 | 2014-10-01
  12345 | 2014-11-01
  67890 | 2014-03-01
  67890 | 2014-04-01
  67890 | 2014-05-01
Clodoaldo Neto
  • 118,695
  • 26
  • 233
  • 260
  • 1
    Using a set returning function in the `select` list is discouraged. So there **is** indeed a need for a lateral join. –  Dec 20 '16 at 14:29
  • @a_horse_with_no_name That discouragement is intended for multiple set returning functions. – Clodoaldo Neto Dec 20 '16 at 14:35
  • @a_horse_with_no_name Can you give a reason why a set returning function in the select list is discouraged? Or a link that explains it? – Gregory Arenius Sep 22 '17 at 18:22
2

Calculate minimum and maximum dates for each person with first days of months then generate monthly-based ranges between those dates using generate_series:

WITH date_ranges AS (
SELECT 
  person,
  min(date_trunc('month', startdate))::timestamptz AS min_start,
  max(date_trunc('month', enddate))::timestamptz AS max_end
FROM person_table
GROUP BY 1
)
SELECT 
  dr.person,
  ser.month::DATE as month
FROM date_ranges AS dr,
     generate_series(min_start, max_end, '1 month') AS ser(month)

Output

 person |   month
--------+------------
  12345 | 2014-04-01
  12345 | 2014-05-01
  12345 | 2014-06-01
  12345 | 2014-07-01
  12345 | 2014-08-01
  12345 | 2014-09-01
  12345 | 2014-10-01
  12345 | 2014-11-01
  67890 | 2014-03-01
  67890 | 2014-04-01
  67890 | 2014-05-01

How it works? Implicit LATERAL JOIN for function call forces the calculation for every row from the input.

This solution takes into account that you may have more than 1 row per each person with dates and it takes the highest possible range there is.

Kamil Gosciminski
  • 16,547
  • 8
  • 49
  • 72