2

I have a dataset similar to the following:

CREATE TABLE revenue (
    employee_id int,
    job_id int,
    date date,
    revenue numeric (9,2)
);

INSERT INTO revenue (employee_id, job_id, date, revenue) VALUES (1, 123, '2018-07-16', 100);
INSERT INTO revenue (employee_id, job_id, date, revenue) VALUES (1, 124, '2018-07-17', 100);
INSERT INTO revenue (employee_id, job_id, date, revenue) VALUES (2, 125, '2018-07-14', 100);
INSERT INTO revenue (employee_id, job_id, date, revenue) VALUES (2, 126, '2018-07-16', 100);

I wrote the following query and have the following output:

SELECT employee_id, generate_series::date, SUM(revenue) as revenue
FROM generate_series('2018-07-14'::date, '2018-07-17'::date, '1 day')
LEFT JOIN revenue ON revenue.date = generate_series::date
GROUP BY employee_id, generate_series::date
ORDER BY employee_id, generate_series::date;

+---------------+-------------------+-----------+
| employee_id   | generate_series   | revenue   |
|---------------+-------------------+-----------|
| 1             | 2018-07-16        | 100.00    |
| 1             | 2018-07-17        | 100.00    |
| 2             | 2018-07-14        | 100.00    |
| 2             | 2018-07-16        | 100.00    |
| <null>        | 2018-07-15        | <null>    |
+---------------+-------------------+-----------+
SELECT 5
Time: 0.010s
super> 

But my objective is to have all dates for all employees even when there is no revenue.

Here is my desired output:

+---------------+-------------------+-----------+
| employee_id   | generate_series   | revenue   |
|---------------+-------------------+-----------|
| 1             | 2018-07-14        |   0.00    |
| 1             | 2018-07-15        |   0.00    |
| 1             | 2018-07-16        | 100.00    |
| 1             | 2018-07-17        | 100.00    |
| 2             | 2018-07-14        | 100.00    |
| 2             | 2018-07-15        |   0.00    |
| 2             | 2018-07-16        | 100.00    |
| 2             | 2018-07-17        |   0.00    |
+---------------+-------------------+-----------+

Anybody knows how to achieve that?

Gab
  • 3,404
  • 1
  • 11
  • 22

1 Answers1

2

Use a cross join to generate the rows and then left join to bring in the data:

select e.employee_id, gs.dte, coalesce(revenue, 0) as revenue
from generate_series('2018-07-14'::date, '2018-07-17'::date, '1 day') gs(dte) cross join
     (select distinct employee_id from revenue) e left join
     revenue r
     on r.employee_id = e.employee_id and r.date = gs.dte
order by employee_id, gs.date;

I don't think the group by is necessary. You only have at most one row for each employee and date.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786