4

I need some help writing a mysql query. I need to find all missed reports for a given date range for a specific business id / project id.

Basically, for a given business id, I need to know the name of the project and all dates that a report is either missing or not marked as completed.

I am using the calendar table trick (as described here and here) to find the missing report dates, but I am having problems joining the projects table to find the associated project / business that a report was missed for.

I basically need a result set that will give me data similar to this:

+------------+-----------+--------------+
| project_id | name      | missing_date |
+------------+-----------+--------------+
| 1          | Project 1 | 2014-01-01   |
| 1          | Project 1 | 2014-01-03   |
| 1          | Project 1 | 2014-01-04   |
| 1          | Project 1 | 2014-01-07   |
| 1          | Project 1 | 2014-01-09   |
| 2          | Project 2 | 2014-01-02   |
| 2          | Project 2 | 2014-01-03   |
| 2          | Project 2 | 2014-01-04   |
+------------+-----------+--------------+

Here is my schema:

projects table:
+----------------+------------------+------+-----+-------------------+----------------+
| Field          | Type             | Null | Key | Default           | Extra          |
+----------------+------------------+------+-----+-------------------+----------------+
| project_id     | int(10) unsigned | NO   | PRI | NULL              | auto_increment |
| business_id    | int(10) unsigned | NO   | MUL | NULL              |                |
| name           | tinytext         | YES  |     | NULL              |                |
+----------------+------------------+------+-----+-------------------+----------------+

reports table:
+---------------------+------------------+------+-----+-------------------+----------------+
| Field               | Type             | Null | Key | Default           | Extra          |
+---------------------+------------------+------+-----+-------------------+----------------+
| report_id           | int(10) unsigned | NO   | PRI | NULL              | auto_increment |
| project_id          | int(10) unsigned | NO   | MUL | NULL              |                |
| report_date         | date             | NO   | MUL | NULL              |                |
| completed           | bit(1)           | NO   |     | b'0'              |                |
+---------------------+------------------+------+-----+-------------------+----------------+


calendar table:
+--------------+-------------+------+-----+---------+-------+
| Field        | Type        | Null | Key | Default | Extra |
+--------------+-------------+------+-----+---------+-------+
| dt           | date        | NO   | PRI | NULL    |       |
| month_name   | varchar(9)  | YES  |     | NULL    |       |
| day_name     | varchar(9)  | YES  |     | NULL    |       |
| y            | smallint(6) | YES  |     | NULL    |       |
| q            | tinyint(4)  | YES  |     | NULL    |       |
| m            | tinyint(4)  | YES  |     | NULL    |       |
| d            | tinyint(4)  | YES  |     | NULL    |       |
| dw           | tinyint(4)  | YES  |     | NULL    |       |
| w            | tinyint(4)  | YES  |     | NULL    |       |
| is_weekday   | bit(1)      | YES  |     | NULL    |       |
| is_holiday   | bit(1)      | YES  |     | NULL    |       |
| holiday_desc | varchar(32) | YES  |     | NULL    |       |
+--------------+-------------+------+-----+---------+-------+

The following query below works to return a list of incompleted reports, but I still need to fill the gaps in with the dates where there is no report record at all.

select 
    p.project_id,
    p.name,
    c.dt as missing_date,
    r.completed
from reports r
join projects p on (r.project_id = p.project_id)
right join calendar c on (c.dt = r.report_date)
where c.dt >= '2014-02-01'
and c.dt <= '2014-02-10'
-- and r.report_date is null /** THE RESULT SET IS EMPTY IF I UNCOMMENT THIS **/
and r.completed = false
and c.is_holiday = false
and c.is_weekday = true
and p.business_id = 1001
order by p.project_id, r.report_date, c.dt;

Any help would be much appreciated!

Community
  • 1
  • 1
  • What is the result of your query and why do you make a right join to the calendar table? – Christian Ammer Feb 27 '14 at 19:45
  • I get an empty result set. I need to join the calendar table so I can determine all of the dates that reports should have been created on. – elprogrammador Feb 27 '14 at 20:02
  • I don't see the need for the **right** join, why include all the *calendar* records without matching *reports* record? – Christian Ammer Feb 27 '14 at 21:53
  • You can try to split the query and check the result of the subqueries: A: "calendar without joins and only with date", B: "calendar without join and full where statement", C: "reports join calendar", ... – Christian Ammer Feb 27 '14 at 21:54

2 Answers2

0

I read the links you have provided and this method of storage for dates is great. any way in your query you used:

right join calendar c on (c.dt = r.report_date)

but I don't see any report_date in your report table. I suggest you check it if the problem is because of it ,since apart from that your query seems to work properly.

Shirin Feiz
  • 73
  • 1
  • 6
  • Sorry, the report table does contain the report_date column. I accidentally removed it while posting the sample schema. I've edited the sample schema to show it is in there now. – elprogrammador Feb 27 '14 at 19:39
0

Ok, I finally got it to work. It's a fairly complicated query that requires an inner join on the projects table but I don't know a better way to do it - I'm a Java guy who relies on hibernate way too much these days to build my queries :). If anyone has a more efficient solution I am all ears!

The final query:

select 
    p.project_id,
    p.name,
    c.dt as missing_date,
    r.report_date,
    r.completed
from calendar c
inner join (
    select 
        p1.project_id,
        p1.name
    from projects p1
    where p1.project_id = 1005
    -- where p1.business_id = 1001 /** OR USE THE BUSINESS ID **/

) p on c.dt between '2014-02-01' and '2014-02-28'
left join reports r on r.report_date = c.dt
and r.project_id = p.project_id
and r.completed = false
where (r.report_date is null or r.completed = false)
and c.is_holiday = false
and c.is_weekday = true
order by p.project_id, c.dt;

Produces the correct results:

+------------+--------------+--------------+-------------+-----------+
| project_id | name         | missing_date | report_date | completed |
+------------+--------------+--------------+-------------+-----------+
|       1005 | Project 1005 | 2014-02-03   | 2014-02-03  |         0 |
|       1005 | Project 1005 | 2014-02-04   | 2014-02-04  |         0 |
|       1005 | Project 1005 | 2014-02-05   | NULL        |      NULL |
|       1005 | Project 1005 | 2014-02-06   | 2014-02-06  |         0 |
|       1005 | Project 1005 | 2014-02-07   | NULL        |      NULL |
|       1005 | Project 1005 | 2014-02-10   | 2014-02-10  |         0 |
|       1005 | Project 1005 | 2014-02-11   | 2014-02-11  |         0 |
|       1005 | Project 1005 | 2014-02-12   | 2014-02-12  |         0 |
|       1005 | Project 1005 | 2014-02-13   | NULL        |      NULL |
|       1005 | Project 1005 | 2014-02-14   | NULL        |      NULL |
|       1005 | Project 1005 | 2014-02-18   | NULL        |      NULL |
|       1005 | Project 1005 | 2014-02-19   | NULL        |      NULL |
|       1005 | Project 1005 | 2014-02-20   | 2014-02-20  |         0 |
|       1005 | Project 1005 | 2014-02-21   | 2014-02-21  |         0 |
|       1005 | Project 1005 | 2014-02-24   | 2014-02-24  |         0 |
|       1005 | Project 1005 | 2014-02-25   | 2014-02-25  |         0 |
|       1005 | Project 1005 | 2014-02-26   | NULL        |      NULL |
|       1005 | Project 1005 | 2014-02-27   | NULL        |      NULL |
|       1005 | Project 1005 | 2014-02-28   | NULL        |      NULL |
+------------+--------------+--------------+-------------+-----------+

Thanks for the help guys and gals!