1

I have a database with a table called BOOKINGS containing the following values

main-id     place-id     start-date    end-date
1                  1     2018-8-1      2018-8-8
2                  2     2018-6-6      2018-6-9
3                  3     2018-5-5      2018-5-8
4                  4     2018-4-4      2018-4-5
5                  5     2018-3-3      2018-3-10
5                  1     2018-1-1      2018-1-6
4                  2     2018-2-1      2018-2-10
3                  3     2018-3-1      2018-3-28
2                  4     2018-4-1      2018-4-6
1                  5     2018-5-1      2018-5-15
1                  3     2018-6-1      2018-8-8
1                  4     2018-7-1      2018-7-6
1                  1     2018-8-1      2018-8-18
1                  2     2018-9-1      2018-9-3
1                  5     2018-10-1     2018-10-6
2                  5     2018-11-1     2018-11-5
2                  3     2018-12-1     2018-12-25
2                  2     2018-2-2      2018-2-19
2                  4     2018-4-4      2018-4-9
2                  1     2018-5-5      2018-5-23

What I need to do is for each main-id I need to find the largest total number of days for every place-id. Basically, I need to determine where each main-id has spend the most time.

This information must then be put into a view, so unfortunately I can't use temporary tables.

The query that gets me the closest is

CREATE VIEW `MOSTTIME` (`main-id`,`place-id`,`total`) AS
SELECT `BOOKINGS`.`main-id`, `BOOKINGS`.`place-id`, SUM(DATEDIFF(`end-date`, `begin-date`)) AS `total`
FROM `BOOKINGS`
GROUP BY `BOOKINGS`.`main-id`,`RESERVATION`.`place-id`

Which yields:

main-id            place-id                total

1                   1                     24
1                   2                     18
1                   5                     5
2                   1                     2
2                   2                     20
2                   4                     9
3                   1                     68
3                   2                     24
3                   3                     30
4                   1                     5
4                   2                     10
4                   4                     1
5                   1                     19
5                   2                     4
5                   5                     7

What I need is then the max total for each distinct main-id:

main-id            place-id                total
1                   1                     24
2                   2                     20
3                   1                     68
4                   2                     10
5                   1                     19

I've dug through a large amount of similar posts that recommend things like self joins; however, due to the fact that I have to create the new field total using an aggregate function (SUM) and another function (DATEDIFF) rather than just querying an existing field, my attempts at implementing those solutions have been unsuccessful.

I am hoping that my query that got me close will only require a small modification to get the correct solution.

Madhur Bhaiya
  • 28,155
  • 10
  • 49
  • 57
Time Rift
  • 17
  • 6
  • On a side note: Why do you have overlap of date ranges in booking for same main_id and place_id combination ? – Madhur Bhaiya Nov 29 '18 at 03:36
  • @MadhurBhaiya To be perfectly honest, that is simply a mistake. All of the data in these tables is filler data that I randomly typed up, but it is the data that I have been using to test my code. – Time Rift Nov 29 '18 at 03:40
  • Do you have access to latest MySQL version (8+) ? or can upgrade, if required ? – Madhur Bhaiya Nov 29 '18 at 03:42
  • @MadhurBhaiya I've had MySQL and MySQL Workbench installed for a while, so I am currently running 5.2.70. I can try to upgrade if there is little risk of breaking existing functionality. What did you have in mind? – Time Rift Nov 29 '18 at 03:51
  • Upgrading to latest version is *generally always* better. Mostly backward compatibility is maintained, so you could try upgrading and test it out. – Madhur Bhaiya Nov 29 '18 at 04:07

1 Answers1

1

Having hyphen character - in column name (which is also minus operator) is a really bad idea. Do consider replacing it with underscore character _.

One possible way is to use Derived Tables. One Derived Table is used to determine the total on a group of main id and place id. Another Derived Table is used to get maximum value out of them based on main id. We can then join back to get only the row corresponding to the maximum value.

CREATE VIEW `MOSTTIME` (`main-id`,`place-id`,`total`) AS
SELECT b1.main_id, b1.place_id, b1.total 
FROM 
(
  SELECT `main-id` AS main_id, 
         `place-id` AS place_id, 
         SUM(DATEDIFF(`end-date`, `begin-date`)) AS total 
  FROM BOOKINGS 
  GROUP BY main_id, place_id
) AS b1
JOIN
(
  SELECT dt.main_id, MAX(dt.total) AS max_total 
  FROM 
  (
    SELECT `main-id` AS main_id, 
           `place-id` AS place_id, 
           SUM(DATEDIFF(`end-date`, `begin-date`)) AS total 
    FROM BOOKINGS 
    GROUP BY main_id, place_id
  ) AS dt 
  GROUP BY dt.main_id 
) AS b2 
  ON b1.main_id = b2.main_id AND 
     b1.total = b2.max_total 

MySQL 8+ solution would be utilizing the Row_Number() functionality:

CREATE VIEW `MOSTTIME` (`main-id`,`place-id`,`total`) AS
SELECT b.main_id, b.place_id, b.total 
FROM 
(
  SELECT dt.main_id, 
         dt.place_id, 
         dt.total 
         ROW_NUMBER() OVER (PARTITION BY dt.main_id 
                            ORDER BY dt.total DESC) AS row_num                            
  FROM 
  (
    SELECT `main-id` AS main_id, 
           `place-id` AS place_id, 
           SUM(DATEDIFF(`end-date`, `begin-date`)) AS total 
    FROM BOOKINGS 
    GROUP BY main_id, place_id
  ) AS dt 
  GROUP BY dt.main_id 
) AS b
WHERE b.row_num = 1
Madhur Bhaiya
  • 28,155
  • 10
  • 49
  • 57
  • 1
    That makes so much sense! One of the things that I was trying to do in my initial research was how to use the result of a select as the source of data for my next query (i.e. after I compute the total number of days for each main-id and place-id combo, then I select the highest for each main-id) and this does that eloquently! – Time Rift Nov 29 '18 at 04:10