0

I have a MySQL procedure which counts records in a very large MySQL table called overview_table. More or less the algorithm is the following:

  1. Create a temporary table called step1 which filters down the overview_table to contain records for only 1 week and only the required data.
  2. Go through each record in the newly created step1 table and do a count of 2 distinct dates in the entire overview_table

My query works. However, I am doing an inner SELECT twice. This is ineficient since I could technically retrieve both data points at the same time.

I read a few articles on how to overcome this (for example this: MySQL: Returning multiple columns from an in-line subquery), but my problem is the date filter which I apply inside of the subquery.

How do I re-write this query so that I only query once for the data I am trying to get in the inner select? This means I need to somehow return the following 2 data points at the same time:

  • COUNT(a_date)
  • COUNT(d_date)

Thank you.

CREATE DEFINER=`abc`@`%` PROCEDURE `abc_help`()

BEGIN

# Filter overview table (VERY LARGE!) to contain records for 1 week
CREATE TABLE step1
SELECT 
    wj_id,
    jp_id,
    p_id,
    d_date,
    a_date,
    r_date,
    h_date,
    create_date
FROM overview_table
WHERE create_date BETWEEN '2020-02-01' AND '2020-02-01';

# Go through each record in step 1 and count number of date occurrences for each record.
# a_date and d_date will sometimes by NULL, therefore COUNT() works by counting non NULL values
CREATE TABLE step2
SELECT 
    step1.wj_id,
    -- Historal counter 1
    ( SELECT COUNT(a_date) FROM overview_table AS search WHERE
        search.a_date       < step1.create_date AND
        search.p_id         = step1.p_id AND
        search.jp_id        = step1.jp_id
    ) AS date_a_count,
    -- Historal counter 2
    ( SELECT COUNT(d_date) FROM overview_table AS search WHERE
        search.d_date       < step1.create_date AND
        search.p_id         = step1.p_id AND
        search.jp_id        = step1.jp_id
    ) AS date_d_count
FROM step1
GROUP BY 1;

END

Aivoric
  • 838
  • 2
  • 10
  • 24

1 Answers1

1

You can do it with a LEFT JOIN of step1 to overview_table:

SELECT 
    step1.wj_id,
    COUNT(CASE WHEN search.a_date < step1.create_date THEN 1 END) AS date_a_count,
    COUNT(CASE WHEN search.d_date < step1.create_date THEN 1 END) AS date_d_count
FROM step1 LEFT JOIN overview_table AS search
ON ((search.a_date < step1.create_date) OR (search.d_date < step1.create_date)) AND
   search.p_id = step1.p_id AND
   search.jp_id = step1.jp_id
GROUP BY step1.wj_id;

If you want to count only distinct dates then:

SELECT 
    step1.wj_id,
    COUNT(DISTINCT CASE WHEN search.a_date < step1.create_date THEN search.a_date END) AS date_a_count,
    COUNT(DISTINCT CASE WHEN search.d_date < step1.create_date THEN search.D_date END) AS date_d_count
FROM step1 LEFT JOIN overview_table AS search
ON ((search.a_date < step1.create_date) OR (search.d_date < step1.create_date)) AND
   search.p_id = step1.p_id AND
   search.jp_id = step1.jp_id
GROUP BY step1.wj_id;
forpas
  • 160,666
  • 10
  • 38
  • 76