SELECT  a.sectionid AS sectionid, views.timestamp AS timestamp,
        views.time_on_page AS time_on_page, views.video AS video,
        views.access AS access, masterid
    FROM (
        SELECT  m.timestamp AS timestamp, m.time_on_page AS time_on_page,
                AES_DECRYPT(m.IP_Blob, UNHEX(SHA2('Jove Is Cool',512))) AS ip_bin,
                m.page AS page, m.language AS language, m.access AS access,
                m.referrer AS referrer, m.searchterms AS searchterms,
                m.user_id AS user_id,
                -- For PDF pages, take the last 5 characters of the path and
                -- strip '/', 'd' and 'f'; otherwise use m.video.
                CASE WHEN m.page REGEXP '/pdf(-materials)?/'
                     THEN REPLACE(REPLACE(REPLACE(RIGHT(m.page, 5), '/', ''), 'd', ''), 'f', '')
                     ELSE m.video END AS video,
                m.id AS masterid
            FROM  masterstats_innodb AS m
            JOIN  stats_to_institution AS sti  ON m.id = sti.statid
            JOIN  institutions AS ins  ON sti.institutionid = ins.institutionid
            WHERE  m.timestamp BETWEEN '2019-12-30' AND '2020-12-29'
              AND  ( ins.institutionid = '2'
                  OR ins.parentinstitutionid = '2' )
        ) AS views
    LEFT JOIN  articles AS a  ON views.video = a.productid
       `enter code here`
    ORDER BY masterid

masterstats_innodb -- 115759655
stats_to_institution -- 45317238
institutions -- 45038

All possible indexes have been created, and the masterstats_innodb table has also been partitioned.

  • How many rows in masterstats_innodb would be between those two dates? I imagine that's going to be a large amount? How many would match the `institutionid` or `parentinstitutionid` filters? And when you combine all the filters, how many would you expect there to be? What have you partitioned the table by? – Andrew Sayer Jan 04 '21 at 11:52
  • Please provide `SHOW CREATE TABLE` for each table. – Rick James Jan 05 '21 at 20:42

1 Answer

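`OR` does not optimize well; rewriting the query as a `UNION` is a common fix. However, it is unclear whether that will help here. Which filter is more selective: the range on `m.timestamp`, or `(ins.institutionid = '2' OR ins.parentinstitutionid = '2')`? The more selective one is where the query should start.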
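
As a rough sketch of the `OR`-to-`UNION` rewrite, the inner query could be split into two branches, one per institution condition. This assumes `institutionid` uniquely identifies a row in `institutions`; the column list is shortened here and the remaining columns from the original subquery would need to be added to both branches. The extra `ins.institutionid <> '2'` test in the second branch is my addition so that `UNION ALL` does not return the same row twice. Whether this is actually faster is not certain; compare the `EXPLAIN` output of both versions.

```sql
-- Sketch only: the inner query with the OR split into two UNION ALL branches.
-- Column list shortened; add the other columns to BOTH branches.
SELECT  m.id AS masterid, m.timestamp AS timestamp,
        m.time_on_page AS time_on_page, m.access AS access
    FROM  institutions AS ins
    JOIN  stats_to_institution AS sti  ON sti.institutionid = ins.institutionid
    JOIN  masterstats_innodb AS m  ON m.id = sti.statid
    WHERE  ins.institutionid = '2'
      AND  m.timestamp BETWEEN '2019-12-30' AND '2020-12-29'
UNION ALL
SELECT  m.id, m.timestamp, m.time_on_page, m.access
    FROM  institutions AS ins
    JOIN  stats_to_institution AS sti  ON sti.institutionid = ins.institutionid
    JOIN  masterstats_innodb AS m  ON m.id = sti.statid
    WHERE  ins.parentinstitutionid = '2'
      AND  ins.institutionid <> '2'   -- assumption: skip rows the first branch already returned
      AND  m.timestamp BETWEEN '2019-12-30' AND '2020-12-29';
```

Each branch now has a single equality test on `institutions`, so each can use its own index instead of relying on one plan to cover both sides of the `OR`.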

For optimal indexes on stats_to_institution: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table

`enter code here` does not make sense where it is; please replace it with the actual code or an example.

Tentative INDEXes:

a:  (productid, sectionid)
ins:  (parentinstitutionid, institutionid)
m:  (timestamp)
sti:  (statid, institutionid)
sti:  (institutionid, statid)
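
For illustration, the tentative indexes above could be created with statements along these lines. The index names are hypothetical, and some of these column combinations may already exist as a primary key or secondary index, so check `SHOW CREATE TABLE` first. On tables of this size each `ALTER TABLE` may run for a long time.

```sql
-- Hypothetical index names; skip any index that SHOW CREATE TABLE shows already exists.
ALTER TABLE articles
    ADD INDEX idx_productid_sectionid (productid, sectionid);

ALTER TABLE institutions
    ADD INDEX idx_parent_institution (parentinstitutionid, institutionid);

ALTER TABLE masterstats_innodb
    ADD INDEX idx_timestamp (`timestamp`);

-- Both column orders, so the mapping table can be traversed in either direction.
ALTER TABLE stats_to_institution
    ADD INDEX idx_statid_institutionid (statid, institutionid),
    ADD INDEX idx_institutionid_statid (institutionid, statid);
```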
Rick James