-1

This is my main table which stores keyword impressions according to dates.

+---------+-----------+------------+-------------+
| ID      | KeywordId | ReportDate | Impressions |
+---------+-----------+------------+-------------+
| 3324620 | 969274443 | 2018-12-21 | 1           |
+---------+-----------+------------+-------------+
| 3334253 | 969274443 | 2018-12-22 | 0           |
+---------+-----------+------------+-------------+
| 3354119 | 969274443 | 2018-12-23 | 20          |
+---------+-----------+------------+-------------+
| 3373097 | 969274443 | 2018-12-24 | 11          |
+---------+-----------+------------+-------------+
| 3392168 | 969274443 | 2018-12-25 | 1           |
+---------+-----------+------------+-------------+

And this is my cumulative sum table which adds up impressions day by day

+----+-----------+------------+-------------+
| ID | keywordId | ReportDate | Impressions |
+----+-----------+------------+-------------+
| 1  | 969274443 | 2018-12-21 | 1           |
+----+-----------+------------+-------------+
| 2  | 969274443 | 2018-12-22 | 1           |
+----+-----------+------------+-------------+
| 3  | 969274443 | 2018-12-23 | 21          |
+----+-----------+------------+-------------+
| 4  | 969274443 | 2018-12-24 | 32          |
+----+-----------+------------+-------------+
| 5  | 969274443 | 2018-12-25 | 33          |
+----+-----------+------------+-------------+

to get the number of impressions between date 2018-12-21 and 2018-12-25, simply I subtract

33 - 1 and get the result 22.

I used the below query to get this result.

select  keyword.keywordId,(second.impressions - first.impressions) as imp from keyword
inner join diffreportkeyword as first on first.keywordId = keyword.keywordId
                                     and first.reportDate = '2018-12-21'
inner join diffreportkeyword as second on second.keywordId = keyword.keywordId
                                      and second.reportDate = '2018-12-25'

but sometimes, there is no data for a specific date and query returns 0 rows.But this means that for a specific date the impressions are 0.

If i change this with left join,

select  keyword.keywordId,(second.impressions - first.impressions) as imp from keyword
left join diffreportkeyword as first on first.keywordId = keyword.keywordId
                                     and first.reportDate = '2018-12-18'
left join diffreportkeyword as second on second.keywordId = keyword.keywordId
                                      and second.reportDate = '2018-12-25'

this works, but diffreportkeyword table has 5million rows and the keyword table has 300k different "keywordId" and left join makes the query extremely slow.

My question is,

how can I make it faster?

thanks

Burc Hasergin
  • 449
  • 1
  • 4
  • 6

1 Answers1

0

This

  on second.keywordId = keyword.keywordId
 and second.reportDate = '2018-12-25'

needs

INDEX(keywordId, reportDate)  -- in either order

Side note: It might be simpler to say

    '2018-12-21' - INTERVAL 3 DAY

instead of

    '2018-12-18'
Rick James
  • 135,179
  • 13
  • 127
  • 222