1

I've been trying to calculate year over year growth for monthly returns and have been rewriting the same queries for hours with no luck. I've seen solutions but they're all other database solutions.

I'm trying to basically achieve something like the following:

Year over Year

And this is the query I've built, although I've never actually had it complete running (been running 15+ minutes) due to the sub-query runs per-row.

This table has 2m+ rows with good indexes, it's fast but subqueries kill it.

Could be totally wrong approach, but this is what I've got.

SELECT
    YEAR(thisyear.trandte) AS `Year`,
    MONTH(thisyear.trandte) AS `YearMonth`,
    SUM(lastyear.totamount) AS LastYearSales,
    SUM(thisyear.totamount) AS ThisYearSales
FROM
    sync_invoice_lines thisyear
LEFT JOIN
    sync_invoice_lines lastyear ON
        MONTH(thisyear.trandte) = (MONTH(lastyear.trandte)) AND
        YEAR(thisyear.trandte) = (YEAR(lastyear.trandte) - 1)
WHERE
    thisyear.type = 'IN' AND
    lastyear.type = 'IN' AND
    thisyear.sync_active = 1 AND
    lastyear.sync_active = 1 AND
GROUP BY `Year`, `YearMonth`
GMB
  • 216,147
  • 25
  • 84
  • 135
Jonathan Bird
  • 313
  • 5
  • 19
  • Your current query is very inefficient and would return totally wrong numbers. You join all rows with the same year/month, i.e. if you got 1000 rows for 2020 July you get 1000*1000 rows after the join before the SUM. – dnoeth Aug 29 '20 at 19:54

5 Answers5

2

Provided that you have data in the table for all months without any gaps, then all you need is the window function LAG() to fetch last year's sum of totamount for the same month:

SELECT YEAR(trandte) AS Year, 
       MONTH(trandte) AS Month, 
       SUM(totamount) AS ThisYearSales,
       LAG(SUM(totamount), 12) OVER (ORDER BY YEAR(trandte), MONTH(trandte)) AS LastYearSales
FROM sync_invoice_lines 
WHERE type = 'IN' AND sync_active = 1
GROUP BY Year, Month

If there are gaps between the months then create a CTE from the above query and do a LEFT self join on it:

WITH cte AS (
  SELECT YEAR(trandte) AS Year, 
         MONTH(trandte) AS Month, 
         SUM(totamount) AS Sales
  FROM sync_invoice_lines 
  WHERE type = 'IN' AND sync_active = 1
  GROUP BY Year, Month
)
SELECT c1.Year, 
       c1.Month,
       c1.Sales AS ThisYearSales,
       c2.Sales AS LastYearSales 
FROM cte c1 LEFT JOIN cte c2
ON c2.Year = c1.Year - 1 AND c2.Month = c1.Month
forpas
  • 160,666
  • 10
  • 38
  • 76
1

You can do that in a single table scan (without a join or CTE), and take in account possible missing month. For this, use lag() with a range clause that precisely targets the same month last year, like so:

select
    year(trandte) as `year`,
    month(trandte) as `yearmonth`,
    lag(sum(totamount)) over(
        order by concat(year(trandte), '-', month(trandte), '-01') 
        range between interval 1 year preceding and interval 1 year preceding
    ) as lastyearsales,
    sum(totamount) as thisyearsales
from sync_invoice_lines
where type = 'IN' and sync_active = 1
group by year(trandte), month(trandte)
order by year(trandte), month(trandte)
GMB
  • 216,147
  • 25
  • 84
  • 135
  • This looks like a good solution, however generates an error: `Window '' with RANGE N PRECEDING/FOLLOWING frame requires exactly one ORDER BY expression, of numeric or temporal type`. I'll give it a crack at working it out - just new to LAG & OVER functions. – Jonathan Bird Sep 01 '20 at 02:18
1

You can calculate the total separately for last year sales and this year sales with the CASE expression. This is very simple.

The query is as follows:

SELECT
    YEAR(CURRENT_DATE) AS `Year`,
    MONTH(trandte) AS `YearMonth`,
    SUM(CASE YEAR(trandte) WHEN YEAR(CURRENT_DATE)-1 THEN totamount END) AS LastYearSales,
    SUM(CASE YEAR(trandte) WHEN YEAR(CURRENT_DATE) THEN totamount END) AS ThisYearSales
FROM
    sync_invoice_lines
WHERE type = 'IN' AND sync_active = 1
GROUP BY `YearMonth`
ORDER BY `YearMonth`;

DB Fiddle

You can specify any year in the YEAR(CURRENT_DATE) part.

etsuhisa
  • 1,698
  • 1
  • 5
  • 7
0

You can use pivot to display your sales for each month by year.

with monthly_sales as  
(SELECT
    YEAR(trandte) AS year,
    MONTH(trandte) AS month,
    SUM(totamount) AS sales
FROM
    sync_invoice_lines 
WHERE
    type = 'IN' AND
    sync_active = 1
GROUP BY YEAR(trandte), MONTH(trandte)) 

Select * from 
(select month, year from monthly_sales)
pivot
(sum(sales) 
for month in (2013, 2014, 2015)
)
order by month
-1

Step 1: Calculate all monthly subtotals, no yr-over-yr yet:

SELECT  LEFT(trandte, 7) AS yyyy_mm,
        SUM(totamount) AS sales
    FROM sync_invoice_lines
    WHERE ...
    GROUP BY 1;

First, see whether this gets the correct numbers, though not in the desired order. And see how fast it goes.

That may be all you need.

Step 2: This will be working with about 30 rows, so efficiency is not a problem. Either put the above stuff in another table, or, since you have MySQL 8.0 (or MariaDB 10.2), use it in a WITH to do the rest of the work. Step 2 is probably to compute the year over year, using a self-join.

Step 3: The order of the output -- Or does the graphic package rearrange the data to get 12 sets of multiple years?

In the long run, consider building and maintaining a "Summary table", perhaps of daily subtotals. It would be like step 1, but thousands of rows, not millions or dozens. With that, you could calculate monthly amounts quite fast. Or weekly. Or other ranges. This way, the bulky task (step 1) is built in daily chunks, which will be a thousand times as fast.

More on subtotals: http://mysql.rjweb.org/doc.php/summarytables

Rick James
  • 135,179
  • 13
  • 127
  • 222