6

It's important to know that the date will be unknown during the query time, so I cannot just hard code a 'WHERE' clause.

Here's my table:

+-----------+----------+-------------+
| Date_ID   | Customer | Order_Count |
+-----------+----------+-------------+
| 20150101  | Jones    | 6           |
| 20150102  | Jones    | 4           |
| 20150103  | Jones    | 3           |
+-----------+----------+-------------+

Here's the desired output:

+-----------+----------+------------------+
| Date_ID   | Customer | SUM(Order_Count) |
+-----------+----------+------------------+
| 20150101  | Jones    | 6                |
| 20150102  | Jones    | 10               |
| 20150103  | Jones    | 13               |
+-----------+----------+------------------+

My guess is I need to use a variable or perhaps a join.

Edit: still not able to get it fast enough. very slow.

jpw
  • 44,361
  • 6
  • 66
  • 86
Danny W
  • 463
  • 2
  • 6
  • 12

4 Answers4

4

Try this query; it's most likely the best you can do without limiting the dataset you operate on. It should benefit from an index (customer, date_id).

select 
  t1.date_id, t1.customer, sum(t2.order_count)
from 
  table1 t1
left join 
  table1 t2 on t1.customer = t2.customer
           and t1.date_id >= t2.date_id
group by 
  t1.date_id, t1.customer;

Sample SQL Fiddle.

jpw
  • 44,361
  • 6
  • 66
  • 86
2

One way you could go about it is by using a sub query which sums all orders up till the current order. Probably not the fastest way, but it should do the trick.

SELECT `Date_ID`, `Customer`, 
    (SELECT sum(b.`Order_Count`) 
    FROM tablename as b WHERE 
    b.`Date_ID` <= a.`Date_ID` AND
    a.`customer = b.`Customer`) 
FROM tablename as a
Samon
  • 101
  • 5
  • I have a large dataset, and the subquery is super slow. Also I think I needed a group by clause in your example, as there are actually different Customer names. – Danny W Jul 31 '15 at 16:27
  • Added the customer to the subquery, but it indeed isn't very fast as it requires recalculating the full sum for each element. Indexing should help some. If it still is too slow you might want to consider calculating the sum in your program outside SQL. – Samon Aug 02 '15 at 16:39
1

Where performance is an issue, consider a solution akin the following:

SELECT * FROM ints;
+---+
| i |
+---+
| 0 |
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
+---+

SELECT i,@i:=@i+i FROM ints, (SELECT @i:=0)n ORDER BY i;
+---+----------+
| i | @i:=@i+i |
+---+----------+
| 0 |        0 |
| 1 |        1 |
| 2 |        3 |
| 3 |        6 |
| 4 |       10 |
| 5 |       15 |
| 6 |       21 |
| 7 |       28 |
| 8 |       36 |
| 9 |       45 |
+---+----------+
Strawberry
  • 33,750
  • 13
  • 40
  • 57
  • Has potential, but so far this sums all ID's regardless of the customer name...And a group by doesn't effect it either...Each customer should be summed independently. – Danny W Aug 02 '15 at 09:05
  • Well I guess that's the difference between choosing to provide proper(ly representative) DDLs / an sqlfiddle, or not. – Strawberry Aug 02 '15 at 09:17
  • Using @ assignments in a `SELECT` is likely to be disallowed in 8.0.xx; use a "windowing" function _if_ you have 8.0 or 10.2. – Rick James Dec 08 '21 at 22:46
0

you can consider this solution select Date_ID, Customer, SUM(Order_COunt) over (order by Date_ID, Customer rows unbounded preceding) as SUM(Order_COunt) from table

Hai Do
  • 23
  • 3