0

I'm trying to sum a certain column over a certain date range. The kicker is that I want this to be a CTE, because I'll have to use it multiple times as part of a larger query. Since it's a CTE, it has to have the date column as well as the sum and ID columns, meaning I have to group by date AND ID. That will cause my results to be grouped by ID and date, giving me not a single sum over the date range, but a bunch of sums, one for each day.

To make it simple, say we have:

create table orders (  
 id int primary key,  
 itemID int foreign key references items.id,  
 datePlaced datetime,  
 salesRep int foreign key references salesReps.id,  
 price int,  
 amountShipped int);

Now, we want to get the total money a given sales rep made during a fiscal year, broken down by item. That is, ignoring the fiscal year bit:

select itemName, sum(price) as totalSales, sum(totalShipped) as totalShipped
 from orders
 join items on items.id = orders.itemID
 where orders.salesRep = '1234'
 group by itemName

Simple enough. But when you add anything else, even the price, the query spits out way more rows than you wanted.

    select itemName, price, sum(price) as totalSales, sum(totalShipped) as totalShipped
     from orders
 join items on items.id = orders.itemID
 where orders.salesRep = '1234'
 group by itemName, price

Now, each group is (name, price) instead of just (name). This is kind of sudocode, but in my database, just this change causes my result set to jump from 13 to 32 rows. Add to that the date range, and you really have a problem:

select itemName, price, sum(price) as totalSales, sum(totalShipped) as totalShipped  
 from orders  
 join items on items.id = orders.itemID  
 where orders.salesRep = '1234'  
  and orderDate between 150101 and 151231  
 group by itemName, price  

This is identical to the last example. The trouble is making it a CTE:

with totals as (  
 select itemName, price, sum(price) as totalSales, sum(totalShipped) as totalShipped, orderDate as startDate, orderDate as endDate  
  from orders  
   join items on items.id = orders.itemID  
  where orders.salesRep = '1234'  
   and orderDate between startDate and endDate  
 group by itemName, price, startDate, endDate  
)  
select totals_2015.itemName as itemName_2015, totals_2015.price as price_2015, ...  
 totals_2016.itemName as itemName_2016, ...  
 from (  
 select * from totals  
 where startDate = 150101 and endDate = 151231  
) totals_2015  
 join (  
 select *  
 from totals  
 where startDate = 160101 and endDate = 160412  
) totals_2016  
on totals_2015.itemName = totals_2016.itemName  

Now the grouping in the CTE is way off, more than adding the price made it. I've thought about breaking the price query into its own subquery inside the CTE, but I can't escape needing to group by the dates in order to get the date range. Can anyone see a way around this? I hope I've made things clear enough. This is running against an IBM iSeries machine. Thank you!

AH16
  • 337
  • 1
  • 2
  • 13
  • Why do you think `Since it's a CTE, it has to have the date column` is true? – Dan Bracuk Apr 12 '16 at 16:57
  • Sorry, I didn't make that clear. The CTE will be used to get multiple years, as in the last example. This way, the same CTE can be used to get two, three, four, or more years simply by giving it different dates. This will be outputted to a table on a website, with one row per itemName and as many columns as years, times the count of (price, totalShipped, totalSales). – AH16 Apr 12 '16 at 17:01
  • Maybe a temporary table or view is a better approach than a cte. By the way, you are more likely to receive help if you take the time and effort to format your question properly, ie, format your code as code. – Dan Bracuk Apr 12 '16 at 17:08
  • If I use a temporary table, wouldn't I still have the problem of having to group by dates? As mentioned, the idea is that this "totals" will be used two or more times in a single query. As to formatting, thanks for the reminder. That should now look better. – AH16 Apr 12 '16 at 17:26
  • A temporary table with both the date, and the fiscal year strikes me as a good approach. – Dan Bracuk Apr 12 '16 at 18:04
  • I'm still not sure how a temporary table will eliminate the problem, though. The question boils down to how I can select the sum of a column for a given date range, while not grouping by that date range. Perhaps I'm just not picturing this correctly? What might the temporary table look like? – AH16 Apr 12 '16 at 18:15

3 Answers3

0

Depending on what you are looking for, this might be a better approach:

select 'by sales rep' breakdown
, salesRep
, '' year
, sum(price * amountShipped) amount
from etc
group by salesRep

union

select 'by sales rep and year' breakdown
, salesRep
, convert(char(4),orderDate, 120)  year
, sum(price * amountShipped) amount
from etc
group by salesRep, convert(char(4),orderDate, 120)

etc
Dan Bracuk
  • 20,699
  • 4
  • 26
  • 43
  • Sorry, but I don't follow the example. How would I limit the sum to a certain date range, and wouldn't this give me duplicate rows? Or at least, two rows per result? – AH16 Apr 12 '16 at 20:25
0

When possible group by the id columns or foreign keys because the columns are indexed already you'll get faster results. This applies to any database.

with cte as (
select id,rep, sum(sales) sls, count(distinct itemid) did, count(*) cnt from sommewhere 
where date between x and y
group by id,rep
) select * from cte order by rep

or more fancy

with cte as (
select id,rep, sum(sales) sls, count(distinct itemid) did, count(*) cnt from sommewhere 
where date between x and y
group by id,rep
) select * from cte join reps on cte.rep = reps.rep order by sls desc
danny117
  • 5,581
  • 1
  • 26
  • 35
  • Thanks for the answer, but you hard-coded the dates in the CTE. My problem is that I want to be able to use the CTE to get these sums between any two dates. That means putting two date columns in the select, since the CTE can only be given dates by selecting dates, and that means grouping by dates. Unless I've badly misunderstood something (very possible)? – AH16 Apr 13 '16 at 19:25
  • You fill in the x and y at run time You can group by id, rep ,year(date) ,month(date) in the CTE block . Go for it you can do it. But always remember group by something that has a key built over it as the first elelement in the group by clause and your queries will run fast – danny117 Apr 14 '16 at 14:05
  • I'm using a CTE to avoid adjusting at runtime. For instance, I want to call it twice, once for one date range and again for a different one. Or this might end up serving a webpage, so the dates would have to come from a form. Using a form would be easy since I could just replace the values, but I still couldn't use it more than once. – AH16 Apr 14 '16 at 16:21
  • Take out the x and y and add a year(date) , month(date) to the group by and selected columns. Go for it. – danny117 Apr 14 '16 at 19:42
  • Which would group by month, which is fine. Where I'm still not seeing the answer is how the CTE gets the date range passed into it. If I wanted April-July, I'd have to tell the CTE that somehow. The only way is to say "where cte.date between x and y", but that means putting a "date" in the CTE's select. That, then, means I'd have to group by the date column, not just the year and month of the date. Or am I wrong? I'm probably wrong. – AH16 Apr 14 '16 at 21:54
0

I eventually found a solution, and it doesn't need a CTE at all. I wanted the CTE to avoid code duplication, but this works almost as well. Here's a thread explaining summing conditionally that does exactly what I was looking for.

AH16
  • 337
  • 1
  • 2
  • 13