8

I'm struggling with a query where I need to SUM DISTINCT Rows. There has to be a way to do this... but I'm lost.

Here's what I've got:

SELECT DISTINCT Zipcodes.CountyID,
us_co_est2005_allData.PopEstimate2005, 
us_co_est2005_allData.EstimatesBase2000,
users_link_territory.userID
FROM
Zipcodes Inner Join Users_link_territory ON zipcodes.CountyID = 
Users_link_territory.CountyID Inner Join
us_co_est2005_alldata ON zipcodes.FIPS = us_co_est2005_alldata.State AND zipcodes.code
= us_co_est2005_alldata.County
WHERE (users_link_territory.userid = 4)

This gives me the 34 rows which provide distinct population numbers for each county belonging to userid4, but how would I get the SUM of PopEstimate2005 and EstimatesBase2000?

Something like (but this isn't a legal query):

SELECT DISTINCT Zipcodes.CountyID,
SUM(us_co_est2005_allData.PopEstimate2005) AS Population2005, 
SUM(us_co_est2005_allData.EstimatesBase2000) AS Population2000,
users_link_territory.userID
FROM
Zipcodes Inner Join Users_link_territory ON zipcodes.CountyID = 
Users_link_territory.CountyID Inner Join
us_co_est2005_alldata ON zipcodes.FIPS = us_co_est2005_alldata.State AND zipcodes.code
= us_co_est2005_alldata.County
WHERE (users_link_territory.userid = 4)
GROUP BY users_link_territory.userid

Of course, as soon as I add Zipcodes.CountyID to the end of the GroupBy, I'm back with my 34 rows again.

Thanks so much for any help.

Russell Schutte . . . . .

After getting the below help - in particular Robb's help - I was able to get what I really wanted - a total of each UserID's population details in a single query:

SELECT     SUM(POPESTIMATE2005) AS Expr1, SUM(ESTIMATESBASE2000) AS Expr2, UserID
FROM         (
    SELECT DISTINCT zipcodes.CountyID, us_co_est2005_alldata.POPESTIMATE2005, us_co_est2005_alldata.ESTIMATESBASE2000, users_link_territory.UserID
    FROM          zipcodes INNER JOIN
    users_link_territory ON zipcodes.CountyID = users_link_territory.CountyID INNER JOIN
    us_co_est2005_alldata ON zipcodes.FIPS = us_co_est2005_alldata.STATE AND zipcodes.Code = us_co_est2005_alldata.COUNTY
    ) As FOO
GROUP BY UserID

Thanks everyone who contributed!

Russell Schutte

Russell Schutte
  • 237
  • 2
  • 4
  • 12

3 Answers3

8

If you just want an overall figure for it try

select sum(PopEstimate2005), sum(EstimatesBase2000)
from(
    SELECT  Distinct
        Zipcodes.CountyID, 
        us_co_est2005_allData.PopEstimate2005, 
        us_co_est2005_allData.EstimatesBase2000, 
        users_link_territory.userID 
    FROM 
        Zipcodes Inner Join 
        Users_link_territory ON zipcodes.CountyID = Users_link_territory.CountyID Inner Join 
        us_co_est2005_alldata ON zipcodes.FIPS = us_co_est2005_alldata.State AND zipcodes.code = us_co_est2005_alldata.County 
    WHERE 
        (users_link_territory.userid = 4)
) as foo
Robb
  • 3,801
  • 1
  • 34
  • 37
  • 2
    Hi Robb - Thanks for the quick reply. You're almost there... and I figured it out from your reply. Anyone trying to follow this or look it up later - Simply add the keyword DISTINCT back in to the inner SELECT, before the Zipcodes.CountyID. I don't know if there's a way to make the GroupBy work, but this is it! Thank you Robb! – Russell Schutte Dec 16 '10 at 20:26
  • @user541022 Glad it worked for you, I guess i must have mangled the code slightly while reformatting it, edited to include the missing distinct! – Robb Dec 16 '10 at 20:33
  • Thanks Robb... I had tried doing a subselect like that... but was fouled up when I got the "Incorrect Syntax" message about the last parenthesis. I never knew to add the "as foo". That's awesome. When do I need it? – Russell Schutte Dec 16 '10 at 20:37
  • @user541022 the `as foo` gives the table an alias, sqlserver throws a wobbly when you dont name your subselects. A hint you might find keeps rsi at bay is that you can alias any table in your from clause with a shorter name ie `us_co_est2005_alldata as dta` then reference it in your select clause as `dta.popestimate2005` – Robb Dec 16 '10 at 20:54
3

Use GROUP BY along with the SUM() and COUNT() aggregates.

SELECT count(*) as totalRows, Zipcodes.CountyID,
   sum(us_co_est2005_allData.PopEstimate2005) as SumPopEstimate2005, 
   sum(us_co_est2005_allData.EstimatesBase2000) as SumEstimatesBase2000,
  users_link_territory.userID

FROM
  Zipcodes Inner Join Users_link_territory ON zipcodes.CountyID = 
  Users_link_territory.CountyID Inner Join
  us_co_est2005_alldata ON zipcodes.FIPS = us_co_est2005_alldata.State AND zipcodes.code = us_co_est2005_alldata.County

WHERE (users_link_territory.userid = 4)

GROUP BY  Zipcodes.CountyID,users_link_territory.userID

Depending on your db server, this will be more efficient than doing a sub-select.

Byron Whitlock
  • 52,691
  • 28
  • 123
  • 168
2

An easy answer is use "group by". Group by has the same effect with the same fields as distinct, but allows you to use aggregate functions. You can add a "Having" clause after the group by to filter what records you would like to see.

Zachary Scott
  • 20,968
  • 35
  • 123
  • 205