3

Imagine I have a db table of Customers containing {id,username,firstname,lastname}

If I want to find how many instances there are of different firstnames I can do:

select firstname, count(*) from Customers group by 2 order by 1;

   firstname | count(*)
   ====================
   bob       |   1
   jeff      |   2
   adam      |   5

How do I count the number of firstnames that occur more than once? In pseudo-sql it would be something like:

select
    COUNT(
        firstname,
        count(*) as num_occurrences
    )
from 
    Customers 
group by 2 
having num_occurrences > 1;
Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
Andy Longwill
  • 624
  • 9
  • 25

3 Answers3

6

You have the right idea:

SELECT COUNT(*)
FROM (
  SELECT firstname
  FROM Customers
  GROUP BY firstname
  HAVING COUNT(*) >= 2
)

The subquery counts the first names that have occurred more than once. Then you count all of those first names. The HAVING clause allows you to filter by aggregates. It's like a WHERE clause, except you can use aggregate functions.

Welbog
  • 59,154
  • 9
  • 110
  • 123
2

There is no need for a subquery.

Try:

SELECT firstname, COUNT(*)
  FROM Customers
 GROUP BY firstname
HAVING COUNT(*) > 1
 ORDER BY firstname

Or, order by the most represented name:

SELECT firstname, COUNT(*) AS custcount
  FROM Customers
 GROUP BY firstname
HAVING COUNT(*) > 1
 ORDER BY custcount DESC;
Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
IniTech
  • 781
  • 7
  • 20
  • Should be referencing `username`, not `firstname`. It also isn't providing the requested answer - the count of the number of usernames which have more than one row. If an INTO TEMP followed by SELECT from the temp table doesn't count as a sub-query, you can do that - but otherwise... – Jonathan Leffler Sep 02 '09 at 21:26
  • @Jonathon: The output isn't going to make sense without showing the username the count applies to. – OMG Ponies Sep 03 '09 at 00:04
  • @rexem: the question asks for the number of customers with more than one entry in the table with the same username. It does not ask for the usernames, per se; it asks for the number of such usernames. It is a classic aggregate-of-aggregates query, and the sub-query-in-the-FROM-clause is the best way to answer it. – Jonathan Leffler Sep 03 '09 at 02:41
  • @IniTech: looking at the question again, it seems to be a bit confused between username and firstname. The sample query selects on firstname, but the putative query result lists username. – Jonathan Leffler Sep 03 '09 at 02:43
1

This would do it:

select count(username)
  from (select username
          from Customers
         group by username
        having count(*) > 1);
Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
Asaph
  • 159,146
  • 25
  • 197
  • 199
  • NB: question was confused about username vs firstname - this answer used 'username' where the revised question goes with 'firstname'. – Jonathan Leffler Oct 15 '09 at 16:34