1

I have a table with 3 columns Item, City and VisitNumber.

In this table there are a lot of records and I need to select top 20 items of each city (according to their visit numbers).

For example if there are 100 items in Washington and 250 items in New York and 500 items in Los Angeles, I need to select 60 most-visited rows for each city (20 for Washington, 20 for New York, 20 for Los Angeles)

How is it possible in TSQL?

ekad
  • 14,436
  • 26
  • 44
  • 46
Payam Sh
  • 581
  • 4
  • 9
  • 21

2 Answers2

20

The easiest way would be to use the row_number() window function to number the rows for each city according to their visitnumber descending and use that as a filter. This query should work in any SQL Server version from 2005 onwards.

select * 
from (
    select *, r = row_number() over (partition by City order by VisitNumber desc) 
    from your_table
    ) a 
where r <= 20
  and City in ('Washington', 'New York', 'Los Angeles')

This would select the top 20 items for each city specified in the where clause.

jpw
  • 44,361
  • 6
  • 66
  • 86
  • hey man! Bender is great! Would `and City in (select distinct top(3) city from table order by VisitNumber)` grab the top 3 cities? That way they don't have to specify the citties in each query. I don't know if you can do a distinct and top – Justin Hanley Jul 26 '15 at 16:21
  • Thanks for your answer. What if I don't know the name of cities? because they are being generated automatically. – Payam Sh Jul 26 '15 at 16:22
  • 1
    @PayamSh Then this approach might not work. What should be the criteria to choose what cities get selected? – jpw Jul 26 '15 at 16:26
  • @JustinHanley No it would not work as you can't combine distinct and order by in that way. – jpw Jul 26 '15 at 16:26
  • @PayamSh If there are 4 cities with the same number of visits, and you want top 20 rows from each but just 60 all in all, how do you choose which gets excluded? – jpw Jul 26 '15 at 16:30
  • 1
    You know, the table I provided above was just and example! my real table has more columns. maybe if visit numbers were the same I use another column to determine which rows to get. I just removed "and City in..." from your code and checked it. I think it works without providong city list! – Payam Sh Jul 26 '15 at 16:42
1

In one select:

select top 1 with ties
    Item, City, VisitNumber 
from
    your_table
where
    City in ('Washington', 'New York', 'Los Angeles')
order by 
    case 
        when row_number() over(partition by City order by VisitNumber desc) <= 20 
        then 0 
        else 1 
    end;

A complete answer might look like :

declare @Visists table (
    Item        int, 
    City        varchar(20),
    VisitNumber int
)

insert into @Visists values
(1, 'Washington', 11),
(2, 'Washington', 22),
(3, 'Washington', 33),
(4, 'Washington', 44),
(5, 'Washington', 55),
    (6, 'New York', 66),
    (7, 'New York', 77),
    (8, 'New York', 88),
    (9, 'New York', 99),
    (10, 'New York', 5),
(11, 'Los Angeles', 12),
(12, 'Los Angeles', 23),
(13, 'Los Angeles', 34),
(14, 'Los Angeles', 45),
(15, 'Los Angeles', 56),
    (16, 'Chicago', 9),
    (17, 'Chicago', 7),
    (18, 'Chicago', 3),
    (19, 'Chicago', 2),
    (20, 'Chicago', 9);

declare @TopVisitsPerCity int = 3;
declare @TopOfCities int = 3;

with TopVisitsPerCity as
(
    select top 1 with ties
        Item, City, VisitNumber 
    from
        @Visists
    order by 
        case 
            when row_number() over(partition by City order by VisitNumber desc) <= @TopVisitsPerCity
            then 0 
            else 1 
        end
)
select top (@TopOfCities * @TopVisitsPerCity)
    *
from 
    TopVisitsPerCity
order by 
    sum(VisitNumber) over (partition by City) desc;
Vadim Loboda
  • 2,431
  • 27
  • 44