0

Let's say I have a table called COFFEE showing Bus Stations and all of the coffee shops within 10 blocks of the bus station:

    BusStationID| CoffeeShopID |  Distance (in city blocks)
    1|2103|2
    1|2222|2
    1|8864|7
    1|9920|5
    1|3544|2
    1|4830|2
    1|4823|6
    1|9561|2
    7|6262|2
    7|8561|10
    7|9510|5
    7|2744|1       
    7|4223|9
    7|5960|3

[EDITED: to make clear that the question is how to do this with a query not procedurally]

And I have to write a query (not a proc) to show for each bus-station, the average distance to the five closest coffee shops.

I can get the top 5 closest coffee shops for a particular bus-station:

           select avg(top5.distance) as AvgDistToFiveClosest
           from
           (
           select top 5 distance from COFFEE where busstationid = 1
           order by distance
           ) as top5

But how do I connect that as a subquery and make AvgDistToFiveClosest a column returned in my main query:

        select BusStationId,  AvgDistToFiveClosest
        from COFFEE...
         ??????

Given the sample data above, the query should return:

     BusStationID | AvgDistToFiveClosest
           1 | 2
           7 | 4
Tim
  • 8,669
  • 31
  • 105
  • 183

3 Answers3

2

This is Oracle (9g+) SQL code, corrected, I found an answer for a single select statement

with
distanceRanks as
(
  SELECT
    busstationid,
    distance,
    --rank the rows on the distance column from smallest to longest, and differentiate equal distances by rownum
    rank() over ( partition by busstationid
                  order by distance, rownum asc) as ranking
  FROM coffee
  ORDER BY 1 asc
)
SELECT busstationid, avg(distance)
FROM distanceRanks
WHERE ranking < 6
group by busstationid;
EplusL
  • 213
  • 1
  • 6
  • 14
  • Is there a way to do it non-procedurally, without loop control? I've edited the question to make that clearer. – Tim Aug 06 '12 at 19:15
  • The answer is updated for a single select statement. This is Oracle syntax; as there is no `top x` clause in oracle there is a need to rank the initial result set, and the code is cleaner when using `with ...as { }` syntax. Also this syntax is often overlooked but very powerful and very fast in Oracle, which is optimized for this kind of statement. – EplusL Aug 07 '12 at 14:30
  • +1 : Works in T-SQL too, though you need to check with the op as to whether RANK (ties allowing more than 5 entries to be averaged) or ROW_NUMBER should be used. *(At present you make an undocumented assumption)* – MatBailie Aug 07 '12 at 19:00
  • @Dems what do you mean about me making an assumption? OP specifically asked for the top 5 to be averaged, and I gave him code that does specifically that. – EplusL Aug 07 '12 at 19:32
  • With RANK() multiple items can all be tied at the same rank. This means that `RANK() < 6` can return more than 5 records. This *may* be correct behaviour according to the OPs needs. If the OP wants 5 records to be averaged, even in the event of ties, then ROW_NUMBER() is required. I apologise of I worded that poorly, I meant no offense (thus the +1). Only to highlight the different behaviours. – MatBailie Aug 07 '12 at 20:07
  • Yes it's true, but 2 things come to my mind: 1) the query will only grab more than 5 rows if and only if more than 5 rows have the same distance, and the average won't change anyway, meaning in the only case where more than 5 rows are grabbed the average doesn't change. If any less than 6 rows have an identical distance only 5 rows will be grabbed as specified. 2) And on top of that if you really do want to limit it to 5 rows all you need to do is add the rownum to the ranking order by clause to differentiate those rows and only grab 5. Problem solved. I edited the code in consequence. – EplusL Aug 08 '12 at 13:46
1

Try this:

SELECT c.BusStationID, AVG(c.distance)
FROM COFFEE c
WHERE c.CoffeeShopID IN 
(SELECT TOP 5 c2.CoffeeShopID FROM COFFEE c2 WHERE c2.BusStationID = c.BusStationID
ORDER BY c2.distance)
GROUP BY c.BusStationID
javcek
  • 696
  • 6
  • 3
0

Try this

Select distinct busstationid , (select top 5 avg(distance) from coffee ce where ce.busstationid = b.busstationid order by distance) as AvgDistToFiveClosest
From coffee b
ccStars
  • 817
  • 2
  • 11
  • 34