7

I have begun working some of the programming problems on HackerRank as a "productive distraction".

I was working on the first few in the SQL section and came across this problem (link):

Query the two cities in STATION with the shortest and 
longest CITY names, as well as their respective lengths 
(i.e.: number of characters in the name). If there is 
more than one smallest or largest city, choose the one 
that comes first when ordered alphabetically.

Input Format

The STATION table is described as follows:

enter image description here

where LAT_N is the northern latitude and LONG_W is 
the western longitude.

Sample Input

Let's say that CITY only has four entries: 
1. DEF
2. ABC
3. PQRS
4. WXY

Sample Output

ABC 3
PQRS 4

Explanation

When ordered alphabetically, the CITY names are listed 
as ABC, DEF, PQRS, and WXY, with the respective lengths
3, 3, 4 and 3. The longest-named city is obviously PQRS, 
but there are  options for shortest-named city; we choose 
ABC, because it comes first alphabetically.

I agree that this requirement could be written much more clearly, but the basic gist is pretty easy to get, especially with the clarifying example. The question I have, though, occurred to me because the instructions given in the comments for the question read as follows:

/*
Enter your query here.
Please append a semicolon ";" at the end of the query and 
enter your query in a single line to avoid error.
*/

Now, writing a query on a single line doesn't necessarily imply a single query, though that seems to be the intended thrust of the statement. However, I was able to pass the test case using the following submission (submitted on 2 lines, with a carriage return in between):

SELECT TOP 1 CITY, LEN(CITY) FROM STATION ORDER BY LEN(CITY), CITY;
SELECT TOP 1 CITY, LEN(CITY) FROM STATION ORDER BY LEN(CITY) DESC, CITY;

Again, none of this is advanced SQL. But it got me thinking. Is there a non-trivial way to combine this output into a single results set? I have some ideas in mind where the WHERE clause basically adds some sub-queries in an OR statement to combine the two queries into one. Here is another submission I had that passed the test case:

SELECT 
    CITY, 
    LEN(CITY) 
FROM 
    STATION 
WHERE 
    ID IN (SELECT TOP 1 ID FROM STATION ORDER BY LEN(CITY), CITY) 
OR 
    ID IN (SELECT TOP 1 ID FROM STATION ORDER BY LEN(CITY) DESC, CITY)
ORDER BY 
    LEN(CITY), CITY;

And, yes, I realize that the final , CITY in the final ORDER BY clause is superfluous, but it kind of makes the point that this query hasn't really saved that much effort, especially against returning the query results separately.

Note: This isn't a true MAX and MIN situation. Given the following input, you aren't actually taking the first and last rows:

Sample Input
1. ABC
2. ABCD
3. ZYXW

Based on the requirements as written, you'd take #1 and #2, not #1 and #3.

This makes me think that my solutions actually might be the most efficient way to accomplish this, but my set-based thinking could always use some strengthening, and I'm not sure if that might play in here or not.

Vikash Pandey
  • 5,407
  • 6
  • 41
  • 42
mbm29414
  • 11,558
  • 6
  • 56
  • 87
  • Use a union? or is that not non-trivial? – xQbert Aug 24 '16 at 14:18
  • Perhaps I could have been clearer myself in the wording of my question. ;-) I guess what I'm asking is whether there is any way to avoid writing 2 completely separate and almost parallel WHERE clauses/criteria. I'm also thinking this requirement might be sufficiently complex that it doesn't exist, but I'm trying to expand my knowledge, thus the question. – mbm29414 Aug 24 '16 at 14:20
  • How about using analytic functions without the top? – xQbert Aug 24 '16 at 14:21

6 Answers6

2

Untested as well, but I don't see a reason for it not to work:

SELECT *
FROM (
    SELECT TOP (1) CITY, LEN(CITY) AS CITY_LEN
    FROM STATION
    ORDER BY CITY_LEN, CITY
    ) AS T
UNION ALL
SELECT *
FROM (
    SELECT TOP (1) CITY, LEN(CITY) AS CITY_LEN
    FROM STATION
    ORDER BY CITY_LEN DESC, CITY
    ) AS T2;

You cant have UNION ALL with ORDER BY for each SELECT statement, but you can workaround it by using subqueries togeter with TOP (1) clause and ORDER BY.

Evaldas Buinauskas
  • 13,739
  • 11
  • 55
  • 107
2

Here's another alternative. I think it's pretty straight forward, easy to understand what's going on. Performance is good.

Still has a couple of sub-queries though.

select 
   min(City), len(City)
from Station 
group by 
   len(City)
having 
   len(City) = (select min(len(City)) from Station)
   or 
   len(City) = (select max(len(City)) from Station)
jim31415
  • 8,588
  • 6
  • 43
  • 64
1

UNTESTED:

WITH CTE AS (
Select ID, len(City), row_number() over (order by City) as AlphaRN,
row_number() over (order by Len(City) desc) as LenRN) B
Select * from cte 
Where AlphaRN = 1 and (lenRN = (select max(lenRN) from cte) or 
                       lenRN = (Select min(LenRN) from cte))
xQbert
  • 34,733
  • 2
  • 41
  • 62
  • I get the overall logic here, and it appears to me that it would work. Is the `B` necessary? Also, while I appreciate the creativity of your solution (always good to get a different perspective!), I think your `UNION` idea might be cleaner and less complex. – mbm29414 Aug 24 '16 at 14:30
1

This is what I came with. I tried to use only one query, without CTE's or sub-queries.

;WITH STATION AS ( --Dummy table
SELECT *
FROM (VALUES
(1,'DEF','EU',1,9),
(2,'ABC','EU',1,6), -- This is shortest 
(3,'PQRS','EU',1,5),
(4,'WXY','EU',1,4),
(5,'FGHA','EU',1,2),
(6,'ASDFHG','EU',1,3) --This is longest 
) as t(ID, CITY, [STATE], LAT_N,LONG_W)
)


SELECT TOP 1 WITH TIES  CITY,
                        LEN(CITY) as CITY_LEN
FROM STATION
ORDER BY ROW_NUMBER() OVER(PARTITION BY LEN(CITY) ORDER BY LEN(CITY) ASC),
        CASE WHEN MAX(LEN(CITY)) OVER (ORDER BY (SELECT NULL)) = LEN(CITY) 
                OR MIN(LEN(CITY)) OVER (ORDER BY (SELECT NULL))= LEN(CITY) 
                        THEN 0 ELSE 1 END

Output:

CITY    CITY_LEN
ABC     3
ASDFHG  6
gofr1
  • 15,741
  • 11
  • 42
  • 52
1

Here's the best I could come up with:

with Ordering as
(
    select
        City,
        Forward = row_number() over (order by len(City), City),
        Backward = row_number() over (order by len(City) desc, City)
    from
        Station
)
select City, len(City) from Ordering where 1 in (Forward, Backward);

There are definitely a lot of ways to approach this as evidenced by the variety of answers, but I don't think anything beats your original two-query solution in terms of cleanly and concisely expressing the intended behavior. Interesting question, though!

Joe Farrell
  • 3,502
  • 1
  • 15
  • 25
-1

select min(CITY), length(CITY) from STATION group by length(CITY) having length(CITY) = (select min(length(CITY)) from STATION) or length(CITY) = (select max(length(CITY)) from STATION);

Aravindh V
  • 11
  • 1