-1

Hello I am very new to Hive and was learning WINDOWING functionality of Hive. I came across a problem.

I was trying to find the lowest closing price for each stock ticker (Each ticker have 22 records and I wanted to find the lowest) I wrote a Query:

SELECT ticker, close FROM 
   (SELECT ticker, close, RANK() OVER (PARTITION BY ticker) AS rank FROM stocks) AS p 
WHERE rank = 1 LIMIT 10;

I got the result

ticker  close
   A    28.15
   A    27.93
   A    28.82
   A    27.84
   A    28.29
   A    28.46
   A    27.58
   A    28.73
   A    29.82
   A    29.3

But I wanted one for each ticker.

Then I ran the same query but added the ORDER BY clause

SELECT ticker, close FROM 
 (SELECT ticker, close, RANK() OVER (PARTITION BY ticker ORDER BY close) AS rank FROM stocks) AS p 
 WHERE rank = 1 LIMIT 20;

And I got the ideal result.

ticker  close
   A    27.16
  AA    10.57
  AAPL  247.64
  ABC   28.71
  ABT   48.68
 ACE    52.43
 ADBE   27.36
 ADI    28.07
 ADM    27.0
 ADP    39.4

My Question here is how is this order by grouping tickers?

Dereck
  • 95
  • 3
  • 12

1 Answers1

0

There is no order if you don't specify the "order by" clause, for this reason all the elements in the group have the same RANK, if you want different ranks you have to use DENSE_RANK function.

hlagos
  • 7,690
  • 3
  • 23
  • 41
  • why is the rank same within the group, actually when I ran rank and dense rank everyone have value 1 for both rank and dense rank? – Dereck Jul 17 '17 at 21:29
  • can you create a insert script with the values to replicate the issue? – hlagos Jul 18 '17 at 13:12