2

I have a table with an identifier column id and another column with string values column_b for which I would like to do customer ordering on column_b. Let's say that column_b consists of values A, B, C, D.

The syntax that can be used in Hive for row_number() over() is:

SELECT id, column_b, row_number() over(partition by id order by column_b) as row_id
FROM   some_table

Example see here for an example

However, I would like to do custom ordering on column_b instead of alphabetic ordering. The above syntax would yield something like:

enter image description here

Instead, I would like to explicitly order by column_b using the order A, C, D, B, that is:

hive

How can I achieve this?

leftjoin
  • 36,950
  • 8
  • 57
  • 116
Anonymous
  • 502
  • 4
  • 23

2 Answers2

3

Explicitly specify the order using case statement. You can use other scalar functions in the order by:

SELECT id, column_b, 
       row_number() over(partition by id order by case column_b
                                                       when 'A' then '1'
                                                       when 'C' then '2'
                                                       when 'D' then '3'
                                                       when 'B' then '4'
                                                       --add more cases
                                                       --for example other values sort
                                                       --in natural order  
                                                       else column_b 
                                                       --or use constant
                                                       --to make sure 
                                                       --everything else is greater than 4 
                                                       --like this  else concat('5',column_b)
                                                  end 
                         ) as row_id
  FROM some_table

Also you can calculate order column in the subquery and use it in the window, it will work the same:

SELECT id, column_b, 
       row_number() over(partition by id order by orderby) as row_id
  FROM (select t.*, 
               case column_b
                    when 'A' then '1'
                    when 'C' then '2'
                    when 'D' then '3'
                    when 'B' then '4'  
                    else concat('5',column_b) 
               end orderby
           from some_table t
        ) s
leftjoin
  • 36,950
  • 8
  • 57
  • 116
0

Please use below simple query,

select id, column_b,row_id from
(select id, column_b,row_id from table order by mod(row_id,2) != 0
union all
select id, column_b,row_id from table order by mod(row_id,2) = 0) qry;
Jim Macaulay
  • 4,709
  • 4
  • 28
  • 53