0

When I have an "order by" clause inside of a hive query, for example:

SELECT *
FROM categories
ORDER BY category_name

The results will be sorted as all the capital letters first and then all the lower letters. I need some table constraint or configuration to enforce the below behavior. A session sorting with UPPER/LOWER won't help.

Current results:

AAA
KKK
ZZZ
aaa
bbb
yyy

Expected results:

aaa
AAA
bbb
KKK
yyy
ZZZ

Is there any configuration which enforces hive to sort the data Alphabetical sorting first?

Within sql it's a collation. Within oracle it's LTS.

What is the right configuration for this kind of expected sorting results, and where to set it?

mkrieger1
  • 19,194
  • 5
  • 54
  • 65
David Yaffe
  • 23
  • 1
  • 5

2 Answers2

1

How about just using lower()?

SELECT *
FROM categories
ORDER BY LOWER(category_name);

Note: this will be arbitrary about the case of the result. Because lower-case letters come after upper case in all modern collations, you could do:

SELECT c.*
FROM categories c
ORDER BY LOWER(c.category_name), c.category_name DESC;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • I have other tool that query the tables. So i want some table constraint/ configuration to enforce it to behave like i mention. – David Yaffe Feb 22 '17 at 15:15
0

In order to implement the alphabetical sorting or any kind of sorting you can use cluster by in your query.

SELECT *
FROM categories
cluster BY LOWER(category_name);

You can alternatively use the distribute by with sort by option for more customized solution. SELECT * FROM categories DISTRIBUTE BY LOWER(category_name) SORT BY LOWER(category_name) DESC

Amit khandelwal
  • 504
  • 5
  • 6