1

I noticed a weird Hive behaviour. It seems to not order columns right. this what I did:

select TAUX_REMU_RESEAU from t where id='000000010302'; // returns 7423.00

select TAUX_REMU_RESEAU from t order by TAUX_REMU_RESEAU desc; // and the first element was 9.60

How is this possible? Any explanation on how does Hive order its number columns please? Thank you.

leftjoin
  • 36,950
  • 8
  • 57
  • 116
Haha
  • 973
  • 16
  • 43

2 Answers2

0

It depends on which type column is.

If it is numeric type, their values compared:

with your_table as (
select stack(2, 7423.00, 9.60) as TAUX_REMU_RESEAU  
)

select * from your_table order by TAUX_REMU_RESEAU desc;

Result:

your_table.taux_remu_reseau
7423
9.6

If it is string:

with your_table as (
select stack(2, '7423.00', '9.60') as TAUX_REMU_RESEAU  
)

select * from your_table order by TAUX_REMU_RESEAU desc;

Result:

your_table.taux_remu_reseau
9.60
7423.00

Strings are compared lexicographically:

This is the definition of lexicographic ordering.

If two strings are different, then either they have different characters at some index that is a valid index for both strings, or their lengths are different, or both. If they have different characters at one or more index positions, let k be the smallest such index; then the string whose character at position k has the smaller value, as determined by using the < operator, lexicographically precedes the other string. In this case, compareTo returns the difference of the two character values at position k in the two string - that is, the value: this.charAt(k)-anotherString.charAt(k)

If there is no index position at which they differ, then the shorter string lexicographically precedes the longer string. In this case, compareTo returns the difference of the lengths of the strings - that is, the value: this.length()-anotherString.length()

See the String.compareTo source code for better understanding:

public int compareTo(String anotherString) {

        int len1 = value.length;

        int len2 = anotherString.value.length;

        int lim = Math.min(len1, len2);

        char v1[] = value;

        char v2[] = anotherString.value;


        int k = 0;

        while (k < lim) {

            char c1 = v1[k];

            char c2 = v2[k];

            if (c1 != c2) {

                return c1 - c2;

            }

            k++;
        }

    return len1 - len2;
}

compareTo method returns after comparing characters at 0 position (k=0): 7 and 9.

leftjoin
  • 36,950
  • 8
  • 57
  • 116
-1

Presumably, the id is stored as a number of some sort -- an integer, decimal or whatever.

And 9.60 < 000000010302 as a number.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • I think you misread the question. The first query proves that I have a TAUX_REMU_RESEAU = 7423.00 somewhere in my table. The second query which is supposed to order all TAUX_REMU_RESEAU in my table shows that the max one is equal to 9.60. Where did 7423.00 go? – Haha Jan 10 '20 at 14:28