10

In my hive table, the session field is a string in format like:

ip-sessionID-userID or area-sessionID-userID

There's 3 or 4 fields separated by "-", but userID is always the last one.

i wanna select userID, but how to access the last field? In python, there's something like: arr[-1]

but in hive, how to achieve this? The following SQL seems not correct.

select split(session,"\-")[-1] as user from my_table;

Thanks!

leftjoin
  • 36,950
  • 8
  • 57
  • 116
qiuxiafei
  • 5,827
  • 5
  • 30
  • 43

4 Answers4

19
reverse(split(reverse(session), '-')[0])

Although this might be a bit more expensive than the regex solution ;)

arno_v
  • 18,410
  • 3
  • 29
  • 34
2

Because Non-constant expressions for array indexes not supported in hive.

There will be some other ways to solve your problem:

  1. use regexp_extract, such as :

    select regexp_extract(session, '(\-[^\-]+)', 1) as user from my_table;

  2. use custom hive function : example and document could be found in hive document

pensz
  • 1,871
  • 1
  • 13
  • 18
1

One more method without reverse, using array size()-1. This method is more efficient because it does not produce intermediate reversed string, it works only with array.

select array[size(array)-1] as userID
from
( select split(session,'-') array from your_table ) s;
leftjoin
  • 36,950
  • 8
  • 57
  • 116
0

I think the following would be faster than the solution from @arno_v, because it only reverse the view array elements, instead of the whole string, and only reverse once:

reverse(split(session, '-'))[0]
Chris
  • 115
  • 9