0

I have a hive table with field 'a'(int), 'b'(string), 'c'(bigint), 'd'(bigint) and 'e'(string).
I have data like:

a  b  c   d   e
---------------
1  a  10  18  i
2  b  11  19  j
3  c  12  20  k
4  d  13  21  l
1  e  14  22  m
4  f  15  23  n
2  g  16  24  o
3  h  17  25  p

Table is sorted on key 'b'.
Now we want output like below:

a  b  c   d   e
---------------
1  e  14  22  m
4  f  15  23  n
2  g  16  24  o
3  h  17  25  p

which will be deduped on key 'a' but will keep last(latest) 'b'.

Is it possible using Hive query(HiveQL)?

chanchal1987
  • 2,320
  • 7
  • 31
  • 64

1 Answers1

1

If column b is unique, Try follow hql:

select 
* 
from
(
    select max(b) as max_b
    from
    table
    group by a
) table1
join table on table1.max_b = table.b
pensz
  • 1,871
  • 1
  • 13
  • 18