
How do I write this SAS step in Hive? The table contains duplicates, and I want to exclude the duplicate records from the new table, keeping only the first row for each unique combination of the key columns.

Sample tables

data new;
    set old;
    by Col_1  Col_2date  Col_3date;

    if Col_2date ^=  Col_3date then do;
        if first.Col_3Date ^= 1 then delete;
    end;
run;
FunT
  • So you just want to select the first observation when sorted by those three columns? What does it mean when COL_2DATE is equal to COL_3DATE? – Tom Aug 20 '20 at 20:29
  • Thanks Tom, I added sample table and some explanation. I hope it helps, I am new to this so any tips would be great! – FunT Aug 20 '20 at 20:52

1 Answer


How about the following:

select min(Obs)  as Obs, Col_11, Col_2date, Col_3date
  from your_table
 group by Col_11, Col_2date, Col_3date
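
Note that the GROUP BY above collapses every duplicate, while the SAS step in the question only deletes repeats when Col_2date differs from Col_3date and keeps all rows where the two dates match. If that distinction matters, something along these lines might be closer. It is only a sketch: the table and column names are copied from the query above, and using Obs to decide which row counts as "first" is an assumption.

```sql
-- Sketch only: your_table, Obs and the column names come from the query above.
-- Ordering by Obs to pick the "first" row in each group is an assumption.
SELECT Obs, Col_11, Col_2date, Col_3date
FROM (
  SELECT t.*,
         ROW_NUMBER() OVER (
           PARTITION BY Col_11, Col_2date, Col_3date
           ORDER BY Obs
         ) AS rn
  FROM your_table t
) ranked
WHERE Col_2date = Col_3date   -- SAS keeps every row when the two dates match
   OR rn = 1;                 -- otherwise keep only the first row per group
```

One difference to watch for: SAS treats two missing dates as equal, whereas in Hive `NULL = NULL` is not true, so rows with NULL dates fall through to the `rn = 1` test. Hive's null-safe operator `<=>` could be used in place of `=` if the SAS behaviour is wanted there.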
serge_k