0

I have a partitioned hive table partitioned on column 'part'. The table has two partition values part='good' and part='bad'.

I need to move a record from 'bad' partition into 'good' partition and overwrite 'bad' partition to remove that moved record. To complicate this, I am looking for a way to do it in a single query as exception handling would be difficult otherwise.

I tried to do it with multi-table insert having two insert queries on the same table as below,

from tbl_partition
insert into tbl_partition partition (part='good') select a,b,c where a='a' and part='bad' -- this is where a record is moved from bad to good
insert overwrite table tbl_partition partition (part='bad') select a,b,c where part='bad' and a not in ('a'); -- Overwrite the bad partition excluding already moved record

But the above query always does an insert into, rather than one insert and the other insert overwrite!!

I even tried with a common table expression and used the common table to insert simultaneously into this table with no luck!

Is there any other way this can be achieved in a single query or am I doing something wrong in the above step?

Please note that I am doing this on a HDP cluster with hive 1.2

Adiga
  • 121
  • 6
  • Create an empty partition `newbad`; multi-insert to `good` and `newbad`; drop `bad`; rename `newbad` to `bad`. 4 commands. But it works. – Samson Scharfrichter Oct 31 '18 at 19:47
  • Nice idea. I can improvise on this idea as i had not completely explained the use case above. Thank you. However, exception handling could be challenging here as multiple steps are involved here! – Adiga Nov 05 '18 at 05:50
  • With distributed systems, **everything** is "challenging" -- and even when the framework tries to hide the complexity, sometimes it hits an edge case (or a bug) and explodes in your hands... Never assume you have a safe path, get ready for manual recovery at any step of any job. – Samson Scharfrichter Nov 05 '18 at 08:38

0 Answers0