I’m trying to load data from a table with 1 column partitioned to a new table that has 2 partitioned columns, with the newer partitioned column being a regular column from the first table.
For example the create table statements (simplified and changed for understanding):
CREATE EXTERNAL TABLE `house`(
`color` string,
`size` string,
`owner` string)
PARTITIONED BY (
`dt` date)
CREATE EXTERNAL TABLE `house_new`(
`size` string,
`owner` string)
PARTITIONED BY (
`dt` date, `color` string)
I have a simple insert query like this:
INSERT OVERWRITE TABLE `house_new` PARTITION (`dt`='2018-06-10’, `color`)
SELECT
`size`,
`owner`,
`color`
FROM house
WHERE `dt`='2018-06-10';
So this runs all the way and says the job was successfully completed in both terminal and the Tez UI. Even during the whole process I can see files actually being populated in the tmp/staging folders on S3 with all the correct partitions. However when it finally finishes no files are actually written and all the temporary files are deleted.
Hard coding the partitions (static partitioning) works, and also dynamic partitioning from a 1 partitioned column table to another 1 partitioned column table works. Going from a 1 partitioned column table and trying to derive the 2nd partition from the 1st table does not work.