0

I am trying to write data into a hive table and failing. I get a error at the end of Cycle_dt =null and only one partition being writing. It is the first day's.

set hive.auto.convert.join=true;
set hive.optimize.mapjoin.mapreduce=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.dynamic.partition=true; 
set mapred.map.tasks = 100;



Insert into table dynamic.dynamic_test_avro_v1 partition(cycle_dt)
Select date_time as CYCLE_TS, case when evar1 is not null or length(trim(evar1)) > 0 then cast(unbase64(substring(evar1,6,12)) as string) end NRNM ,
prop14 as state, evar8 as FLOW_TYPE, prop25 as KEY, pagename PAGE_NM,
partition_dt as cycle_dt from source.std_avro_v1 WHERE
(partition_dt = '2016-10-02' AND  partition_dt < '2016-10-07')
AND (
evar8='google');

I am unsure what is going on here. I have a date range setup to only get those date's as partitions.

1 Answers1

2

From hive documentation:

In the dynamic partition inserts, users can give partial partition specifications, which means just specifying the list of partition column names in the PARTITION clause. The column values are optional. If a partition column value is given, we call this a static partition, otherwise it is a dynamic partition. Each dynamic partition column has a corresponding input column from the select statement. This means that the dynamic partition creation is determined by the value of the input column. The dynamic partition columns must be specified last among the columns in the SELECT statement and in the same order in which they appear in the PARTITION() clause.

So, in your query, partition_dt is the value of the dynamic partition. However, you impose the following constraints: (partition_dt = '2016-10-02' AND partition_dt < '2016-10-07') which translate to partition_dt = '2016-10-02' and eventually it creates a single partition.

You probably wanted a range of dates: (partition_dt >= '2016-10-02' AND partition_dt < '2016-10-07')

Alex Libov
  • 1,481
  • 2
  • 11
  • 20