
Here's the error I'm receiving:

13288 [main] ERROR hive.ql.metadata.Hive  -   MetaException(message:java.lang.IllegalArgumentException: Pathname /apps/hive/warehouse/my_db.db/clog/${clogDataOutputDir}/logmessages.log.${hiveconf:current_date} from  hdfs://hdphio/apps/hive/warehouse/my_db.db/commlog/${commlogDataOutputDir}/logmessages.log.${hiveconf:current_date} is not a valid DFS filename.)

My hive script looks like this:

alter table my_db.clog add partition (create_dt = '${hiveconf:formatted_date}') location '${clogDataOutputDir}/logmessages.log.${hiveconf:current_date}';
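
For reference, with all of the variables resolved, this statement is meant to come out something like the sketch below (the output directory is just an illustrative placeholder; the dates are the values from the run shown further down):

-- Intended resolved form of the statement above (placeholder directory, real run dates).
alter table my_db.clog add partition (create_dt = '2016-02-04')
location '/data/clog/output/logmessages.log.20160204';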

And here is my Oozie workflow:

<workflow-app xmlns="uri:oozie:workflow:0.4" name="hive-add-partition-clog">
    <start to="add_hive_cl_partition"/>
    <action name="add_hive_cl_partition">
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${resourceManager}</job-tracker>
            <name-node>${nameNode}</name-node>
            <job-xml>hive-site-conf.xml</job-xml>
            <script>AddPartition.sql</script>
            <param>formatted_date=${formatted_date}</param>
            <param>current_date=${current_date}</param>
            <file>${nameNode}/user/me/warehouse/etl/clog/clog_hive_daily/lib/hive-site-conf.xml</file>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Hive action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>

The parameters formatted_date and current_date come from the coordinator app. Looking at the application log, I can see that both parameters resolve to the expected values.

Parameters:
 ------------------------
 formatted_date=2016-02-04
 current_date=20160204
 ------------------------

Hive command arguments :
 --hiveconf
 hive.log4j.file=/grid/4/yarn/local/usercache/tchoedak/appcache/application_1454353377151_0855/container_e19_1454353377151_0855_01_000002/hive-log4j.properties
 --hiveconf
 hive.log4j.exec.file=/grid/4/yarn/local/usercache/tchoedak/appcache/application_1454353377151_0855/container_e19_1454353377151_0855_01_000002/hive-exec-log4j.properties
 --hivevar
 formatted_date=2016-02-04
 --hivevar
 current_date=20160204
 -f
 AddPartition.sql
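
Worth noting from that argument list: the Oozie hive action passes each <param> to Hive as a --hivevar argument. As far as I can tell, variables set that way live in the hivevar namespace, so a script references them as ${name} or ${hivevar:name}, while ${hiveconf:name} only resolves configuration properties set with --hiveconf or set. A minimal illustration, assuming a hypothetical invocation with --hivevar current_date=20160204 and --hiveconf my.prop=x (my.prop is made up for the example):

-- Variable substitution namespaces in Hive, given the assumed invocation above.
select '${current_date}';         -- bare name: looked up in the hivevar namespace -> 20160204
select '${hivevar:current_date}'; -- same variable, explicit namespace -> 20160204
select '${hiveconf:my.prop}';     -- hiveconf namespace: only sees --hiveconf / set properties -> x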

My question is: why is Hive treating the params I pass as part of a pathname? Is there a configuration I need to change in my workflow to fix this?

tchoedak
  • My best guess is that the Hive parser fails to resolve the parameter `${clogDataOutputDir}` in *location* and stops there. Then it tries to resolve the *location* clause into an HDFS path -- which is expected -- and fails because of the rogue `{` and `}` characters. Bottom line: you forgot to define one of your parameters, boom. – Samson Scharfrichter Feb 17 '16 at 21:49
  • You were right! I removed that parameter, then ran into another error resolving ${hiveconf:current_date}, which I thought was the proper way to reference a parameter; after changing it to ${current_date} the workflow now runs flawlessly. Thanks! – tchoedak Feb 17 '16 at 23:19
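
Putting those two comments together, the fixed AddPartition.sql presumably ends up shaped like the sketch below: the undefined ${clogDataOutputDir} is dropped (or would otherwise need to be actually defined), and the Oozie params are referenced as bare hivevar names rather than through the hiveconf: prefix. This is only a guess at the final script, not the exact file, and the same prefix change would presumably apply to formatted_date as well.

-- Sketch of the corrected script, assuming ${clogDataOutputDir} was removed entirely;
-- the relative location then resolves under the table's warehouse directory, which is
-- exactly what the error message shows Hive doing with the unresolved path.
alter table my_db.clog add partition (create_dt = '${formatted_date}')
location 'logmessages.log.${current_date}';

-- Alternative (hypothetical): keep the directory variable but actually pass it from the
-- workflow, e.g. <param>clogDataOutputDir=/some/output/dir</param>, and then use
-- location '${clogDataOutputDir}/logmessages.log.${current_date}';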

0 Answers