How can I check whether a partition location exists in an Oozie workflow using a decision node? Example: /user/cloudera/year=2016/month=201609/day=20150912

A new data set lands in my HDFS location every day in the layout shown above, i.e. year=2016/month=201609/day=20150912.

A coordinator job gives me the date value:

<property>
    <name>today</name>
    <value>${coord:formatTime(coord:dateOffset(coord:dateTzOffset(coord:nominalTime(), "America/Los_Angeles"), -1, 'DAY'), 'yyyyMMdd')}</value>
</property>

In my workflow, how can I use a decision node to check whether the path year=2016/month=201609/day=20150912 exists?

Sai

3 Answers


You can use the HCatalog EL functions, which are part of the Oozie workflow EL functions.

The format of an HCatalog table partition URI is:

hcat://[metastore server]:[port]/[database name]/[table name]/[partkey1]=[value];[partkey2]=[value]

For example:

hcat://foo:8020/mydb/mytable/region=us;dt=20121212
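
As a rough sketch of how such a URI could be used in a decision node: the Oozie workflow specification lists an `hcat:exists(uri)` EL function among the HCatalog EL functions, and it is expected to return true when the partition is registered in the metastore. The node names `CheckPartition`, `processPartition` and `partitionMissing` are hypothetical, and the URI is simply the example above. This also assumes the Oozie server has HCatalog support configured.

```xml
<!-- Sketch only: node names are hypothetical; the URI is the example from this answer -->
<decision name="CheckPartition">
    <switch>
        <!-- Taken when the partition is registered in the HCatalog metastore -->
        <case to="processPartition">${hcat:exists('hcat://foo:8020/mydb/mytable/region=us;dt=20121212')}</case>
        <!-- Taken when no case evaluates to true -->
        <default to="partitionMissing"/>
    </switch>
</decision>
```

Note that this checks the metastore entry for the partition, not the HDFS directory itself.
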
YoungHobbit

It seems that this is the location you want to check:

/user/cloudera/year=${YEAR}/month=${YEAR}${MONTH}/day=${YEAR}${MONTH}${DAY}

Of course, you would adjust these with the right offset where required.
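
One place where `${YEAR}`, `${MONTH}` and `${DAY}` are resolved by Oozie is a coordinator dataset `uri-template`, so the path above could be expressed roughly as follows. This is only a sketch under that assumption: the dataset name `dailyPartition` and the event name `input` are hypothetical, and `nameNode` and `jobStart` are assumed to be job properties. With an input event like this, the coordinator waits for the directory to appear rather than branching in a decision node.

```xml
<!-- Sketch only: dataset and event names are hypothetical -->
<datasets>
    <dataset name="dailyPartition" frequency="${coord:days(1)}"
             initial-instance="${jobStart}" timezone="America/Los_Angeles">
        <uri-template>${nameNode}/user/cloudera/year=${YEAR}/month=${YEAR}${MONTH}/day=${YEAR}${MONTH}${DAY}</uri-template>
        <!-- An empty done-flag means the directory itself signals availability -->
        <done-flag></done-flag>
    </dataset>
</datasets>
<input-events>
    <data-in name="input" dataset="dailyPartition">
        <!-- -1 selects the previous daily instance, i.e. the "right offset" for yesterday -->
        <instance>${coord:current(-1)}</instance>
    </data-in>
</input-events>
```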

Dennis Jaheruddin

Thank you for your prompt responses, @YoungHobbit and @Dennis Jaheruddin.

I wanted to use a decision node to check whether the HDFS path exists, rather than an HCatalog URI. I found that the following coordinator job and workflow.xml solve the problem.

coordinate_job.xml

<coordinator-app name="testemailjob" frequency="15" start="${jobStart}" end="${jobEnd}" timezone="America/Los_Angeles" xmlns="uri:oozie:coordinator:0.2">
  <controls>
    <execution>FIFO</execution>
  </controls>
  <action>
    <workflow>
      <app-path>${test}</app-path>
      <configuration>
        <property>
          <name>year</name>
          <value>${coord:formatTime(coord:dateOffset(coord:dateTzOffset(coord:nominalTime(), "America/Los_Angeles"), -1, 'DAY'), 'yyyy')}</value>
        </property>
        <property>
          <name>month</name>
          <value>${coord:formatTime(coord:dateOffset(coord:dateTzOffset(coord:nominalTime(), "America/Los_Angeles"), -1, 'DAY'), 'yyyyMM')}</value>
        </property>
        <property>
          <name>yesterday</name>
          <value>${coord:formatTime(coord:dateOffset(coord:dateTzOffset(coord:nominalTime(), "America/Los_Angeles"), -1, 'DAY'), 'yyyyMMdd')}</value>
        </property>
        <property>
          <name>today</name>
          <value>${coord:formatTime(coord:dateTzOffset(coord:nominalTime(), "America/Los_Angeles"), 'yyyyMMdd')}</value>
        </property>
        <property>
          <name>oozie.use.system.libpath</name>
          <value>True</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>

My workflow.xml:

<workflow-app name= ......>
...........................
...............................
  <!-- The concat chain implies that the path property ends with "/year=" -->
  <decision name="CheckFile">
    <switch>
      <!-- Taken if today's partition directory exists -->
      <case to="nextOozieTask">
        ${fs:exists(concat(concat(concat(concat(concat(concat(nameNode, path),year),'/month='),month),'/day='),today))}
      </case>
      <!-- Otherwise taken if yesterday's partition directory exists -->
      <case to="nextOozieTask1">
        ${fs:exists(concat(concat(concat(concat(concat(concat(nameNode, path),year),'/month='),month),'/day='),yesterday))}
      </case>
      <!-- Neither directory exists -->
      <default to="MailActionFileMissing" />
    </switch>
  </decision>
....................
......................
</workflow-app>
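
The nested `concat` calls can probably be avoided by assembling the whole path in the coordinator and passing it to the workflow as one property. The following is a sketch, not something I have run against this job: the property name `yesterdayPath` is hypothetical, `nameNode` is assumed to be a job property, and the `/user/cloudera` base directory is taken from the question.

```xml
<!-- Coordinator side (sketch): build the full partition path once; yesterdayPath is a hypothetical name -->
<property>
  <name>yesterdayPath</name>
  <value>${nameNode}/user/cloudera/year=${coord:formatTime(coord:dateOffset(coord:dateTzOffset(coord:nominalTime(), "America/Los_Angeles"), -1, 'DAY'), 'yyyy')}/month=${coord:formatTime(coord:dateOffset(coord:dateTzOffset(coord:nominalTime(), "America/Los_Angeles"), -1, 'DAY'), 'yyyyMM')}/day=${coord:formatTime(coord:dateOffset(coord:dateTzOffset(coord:nominalTime(), "America/Los_Angeles"), -1, 'DAY'), 'yyyyMMdd')}</value>
</property>

<!-- Workflow side (sketch): the decision node only needs a single fs:exists call -->
<decision name="CheckFile">
  <switch>
    <case to="nextOozieTask1">${fs:exists(yesterdayPath)}</case>
    <default to="MailActionFileMissing" />
  </switch>
</decision>
```

The transition targets are the same ones used in the workflow above, so only the case expression changes.
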
Sai