I want to run PIG in local mode, which is very easy
pig -x local file.pig
My requirement is to run PIG in local mode from OOZIE? Is it possible as i think OOZIE will automatically launch map task first?
I want to run PIG in local mode, which is very easy
pig -x local file.pig
My requirement is to run PIG in local mode from OOZIE? Is it possible as i think OOZIE will automatically launch map task first?
It's possible. When a pig script is run by Oozie, it's run as a one-map map-reduce job, which only runs the pig script, which in turn runs other map-reduce jobs (when pig is run in mapred
mode).
It seems, that Pig action configuration doesn't allow running in local mode, but you can still run Pig script in local mode using shell action type. You only have to make sure, that your script, input and output data are in HDFS.
I don't think, we can run pig in local mode from oozie. Comment which Vishal wrote makes sense. In some cases, where there is lesser amount of data, Its better to go for pig in local mode. To run in local mode, you can run by writing a shell script and scheduling that in crontab.If you try this through oozie. Upto my knowledge It won't suit well , because Oozie is meant to run in HDFS.
If you want oozie to run on some data . It expects that data to be in HDFS (i.e distributed).And You must have the pig script as well in hdfs.I rembered seeing post from AlanGates where he mentioned PIG is designed to process data from/to HDFS and hive is for local to HDFS or HDFS to HDFS.