I have some Pig batch jobs in .pig files I'd love to automatically run on EMR once every hour or so. I found a tutorial for doing that here, but that requires using Amazon's GUI for every job I set up, which I'd really rather avoid. Is there a good way to do this using Whirr? Or the Ruby elastic-mapreduce client? I have all my files in S3, along with a couple of Pig jars with functions I need to use.
This question may have an [XY problem](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem) as it focuses on suggested solutions. – Dennis Jaheruddin Jun 06 '16 at 10:47
1 Answer
Though I don't know how to run Pig scripts with the tools that you mention, I know of two possible ways:
- To run files locally: you can use cron
- To run files on the cluster: you can use Oozie
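For the cron route, a minimal sketch would look like the crontab entry below. The script path and log location are placeholders, and it assumes Pig is installed on the machine where cron runs:

```shell
# Hypothetical crontab entry (add via `crontab -e`); paths are placeholders.
# Runs the Pig script in local mode at the top of every hour and appends
# both stdout and stderr to a log file.
0 * * * * /usr/bin/pig -x local -f /home/hadoop/jobs/report.pig >> /var/log/pig-hourly.log 2>&1
```

Note that this runs Pig locally rather than on EMR; for the cluster you would point cron at a script that submits a job instead.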
That being said, most tools with a GUI can also be controlled via the command line (though setup may be easier if you have the GUI available).
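For example, the Ruby elastic-mapreduce client mentioned in the question can submit a Pig step from a script, which cron can then schedule. A rough sketch, where the bucket names, script path, jar path, and instance settings are all placeholders (check the flags against your installed client version):

```shell
#!/bin/sh
# Sketch: launch an EMR job flow that runs a Pig script from S3 using
# Amazon's Ruby elastic-mapreduce client instead of the console GUI.
# All S3 paths and cluster sizes below are hypothetical examples.

PIG_SCRIPT="s3://my-bucket/scripts/report.pig"   # your .pig file in S3
EXTRA_JARS="s3://my-bucket/jars/udfs.jar"        # jar with your UDFs

CMD="elastic-mapreduce --create --name hourly-pig \
  --num-instances 3 --instance-type m1.small \
  --pig-script $PIG_SCRIPT \
  --args -p,JARS=$EXTRA_JARS"

# Echo the command instead of running it so the sketch is safe to execute;
# remove the echo (and configure credentials.json) to submit for real.
echo "$CMD"
```

Putting that script into an hourly crontab entry would give you the scheduled runs without touching the GUI.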

Dennis Jaheruddin