2

I have built and stored queries using Hive and Pig that I would like to schedule to run on a weekly basis. The scripts create S3 files and update DynamoDB tables. What can I use to create an Amazon EMR cluster that automatically runs these scripts on a schedule?

I was thinking of AWS Data Pipeline, but it seems to require creating data nodes, and I do not think that would be necessary for my purposes.

jwiora

1 Answer

0

You are not required to specify data nodes if you disable staging on HiveActivity.

stage = false

Please see the example in this post.
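
As a rough illustration, here is a minimal sketch of what a weekly pipeline without data nodes might look like, built as a Python dict and dumped to JSON. The object ids (`WeeklySchedule`, `ScriptCluster`, `RunHiveScript`, `RunPigScript`) and the S3 paths are hypothetical placeholders. The field names (`scriptUri`, `stage`, `runsOn`, `period`, `startAt`) are my understanding of the Data Pipeline definition syntax, so check them against the current documentation. Roles, logging, and cluster sizing are left out.

```python
import json


def build_pipeline_definition(hive_script_uri, pig_script_uri):
    """Build a pipeline definition with no data nodes.

    Both activities set stage to "false", so no input/output
    data nodes are declared. All ids are hypothetical.
    """
    return {
        "objects": [
            {
                # Defaults inherited by the other objects.
                "id": "Default",
                "name": "Default",
                "scheduleType": "cron",
                "schedule": {"ref": "WeeklySchedule"},
            },
            {
                # Assumed period syntax: "N weeks".
                "id": "WeeklySchedule",
                "type": "Schedule",
                "period": "1 weeks",
                "startAt": "FIRST_ACTIVATION_DATE_TIME",
            },
            {
                # EMR cluster that Data Pipeline launches for each run.
                "id": "ScriptCluster",
                "type": "EmrCluster",
            },
            {
                "id": "RunHiveScript",
                "type": "HiveActivity",
                "scriptUri": hive_script_uri,
                "stage": "false",  # staging disabled: no data nodes
                "runsOn": {"ref": "ScriptCluster"},
            },
            {
                "id": "RunPigScript",
                "type": "PigActivity",
                "scriptUri": pig_script_uri,
                "stage": "false",
                "runsOn": {"ref": "ScriptCluster"},
            },
        ]
    }


if __name__ == "__main__":
    # Placeholder S3 locations; replace with your own scripts.
    definition = build_pipeline_definition(
        "s3://example-bucket/scripts/weekly.q",
        "s3://example-bucket/scripts/weekly.pig",
    )
    print(json.dumps(definition, indent=2))
```

With staging off, the scripts themselves are responsible for reading and writing S3 and DynamoDB, which matches what you described.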

Community
panther