I've run Hive on Amazon Elastic MapReduce in interactive mode:
./elastic-mapreduce --create --hive-interactive
and in script mode:
./elastic-mapreduce --create --hive-script --arg s3://mybucket/myfile.q
I'd like an application on my own server (preferably in PHP, R, or Python) to be able to spin up an Elastic MapReduce cluster, run several Hive commands, and get their output in a parsable form.
I know that spinning up a cluster can take some time, so my application might have to do that in a separate step and wait for the cluster to become ready. But is there any way to do something like the following hypothetical workflow:
- create Hive table customer_orders
- run Hive query "SELECT dt, count(*) FROM customer_orders GROUP BY dt"
- wait for result
- parse result in PHP
- run Hive query "SELECT MAX(id) FROM customer_orders"
- wait for result
- parse result in PHP ...
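To make the "parse result" steps concrete, here is a minimal sketch of what I have in mind, written in Python rather than PHP. It assumes the query result reaches my application as plain tab-separated text, one row per line, which is what the Hive CLI prints by default. The function name parse_hive_output and the sample rows are made up for illustration; how to actually get that text back from the cluster is the part I'm asking about.

```python
def parse_hive_output(text):
    """Split tab-separated query output into a list of rows.

    Assumption: one result row per line, columns separated by tabs.
    Every value comes back as a string; blank lines are skipped.
    """
    rows = []
    for line in text.splitlines():
        if line.strip():
            rows.append(line.split("\t"))
    return rows


# Made-up sample output for:
#   SELECT dt, count(*) FROM customer_orders GROUP BY dt
sample = "2012-01-01\t42\n2012-01-02\t17\n"

# Convert the count column to int so the application can work with numbers.
counts_by_date = {dt: int(n) for dt, n in parse_hive_output(sample)}
```

The second query (SELECT MAX(id) ...) would go through the same helper and give back a single row with a single column.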
Does anyone have any recommendations on how I might do this?