I'm trying to run a Hive script on AWS EMR using the PHP SDK. How can I pass the script parameters (such as input, output, and the dates to work on)?
Thanks
If you are struggling with this as well...
Sample code for passing variables to a Hive script can be found in the following Amazon forum thread.
I've done this with the Java SDK. With the PHP SDK, what you essentially need to do is pass in the parameters you want via the add_job_flow_steps function.
You need to add the parameters to the StepConfig (for the script you are running) in the "Args" array when calling the function.
Args - string|array - Optional - A list of command line arguments passed to the JAR file’s main function when executed. Pass a string for a single value, or an indexed array for multiple values.
The format of the arguments is a bit confusing: you need an array of the form
("-d","yourVariable=itsValue","-d","anotherVariable=AnotherValue")
So it should end up looking a bit like this:
// $emr is assumed to be an AmazonEMR client instance
$emr->add_job_flow_steps('j-19430859jg9', array(new CFStepConfig(array(
    'Name' => 'Run a hive script',
    'HadoopJarStep' => array(
        'Jar' => CFHadoopStep::run_hive_script(),
        'Args' => array("-d", "yourVariable=itsValue", "-d", "anotherVariable=AnotherValue")
    )
))));
I don't know if the syntax is quite right, I haven't tried it.
At least this is how it works in Java; for PHP you may need an associative array instead, so I would try a variety of formats.
I expect the "-d" prefix is there so that these parameters are not confused with other Hadoop/Hive configuration parameters.
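If you keep your parameters in an associative array, the flat "-d", "name=value" list described above can be built with a small helper. This is only a sketch: build_hive_args is a hypothetical function of my own, not part of the AWS SDK, and the variable names and S3 paths are made-up placeholders.

```php
<?php
// Hypothetical helper (not part of the AWS SDK): turns an associative array
// of Hive variable names and values into the flat list
// "-d", "name=value", "-d", "name=value", ...
function build_hive_args(array $variables)
{
    $args = array();
    foreach ($variables as $name => $value) {
        $args[] = '-d';
        $args[] = $name . '=' . $value;
    }
    return $args;
}

// Placeholder names and paths, for illustration only.
$args = build_hive_args(array(
    'INPUT'  => 's3://my-bucket/input',
    'OUTPUT' => 's3://my-bucket/output',
    'DATE'   => '2012-01-31',
));
```

The resulting $args array can then be used as the 'Args' value of the step, assuming the flat format turns out to be what the PHP SDK expects.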
You can then access these variables in the script much as you would in bash, using ${yourVariable}.
SELECT * FROM TABLE WHERE column='${yourVariable}';