I'm programmatically provisioning an EMR cluster using the AWS Java SDK, and I'm trying to pass arguments to the setup-impala bootstrap script. The code I have looks like this:
...
List<BootstrapActionConfig> bootstrapActions = new ArrayList<BootstrapActionConfig>();
// --base-path, s3://elasticmapreduce, --impala-version, 1.2.1
BootstrapActionConfig bsInstallImpala = new BootstrapActionConfig();
bsInstallImpala.setName( "Install Impala" );
ScriptBootstrapActionConfig scriptActionInstallImpala = new ScriptBootstrapActionConfig();
scriptActionInstallImpala.setPath("s3://elasticmapreduce/libs/impala/setup-impala");
List<String> impalaArgs = new ArrayList<String>();
impalaArgs.add( "--base-path, s3://elasticmapreduce" );
impalaArgs.add( "--impala-version, 1.2.1" );
scriptActionInstallImpala.setArgs(impalaArgs);
bsInstallImpala.setScriptBootstrapAction(scriptActionInstallImpala);
bootstrapActions.add( bsInstallImpala );
...
RunJobFlowRequest request = new RunJobFlowRequest()
    .withName("OneButton Test")
    .withSteps(enabledebugging, installHive, installPig)
    .withLogUri("s3://somelogs/")
    .withAmiVersion("3.0.4")
    .withBootstrapActions(bootstrapActions)
    .withInstances(new JobFlowInstancesConfig()
        .withInstanceGroups(instanceGroups)
        .withEc2KeyName("redacted")
        .withHadoopVersion("2.2.0")
        .withKeepJobFlowAliveWhenNoSteps(true)
        .withTerminationProtected(true));
However, when I send this request, the setup-impala script fails with the following error:
/usr/lib/ruby/1.8/optparse.rb:1450:in `complete': invalid option: --base-path, s3://elasticmapreduce (OptionParser::InvalidOption)
from /usr/lib/ruby/1.8/optparse.rb:1448:in `catch'
from /usr/lib/ruby/1.8/optparse.rb:1448:in `complete'
from /usr/lib/ruby/1.8/optparse.rb:1261:in `parse_in_order'
from /usr/lib/ruby/1.8/optparse.rb:1254:in `catch'
from /usr/lib/ruby/1.8/optparse.rb:1254:in `parse_in_order'
from /usr/lib/ruby/1.8/optparse.rb:1248:in `order!'
from /usr/lib/ruby/1.8/optparse.rb:1339:in `permute!'
from /usr/lib/ruby/1.8/optparse.rb:1360:in `parse!'
from /mnt/var/lib/bootstrap-actions/2/setup-impala:576:in `parse_arguments'
from /mnt/var/lib/bootstrap-actions/2/setup-impala:592:in `initialize'
from /mnt/var/lib/bootstrap-actions/2/setup-impala:902:in `new'
from /mnt/var/lib/bootstrap-actions/2/setup-impala:902
It looks like a problem with the syntax of the bootstrap action's arguments, but I've tried every permutation that seems reasonable and I always get this error (or something very close to it). What confuses me is that, with the configuration listed above, the arguments shown in the web console's cluster details look identical to those of a cluster that I provisioned through the web console itself.
Any thoughts on what is going wrong here?
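One detail in the trace may be relevant: optparse reports the whole string `--base-path, s3://elasticmapreduce` as a single invalid option, which suggests each list element reaches the script as one argument. The sketch below is plain Java with no SDK calls, and the class and method names are hypothetical. It contrasts the list built above with a list that has one element per token, and shows that both render identically once joined with ", ". That the web console displays arguments joined this way, and that setup-impala accepts the split form, are assumptions I have not confirmed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, used only to compare the two argument layouts.
final class BootstrapArgsSketch {

    // The list as built in the code above: flag and value share one element.
    static List<String> fusedArgs() {
        List<String> args = new ArrayList<String>();
        args.add("--base-path, s3://elasticmapreduce");
        args.add("--impala-version, 1.2.1");
        return args;
    }

    // Alternative layout: every flag and every value is its own element.
    static List<String> splitArgs() {
        return new ArrayList<String>(Arrays.asList(
                "--base-path", "s3://elasticmapreduce",
                "--impala-version", "1.2.1"));
    }

    // Joins the elements with ", " (assumed to mimic a comma-separated display).
    static String display(List<String> args) {
        return String.join(", ", args);
    }

    public static void main(String[] unused) {
        System.out.println(fusedArgs().size() + " elements: " + display(fusedArgs()));
        System.out.println(splitArgs().size() + " elements: " + display(splitArgs()));
    }
}
```

If the split layout is what the script expects, the result of `splitArgs()` could be passed to `scriptActionInstallImpala.setArgs(...)` in place of `impalaArgs`.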