I need to update /etc/hosts for all instances in my EMR cluster (EMR AMI 4.3).

The whole script is nothing more than:

#!/bin/bash
echo -e 'ip1 uri1' >> /etc/hosts
echo -e 'ip2 uri2' >> /etc/hosts
...

This script needs to run with root privileges (via sudo) or it fails.

From here: https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-bootstrap.html#bootstrapUses

Bootstrap actions execute as the Hadoop user by default. You can execute a bootstrap action with root privileges by using sudo.

Great news... but I can't figure out how to do this, and I can't find an example.

I've tried several things, including:

  • running as hadoop and adding 'sudo' to each of the 'echo' statements in the script
  • using a shell script to copy the script above ('echo' statements with no 'sudo') to the node and chmod it, then running the local copy with a run-if bootstrap action that calls 1=1 sudo bash /home/hadoop/myDir/myScript.sh
  • hard-coding the whole script as a one-liner in a run-if bootstrap action

I consistently get:

On the master instance (i-xxx), bootstrap action 2 returned a non-zero return code

If I check the logs for the "Setup hadoop debugging" step, there's nothing there.

From here: https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-overview.html#emr-overview-cluster-lifecycle

Summary of EMR setup (in order):

  1. provisions EC2 instances
  2. runs bootstrap actions
  3. installs native applications such as Hadoop, Spark, etc.

So it seems there's some risk that, since I'm working as user hadoop before Hadoop is installed, I could be breaking something there, but I can't imagine what.

I think it must be that my script isn't running with root privileges, so it's failing to update /etc/hosts.

My question: how can I use bootstrap actions (or something else) on EMR to run a simple shell script as root, specifically to update /etc/hosts?

kmh
  • I've not had problems using sudo within EMR bootstrap actions. If you launch a cluster, then SSH into one of the nodes as hadoop, can you sudo? Can you try with a trivial BA that does cat /etc/hosts and then sudo cat /etc/hosts with a sprinkling of echo "I'm now about to do X" to debug progress. – jarmod Aug 29 '18 at 19:34
  • If I ssh in, it's no problem to run commands with sudo. I'll run some trivial examples like you suggest and see if I can run them without the cluster terminating due to a bootstrap error, so I can view the results. One question: you say "I've not had problems using sudo w/in EMR bootstrap actions." Do you mean that you've used 'sudo' in the shell script, and the script runs OK by default (as user hadoop)? – kmh Aug 29 '18 at 20:30
  • Correct, shell scripts run as BAs where the shell scripts invoke sudo for certain commands. – jarmod Aug 29 '18 at 20:42
  • Did what you suggested. `cat /etc/hosts`, `sudo cat /etc/hosts` and the echoes all ran fine, and the output was in /mnt/var/log/bootstrap-actions/1 for both master and executors. However, on an existing cluster, as hadoop, I tried running a shell script with `sudo echo -e 'ip1 uri1' >> /etc/hosts`; it runs OK when the script is run with sudo, but fails when run as hadoop with the error message `/etc/hosts: Permission denied` – kmh Aug 29 '18 at 21:12
  • Try: sudo sh -c 'echo -e "ip1 uri1" >> /etc/hosts' – jarmod Aug 29 '18 at 21:30
  • In addition to jarmod's answer below, if you're running a command with a heredoc, you can follow this answer to solve your problem: https://stackoverflow.com/a/4412091/1994092 – Steve Nov 19 '20 at 15:40

1 Answer

I've not had problems using sudo from within a shell script run as an EMR bootstrap action, so it should work. You can test that it works with a trivial script that just runs "sudo ls /root".

Your script is trying to append to /etc/hosts by redirecting stdout with:

sudo echo -e 'ip1 uri1' >> /etc/hosts

The problem here is that while echo is run with sudo, the redirection (>>) is not. The redirection is set up by the calling shell, which runs as the underlying hadoop user, and that user does not have permission to write to /etc/hosts. The fix is:

sudo sh -c 'echo -e "ip1 uri1" >> /etc/hosts'

This runs the entire command, including the stdout redirection, in a shell with sudo.
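
An alternative that avoids the nested quoting is to pipe the line into tee running under sudo, so the elevated process is the one that opens /etc/hosts. What follows is only a sketch, not taken from the original script: the function name add_host_entry and the HOSTS_FILE and SUDO variables are hypothetical, added so the script is safe to re-run and can be tried against a scratch file without root.

```shell
#!/bin/bash
# Sketch of a bootstrap-action helper that appends host entries as root.
# HOSTS_FILE and SUDO are hypothetical overrides for trying this out on a
# scratch file; on a cluster node leave them unset to get the defaults.
HOSTS_FILE="${HOSTS_FILE:-/etc/hosts}"
SUDO="${SUDO-sudo}"

# Append "ip name" to HOSTS_FILE unless that exact line is already present.
add_host_entry() {
    local entry="$1 $2"
    # /etc/hosts is world-readable, so the duplicate check needs no sudo.
    if grep -qxF -- "$entry" "$HOSTS_FILE" 2>/dev/null; then
        return 0
    fi
    # tee runs under sudo, so the elevated process opens the file and
    # appends to it; the copy tee writes to stdout is discarded.
    printf '%s\n' "$entry" | $SUDO tee -a "$HOSTS_FILE" > /dev/null
}
```

In the bootstrap script you would then call add_host_entry ip1 uri1 once per entry, keeping the ip1/uri1 placeholders from the question. The duplicate check is an extra I added, not something the question asked for.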

jarmod