I'm running an AWS EMR Pig job using script-runner.jar as described here: http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-hadoop-script.html
Now I want to hook up Netflix's Lipstick to monitor my scripts. I set up the server, but I can't quite figure out how to do the last step of the Getting Started wiki (https://github.com/Netflix/Lipstick/wiki/Getting-Started):
hadoop jar lipstick-console-[version].jar -Dlipstick.server.url=http://$LIPSTICK_URL
Should I substitute script-runner.jar with this?
Also, after following the build process in the wiki, I ended up with three different console jars:
lipstick-console-0.6-SNAPSHOT.jar
lipstick-console-0.6-SNAPSHOT-withHadoop.jar
lipstick-console-0.6-SNAPSHOT-withPig.jar
What is the purpose of the latter two jars?
UPDATE:
I think I'm making progress, but it still does not seem to work.
I set the pig.notification.listener parameter, as described here, and the Lipstick server URL. There is more than one way to do this in EMR. Since I am using the Ruby API, I had to specify a step:
hadoop_jar_step:
  jar: 's3://elasticmapreduce/libs/script-runner/script-runner.jar'
  properties:
    - pig.notification.listener.arg: com.netflix.lipstick.listeners.LipstickPPNL
    - lipstick.server.url: http://pig_server_url
Next, I added
lipstick-console-0.6-SNAPSHOT.jar
to the Hadoop classpath. For this, I had to create a bootstrap action as follows:
bootstrap_actions:
  - name: copy_lipstick_jar
    script_bootstrap_action:
      path: #s3 path to bootstrap_lipstick.sh
where the contents of bootstrap_lipstick.sh are
#!/bin/bash
hadoop fs -copyToLocal s3n://wp-data-west-2/load_code/java/lipstick-console-0.6-SNAPSHOT.jar /home/hadoop/lib/
The bootstrap action copies the lipstick jar to cluster nodes, and /home/hadoop/lib/
is already in hadoop classpath (EMR takes care of that).
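To rule out the copy step itself, a check along these lines could be run on a cluster node once the bootstrap action has finished. This is only a minimal sketch: the helper name jar_present is hypothetical, and it only confirms that a matching file exists in the directory, not that Pig actually loads the listener class from it.

```shell
#!/bin/bash
# Hypothetical helper (not part of EMR or Lipstick): succeeds if at least one
# regular file matching a filename glob exists directly inside a directory.
#   $1 = directory to inspect, e.g. /home/hadoop/lib
#   $2 = filename glob, e.g. 'lipstick-console-*.jar'
jar_present() {
  local dir="$1" pattern="$2"
  # find prints one line per match; grep -q . succeeds only if something was printed.
  find "$dir" -maxdepth 1 -type f -name "$pattern" 2>/dev/null | grep -q .
}

# Report whether the bootstrap action left the jar where it was expected.
if jar_present /home/hadoop/lib 'lipstick-console-*.jar'; then
  echo "lipstick jar found in /home/hadoop/lib"
else
  echo "lipstick jar missing from /home/hadoop/lib"
fi
```

If this reports the jar as missing, the problem would be in the bootstrap action rather than in the step properties.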
It still does not work, but I think I am missing something really minor ... Any ideas appreciated.
Thanks!