I want to run the following command:
hadoop fs -ls hdfs:///logs/ | grep -oh "/[^/]*.gz" | grep -oh "[^/]*.gz" | hadoop fs -put - hdfs:///unzip_input/input
It works when I run it from a shell after SSHing into the master node, but it fails when I pass it to ssh directly, as follows:
ssh -i /home/USER/keypair.pem hadoop@ec2-XXXX.compute-1.amazonaws.com hadoop fs -ls hdfs:///logs/ | grep -oh "/[^/]*.gz" | grep -oh "[^/]*.gz" | hadoop fs -put - hdfs:///unzip_input/input
It gives the error:
zsh: command not found: hadoop
But if I remove the last stage of the pipeline (the hadoop fs -put), the command succeeds:
ssh -i /home/USER/keypair.pem hadoop@ec2-XXXX.compute-1.amazonaws.com hadoop fs -ls hdfs:///logs/ | grep -oh "/[^/]*.gz" | grep -oh "[^/]*.gz"
From some searching I've found that it may be caused by JAVA_HOME not being set, but it is set correctly in ~/.bashrc on the master node.
The Hadoop cluster is an Amazon Elastic MapReduce cluster.
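
A likely explanation, judging from the "zsh:" prefix on the error, is that the unquoted pipes are parsed by the local shell, so only the first stage (hadoop fs -ls) is sent over ssh and the final hadoop fs -put is looked up on the local machine. The sketch below illustrates that mechanism without needing a cluster. fake_ssh and WHERE are hypothetical names made up for this demonstration: fake_ssh only mimics how ssh joins its arguments and hands them to another shell, and it is not a real tool.

```shell
# Stand-in for ssh (hypothetical helper): like ssh, it joins its arguments
# with spaces and hands the resulting string to another shell.
# WHERE marks which side a pipeline stage runs on.
fake_ssh() { WHERE=remote sh -c "$*"; }
export WHERE=local

# Unquoted pipe: the calling shell splits the line at '|', so only
# `echo data` reaches fake_ssh and the second stage runs on the calling side.
unquoted=$(fake_ssh echo data | { cat >/dev/null; echo "ran on $WHERE"; })

# Quoted: the whole pipeline is a single argument, so both stages
# run inside the shell that fake_ssh starts.
quoted=$(fake_ssh 'echo data | { cat >/dev/null; echo "ran on $WHERE"; }')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

# Applied to the command in question (untested sketch, same placeholders):
#   ssh -i /home/USER/keypair.pem hadoop@ec2-XXXX.compute-1.amazonaws.com \
#     'hadoop fs -ls hdfs:///logs/ | grep -oh "/[^/]*.gz" | grep -oh "[^/]*.gz" | hadoop fs -put - hdfs:///unzip_input/input'
```

If that is what is happening, it would also explain why dropping the last stage works: grep exists on the local machine, so the locally run stages succeed, while hadoop does not.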