Why is R not connecting to Hadoop?

I am using R to connect to HDFS with the 'rhdfs' package. The 'rJava' package is installed and the rhdfs package is loaded.

The HADOOP_CMD environment variable is set in R using:

Sys.setenv(HADOOP_CMD='/usr/local/hadoop/bin')

But when the hdfs.init() function is called, the following error message is generated:

sh: 1: /usr/local/hadoop/bin: Permission denied
Error in .jnew("org/apache/hadoop/conf/Configuration") : 
java.lang.ClassNotFoundException
In addition: Warning message:
running command '/usr/local/hadoop/bin classpath' had status 126 

Also, the 'rmr2' library was loaded, and the following code was run:

ints = to.dfs(1:100)

which generated the message given below:

sh: 1: /usr/local/hadoop/bin: Permission denied

The R-Hadoop packages are accessible only to the 'root' user and not to 'hduser' (the Hadoop user), since they were installed while R was being run as 'root'.

User456898

2 Answers

There are only two likely reasons for this type of problem:

1) A wrong path. 2) No privileges/permissions on that jar. Beyond that, also set the other system paths, such as the ones given below.

Sys.setenv(HADOOP_HOME="/home/hadoop/path")

Sys.setenv(HADOOP_CMD="/home/hadoop/path/bin/hadoop")

Sys.setenv(HADOOP_STREAMING="/home/hadoop/path/streaming-jar-file.jar")

Sys.setenv(JAVA_HOME="/home/hadoop/java/path")

Then load library(rmr2) and library(rhdfs); the error should no longer occur.

Your problem, however, is a permission problem. As root, grant yourself the necessary permissions (755) and then run that jar file; the error should no longer be displayed.
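
To tell the two causes above apart before changing any permissions, a minimal sketch in base R could classify the value of HADOOP_CMD. The helper name diagnose_hadoop_cmd is hypothetical and is not part of rhdfs or rmr2; it only assumes a Unix-like system where execute permission is meaningful.

```r
# Hypothetical helper (not part of rhdfs or rmr2): classify a candidate
# HADOOP_CMD value using base R only.
diagnose_hadoop_cmd <- function(path) {
  if (!file.exists(path)) {
    return("missing")          # cause 1: the path does not exist
  }
  if (isTRUE(file.info(path)$isdir)) {
    return("directory")        # cause 1: a directory, not the hadoop launcher
  }
  if (file.access(path, mode = 1) != 0) {
    return("not executable")   # cause 2: current user lacks execute permission
  }
  "ok"
}

# Inspect whatever the current R session has set.
diagnose_hadoop_cmd(Sys.getenv("HADOOP_CMD"))
```

A result of "directory" points to a wrong path rather than to missing permissions, because the shell reports "Permission denied" with status 126 when asked to execute a directory.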

DatamineR
Venu A Positive

Try it like this:

Sys.setenv(HADOOP_CMD='/usr/local/hadoop/bin/hadoop')

Sys.setenv(JAVA_HOME='/usr/lib/jvm/java-6-openjdk-amd64')

library(rhdfs)

hdfs.init()

Please give the correct HADOOP_CMD path: it must point to the hadoop executable itself (ending in /bin/hadoop), not to the bin directory.
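
As a rough sketch of the same idea, the helper below accepts either the bin directory or the launcher and returns the path that HADOOP_CMD should hold. The name resolve_hadoop_cmd is hypothetical (it is not an rhdfs function), and the sketch assumes the launcher is a file named hadoop inside the bin directory, as in the paths above.

```r
# Hypothetical helper (not part of rhdfs): normalise a Hadoop bin directory
# or launcher path into the value expected in HADOOP_CMD.
resolve_hadoop_cmd <- function(path) {
  # If a directory was given, assume the launcher is the file "hadoop" in it.
  if (isTRUE(file.info(path)$isdir)) {
    path <- file.path(path, "hadoop")
  }
  if (!file.exists(path)) {
    stop("no hadoop launcher found at ", path)
  }
  path
}

hadoop_bin <- "/usr/local/hadoop/bin"   # directory used in the question
if (file.exists(hadoop_bin)) {
  # Set the variable before loading rhdfs and calling hdfs.init().
  Sys.setenv(HADOOP_CMD = resolve_hadoop_cmd(hadoop_bin))
}
```

Failing early with stop() keeps a wrong path from surfacing later as the less obvious ClassNotFoundException raised by hdfs.init().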
Sravan K Reddy
  • When `library(rmr2)` is loaded, it gives a warning: `Please review your hadoop settings. See help(hadoop.settings) Warning message: S3 methods ‘gorder.default’, ‘gorder.factor’, ‘gorder.data.frame’, ‘gorder.matrix’, ‘gorder.raw’ were declared in NAMESPACE but not found`. Could you tell me why? – User456898 Apr 16 '15 at 17:34