
I recently installed CDH 5.1.0 along with R 3.1.*, and I got rmr2, rJava, and rhdfs all installed properly, along with the required packages and environment variables. After some trouble installing rhdfs, I added this to my /usr/lib/R/etc/Renviron.site file:

HADOOP_HOME="usr/lib/hadoop"
HADOOP_CMD="usr/bin/hadoop"
HADOOP_STREAMING="usr/lib/hadoop-mapreduce/hadoop-streaming-2.3.0-cdh5.1.0.jar"

Then I started R and ran the following code:

>library(rmr2)
 loading required packages ...
>library(rJava)
>library(rhdfs)

HADOOP_CMD=usr/bin/hadoop

be sure to run hdfs.init()
>hdfs.init()
sh: 1: usr/bin/hadoop: not found
Error in system(command, intern = TRUE) : error in running command

I have seen similar problems involving the Java class path, but I haven't found this specific problem anywhere else on the internet. Any help would be much appreciated.

user306603

1 Answer


I had the same issue on HDP 2.1. Judging from pages by MapR and RevR on GitHub, it seems that the LD_LIBRARY_PATH variable needs to point to the rJava.so library.

HADOOP_STREAMING="/usr/lib/hadoop-mapreduce/hadoop-streaming-2.2.0.2.0.6.0-76.jar"
HADOOP_CONF="/etc/hadoop/conf"
LD_LIBRARY_PATH="/usr/lib64/R/library/rJava/libs/rJava.so"
HADOOP_COMMON_LIB_NATIVE_DIR="/usr/lib/hadoop/lib/native/"
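
The same kind of settings can also be applied for a single session from inside R, before rhdfs is loaded, instead of editing Renviron.site. This is only a sketch: the values below are the CDH paths from the question with a leading slash added, and whether those absolute locations exist on a given machine is an assumption to check against the actual install.

```r
# Set the variables for this R session only.
# Assumption: these are the question's CDH paths made absolute;
# adjust them to wherever hadoop and the streaming jar really live.
Sys.setenv(
  HADOOP_CMD       = "/usr/bin/hadoop",
  HADOOP_STREAMING = "/usr/lib/hadoop-mapreduce/hadoop-streaming-2.3.0-cdh5.1.0.jar"
)

# Confirm what R will hand to the packages.
print(Sys.getenv(c("HADOOP_CMD", "HADOOP_STREAMING")))

# Load rhdfs only after the variables are set, for example:
# library(rhdfs)
# hdfs.init()
```

Settings made this way last only until the R session ends, so Renviron.site is still the place for a permanent change.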

Now when running rhdfs in R I get the following warnings, but it seems to work:

> hdfs.init()
14/11/12 09:20:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/11/12 09:20:43 WARN hdfs.BlockReaderLocal: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.

> hdfs.ls(".")
  permission owner group    size          modtime                  file
1 drwx------  root  root       0 2014-11-07 09:50   /user/root/.staging
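
It may also be worth confirming that each configured path is absolute and actually exists before calling hdfs.init(). The error in the question (`sh: 1: usr/bin/hadoop: not found`) is what a shell reports when it is given a relative path that does not resolve from the current directory, and the values there have no leading slash. A minimal sketch in base R follows; `check_hadoop_paths` is a hypothetical helper written for this illustration, not part of rhdfs or rmr2.

```r
# Hypothetical helper (not part of rhdfs or rmr2): for each environment
# variable, report its value, whether the value starts with "/", and
# whether that path exists on disk.
check_hadoop_paths <- function(vars = c("HADOOP_CMD", "HADOOP_STREAMING")) {
  values <- unname(Sys.getenv(vars, unset = NA))
  is_set <- !is.na(values)
  data.frame(
    variable = vars,
    value    = values,
    # Unset variables are reported as not absolute and not existing.
    absolute = is_set & substr(values, 1, 1) == "/",
    exists   = is_set & file.exists(ifelse(is_set, values, "")),
    stringsAsFactors = FALSE
  )
}

print(check_hadoop_paths())
```

With the settings from the question, the `absolute` column would be FALSE for both variables, which points at the missing leading slash rather than at a class path problem.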
masegaloeh