
I'm trying to configure RHive in a CDH4 environment. When I load the 'RHive' package in R, I get the error shown below. My guess is that HIVE_HOME and HADOOP_HOME point to the wrong directories. If so, what should they be? If that's not the reason, what else is wrong?

Any help would be much appreciated.

Thanks.

> Sys.setenv(HIVE_HOME="/etc/hive")
> Sys.setenv(HADOOP_HOME="/etc/hadoop")
> library(RHive)
Loading required package: rJava
Loading required package: Rserve
This is RHive 0.0-7. For overview type '?RHive'.
HIVE_HOME=/etc/hive
[1] "there is no slaves file of HADOOP. so you should pass hosts argument when you call rhive.connect()."
Error : .onLoad failed in loadNamespace() for 'RHive', details:
  call: .jnew("org/apache/hadoop/conf/Configuration")
  error: java.lang.ClassNotFoundException
In addition: Warning message:
In file(file, "rt") :
  cannot open file '/etc/hadoop/conf/slaves': No such file or directory
Error: package/namespace load failed for 'RHive'
TH Japan

2 Answers

I had the same problems but solved them. The downside is that I have to keep track of a bunch of symlinks.

I struggled to install RHive_0.0-7.tar.gz on CDH 4.7.x and kept getting:
Warning in file(file, "rt") :
cannot open file '/etc/hadoop/conf/slaves': No such file or directory
[1] "there is no slaves file of HADOOP. so you should pass hosts argument when you call rhive.connect()."

In /etc/hadoop/conf I added the following symlink:
ln -s /opt/cloudera/parcels/CDH-4.4.0-1.cdh4.4.0.p0.39/etc/hadoop/conf.empty/slaves slaves
(Why Cloudera CDH 4.7 installs into /opt without creating the proper symlinks from /usr/lib is puzzling.)
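
The symlink step above can be sketched as a small guard that only creates the link when the source exists and no slaves file is already in place. The function name ensure_slaves_link and the scratch directory layout are hypothetical and exist only for illustration; on a real system the arguments would be the parcel's conf.empty/slaves file and /etc/hadoop/conf.

```shell
#!/bin/sh
# Hypothetical helper: link a slaves file into a Hadoop conf directory,
# but only if the source exists and the target is not already there.
ensure_slaves_link() {
    src="$1"       # existing slaves file, e.g. under the parcel's conf.empty
    conf_dir="$2"  # directory where the slaves file is expected
    if [ ! -f "$src" ]; then
        echo "missing source: $src" >&2
        return 1
    fi
    if [ -e "$conf_dir/slaves" ]; then
        echo "already present: $conf_dir/slaves"
        return 0
    fi
    ln -s "$src" "$conf_dir/slaves" && echo "linked: $conf_dir/slaves"
}

# Demo against a scratch directory so nothing under /etc is touched.
demo=$(mktemp -d)
mkdir -p "$demo/parcel/conf.empty" "$demo/conf"
echo "localhost" > "$demo/parcel/conf.empty/slaves"
ensure_slaves_link "$demo/parcel/conf.empty/slaves" "$demo/conf"
```

Running the function a second time with the same arguments leaves the existing link alone, which makes it safe to keep in a setup script.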

I also defined the following in /usr/lib64/R/etc/Renviron:
## set hive paths
HIVE_HOME='/opt/cloudera/parcels/CDH-4.4.0-1.cdh4.4.0.p0.39/lib/hive'
HADOOP_HOME='/opt/cloudera/parcels/CDH-4.4.0-1.cdh4.4.0.p0.39/lib/hadoop'
LD_LIBRARY_PATH='/opt/cloudera/parcels/CDH-4.4.0-1.cdh4.4.0.p0.39/lib/hadoop'
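
A quick sanity check before putting a path into Renviron is to confirm that the candidate HADOOP_HOME actually holds jar files. The ClassNotFoundException in the question is probably what you get when HADOOP_HOME points at a configuration-only directory such as /etc/hadoop, though that is an inference and not something confirmed here. The function name has_jars and the scratch directories below are hypothetical stand-ins.

```shell
#!/bin/sh
# Hypothetical check: succeed if the directory has at least one *.jar
# file directly inside it.
has_jars() {
    for f in "$1"/*.jar; do
        [ -f "$f" ] && return 0
    done
    return 1
}

# Demo: one directory with only configuration, one with a jar.
demo=$(mktemp -d)
mkdir -p "$demo/etc-hadoop/conf" "$demo/lib-hadoop"
touch "$demo/lib-hadoop/hadoop-common.jar"   # empty stand-in, not a real jar

for candidate in "$demo/etc-hadoop" "$demo/lib-hadoop"; do
    if has_jars "$candidate"; then
        echo "jars found: $candidate"
    else
        echo "no jars:    $candidate"
    fi
done
```

On a real installation the candidates would be paths such as /etc/hadoop and the parcel's lib/hadoop directory.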

At a shell prompt I ran:
R CMD INSTALL RHive_0.0-7.tar.gz
Installation Happiness!!

++++++
Inside R-Studio (server)

>
> library(RHive)
Loading required package: rJava
Loading required package: Rserve
This is RHive 0.0-7. For overview type ‘?RHive’.
HIVE_HOME=/opt/cloudera/parcels/CDH-4.4.0-1.cdh4.4.0.p0.39/lib/hive
call rhive.init() because HIVE_HOME is set.
rhive.init()
>
+++++++

You should set HADOOP_CONF_DIR separately. Try export HADOOP_CONF_DIR=/etc/hadoop/conf/conf.pseudo (note that there is no $ before the variable name when assigning it).

The conf.pseudo directory contains the slaves file.
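
A minimal sketch of that idea: pick the first candidate directory that really contains a slaves file and export it as HADOOP_CONF_DIR. The function name pick_conf_dir and the scratch directories are hypothetical, and whether RHive 0.0-7 actually reads HADOOP_CONF_DIR is an assumption of this answer that is not verified here.

```shell
#!/bin/sh
# Hypothetical helper: print the first directory that contains a slaves file.
pick_conf_dir() {
    for d in "$@"; do
        if [ -f "$d/slaves" ]; then
            echo "$d"
            return 0
        fi
    done
    return 1
}

# Demo with scratch directories standing in for the real candidates.
demo=$(mktemp -d)
mkdir -p "$demo/conf" "$demo/conf.pseudo"
echo "localhost" > "$demo/conf.pseudo/slaves"

if chosen=$(pick_conf_dir "$demo/conf" "$demo/conf.pseudo"); then
    HADOOP_CONF_DIR="$chosen"
    # Exported so that an R session started from this shell inherits it.
    export HADOOP_CONF_DIR
    echo "HADOOP_CONF_DIR=$HADOOP_CONF_DIR"
else
    echo "no candidate directory contains a slaves file" >&2
fi
```

The variable has to be exported in the shell that launches R; setting it after R has started would need Sys.setenv instead.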

Though I'd be curious to see if you can make RHive work with CDH4.

Kumar Vaibhav
  • Thanks for the response! I'll give it a shot and post the results. – TH Japan Jun 10 '13 at 01:43
  • It seems that RHive does work with CDH4. Take a look at http://stackoverflow.com/questions/16783549/rhive-not-working-with-cdh4 , although I have not had a chance to look at it again myself. If you do get it to work, please post how you did it. Thanks. – Kumar Vaibhav Jun 10 '13 at 07:00