I'm using hadoop-2.2.0 and hive-0.12. I took the following steps to try to connect to Hive from RStudio:

library("DBI")
library("rJava")
library("RJDBC")
for(l in list.files('/PATH/TO/hive/lib/')){ .jaddClassPath(paste("/PATH/TO/hive/lib/",l,sep=""))}
for(l in list.files('/PATH/TO/hadoop/')){ .jaddClassPath(paste("/PATH/TO/hadoop/",l,sep=""))}
options( java.parameters = "-Xmx8g" )
drv <- JDBC("org.apache.hive.jdbc.HiveDriver", "/PATH/TO/hive/lib/hive-jdbc.jar")
conn <- dbConnect(drv, "jdbc:hive2://HOST:PORT", USER, PASSWD)

But I got the following error:

Error in .jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", as.character(url)[1],  : 
  java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration

Any tips will be appreciated.

flyer

2 Answers

The problem is solved.

I loaded all of the jar files in the Hadoop directory onto the Java classpath, and after that I could connect to Hive.
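
One way this could look, as a minimal sketch under some assumptions: the class named in the error, org.apache.hadoop.conf.Configuration, ships in the hadoop-common jar, which in a stock Hadoop 2.x tarball typically sits in a subdirectory (share/hadoop/common) rather than at the top level, so a non-recursive list.files() over the Hadoop root would likely miss it. The helper names find_jars and add_jars are made up for illustration, and the directories are the placeholders from the question.

```r
# Hypothetical helper: collect every .jar file under `dir`,
# descending into subdirectories and returning full paths.
find_jars <- function(dir) {
  list.files(dir, pattern = "\\.jar$", recursive = TRUE, full.names = TRUE)
}

# Hypothetical helper: put all jars found under `dirs` on the rJava classpath.
# Needs the rJava package; set options(java.parameters = ...) before calling
# this if you want a custom JVM heap size, since the JVM is started here.
add_jars <- function(dirs) {
  jars <- unlist(lapply(dirs, find_jars))
  rJava::.jinit()              # start the JVM if it is not running yet
  rJava::.jaddClassPath(jars)  # accepts a character vector of paths
  invisible(jars)
}
```

With these defined, calling add_jars(c("/PATH/TO/hive/lib", "/PATH/TO/hadoop")) before JDBC(...) and dbConnect(...) would replace the two for loops in the question.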

flyer

You can simply connect to HiveServer2 from R using the RHive package.

Below are the commands I used:

library(RHive)
Sys.setenv(HIVE_HOME="/usr/local/hive")
Sys.setenv(HADOOP_HOME="/usr/local/hadoop")
rhive.env(ALL=TRUE)
rhive.init()
rhive.connect("localhost")

Dinesh