
I have written a table in Hive using the Hive Warehouse Connector, but I am unable to read its contents back after writing. Below are the details of the commands used:

Commands to write the data:


val hive = com.hortonworks.spark.sql.hive.llap.HiveWarehouseBuilder.session(spark).build()

hive.createTable("sales_22feb").ifNotExists()
  .column("userid", "string")
  .column("ordertime", "string")
  .column("saleamount", "string")
  .column("orderid", "string")
  .create()

val df = spark.read.csv("/tmp/sales.csv")

df.write.format("com.hortonworks.spark.sql.hive.llap.HiveWarehouseConnector").option("table", "sales_22feb").mode("append").save()

 

Commands to read data:

val hive = com.hortonworks.spark.sql.hive.llap.HiveWarehouseBuilder.session(spark).build() 

hive.setDatabase("default")

val df = hive.executeQuery("select * from sales_22feb")

df.show(5, false)
 

I am getting the below error:

20/06/25 16:43:54 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 4, sandbox-hdp.hortonworks.com, executor 1): java.lang.RuntimeException: java.lang.NullPointerException: hive.llap.daemon.service.hosts must be defined

However, I am able to see the contents of the same table using the Hive shell. Kindly help me with this.

Thanks in advance.

avikm

2 Answers


You will need to set spark.hadoop.hive.llap.daemon.service.hosts to the name of the LLAP daemon service, either in spark-defaults.conf or by passing it with --conf.

For example, if the LLAP daemon service is @llap_LlapCluster, specify it as below in spark-defaults.conf:

spark.hadoop.hive.llap.daemon.service.hosts @llap_LlapCluster
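
If you would rather pass it at launch time, here is a minimal sketch of the --conf route (the assembly jar path and version are assumptions based on a typical HDP layout; substitute the values from your cluster):

spark-shell --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-<version>.jar \
  --conf spark.hadoop.hive.llap.daemon.service.hosts=@llap_LlapCluster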

ranger_sim_g

From HDP 3.x onwards, reading and writing internal Hive tables is supported via the Hive Warehouse Connector (HWC) framework.

When you write to a Hive table, the Hive LLAP service is not required.

When you read from a Hive table, the Hive LLAP service is required. Check the documentation below for configuring Spark with HWC; a sketch of the relevant properties follows the link.

https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.5/integrating-hive/content/hive_configure_a_spark_hive_connection.html
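
As a rough sketch of what that configuration typically looks like in spark-defaults.conf (all hostnames, ports, and the jar version below are placeholders, not values taken from the question; take the actual values from your cluster's Ambari configs):

spark.jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-<version>.jar
spark.sql.hive.hiveserver2.jdbc.url jdbc:hive2://<hiveserver-interactive-host>:<port>/
spark.datasource.hive.warehouse.metastoreUri thrift://<metastore-host>:9083
spark.datasource.hive.warehouse.load.staging.dir /tmp
spark.hadoop.hive.llap.daemon.service.hosts @llap_LlapCluster
spark.hadoop.hive.zookeeper.quorum <zk-host1>:2181,<zk-host2>:2181,<zk-host3>:2181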

Ranga Reddy