From Data Science Experience, I can connect to the Hive database in BigInsights and read the table schema. However, Data Science Experience does not seem to read the table contents: the count comes back as zero. Here are some of my settings:

from pyspark import SparkConf
from pyspark.sql import SparkSession

# Note: this conf object is created but never passed to the session builder below.
conf = SparkConf().set("com.ibm.analytics.metadata.enabled", "false")

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

dash = {
    'jdbcurl': 'jdbc:hive2://nnnnnnnnnnn:10000/;ssl=true;',
    'user': 'xxxxxxxxxx',
    'password': 'xxxxxxxxx',
}

spark.conf

# Read the Hive table over JDBC
offers = spark.read.jdbc(dash['jdbcurl'],
                         table='offers',
                         properties={"user": dash["user"],
                                     "password": dash["password"]})

offers.count()   # returns 0

offers.show()    # returns:

+-----------+----------+    
|offers.name|offers.age|    
+-----------+----------+    
+-----------+----------+    

Thanks.

Nitesh

1 Answer

Yes, I was able to see the same behaviour with the Hive JDBC connector. I tried this Python connector instead, and it returned the correct count.

https://datascience.ibm.com/docs/content/analyze-data/python_load.html

from ingest.Connectors import Connectors

HiveloadOptions = {
    Connectors.Hive.HOST: 'bi-hadoop-prod-4222.bi.services.us-south.bluemix.net',
    Connectors.Hive.PORT: '10000',
    Connectors.Hive.SSL: True,
    Connectors.Hive.DATABASE: 'default',
    Connectors.Hive.USERNAME: 'charles',
    Connectors.Hive.PASSWORD: 'march14march',
    Connectors.Hive.SOURCE_TABLE_NAME: 'student',
}

HiveDF = sqlContext.read.format("com.ibm.spark.discover").options(**HiveloadOptions).load()

HiveDF.printSchema()

HiveDF.show()

HiveDF.count()
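
If you still want to compare against the plain JDBC route, it may help to make the connection settings explicit. The following is only a minimal sketch: the helper name `hive_jdbc_settings` and the host `example-host` are hypothetical, the database is assumed to be `default`, and the driver class is assumed to be the standard Hive one. It does not by itself fix the zero count; it only spells out the URL and properties that would be passed to `spark.read.jdbc`.

```python
def hive_jdbc_settings(host, port=10000, database="default",
                       ssl=True, user=None, password=None):
    """Build a HiveServer2 JDBC URL and a properties dict (hypothetical helper)."""
    # URL shape: jdbc:hive2://<host>:<port>/<database>[;ssl=true]
    url = "jdbc:hive2://{}:{}/{}".format(host, port, database)
    if ssl:
        url += ";ssl=true"

    # Assumption: the standard Hive JDBC driver class is on the classpath.
    properties = {"driver": "org.apache.hive.jdbc.HiveDriver"}
    if user is not None:
        properties["user"] = user
    if password is not None:
        properties["password"] = password
    return url, properties


# Placeholder values, not real credentials
url, props = hive_jdbc_settings("example-host", user="someuser", password="secret")

# With a live SparkSession, the read would then look like:
# offers = spark.read.jdbc(url, table="offers", properties=props)
```

Naming the database explicitly avoids relying on the empty path in `jdbc:hive2://host:10000/;ssl=true;`, and keeping the settings in one place makes it easier to check that both connectors point at the same table.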

Thanks, Charles.

charles gomes