I have a Spark Thrift Server running on Apache Spark 3.1.2, where I created a table and inserted values using Beeline. It looks like this:

0: jdbc:hive2://localhost:10000/> select * from mydb4.test;
+-------+--------+
|  key  | value  |
+-------+--------+
| 1235  | test4  |
| 123   | test   |
+-------+--------+

However, when I try to fetch the same table using PySpark, every row contains the column names instead of the actual values:

database = "mydb4"
table = "test"
jdbcDF = spark.read.format("jdbc") \
    .option("url", f"jdbc:hive2://<URL>/mydb4") \
    .option("dbtable", table) \
    .load()

jdbcDF.show()

+---+-----+
|key|value|
+---+-----+
|key|value|
|key|value|
+---+-----+

Why am I only seeing the column names repeated in every row instead of the actual values?
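For context on the symptom (this is an illustration, not something confirmed in the question): Spark's generic JDBC reader wraps column names in double quotes when it builds its SELECT statement, and a driver whose dialect parses those quoted identifiers as string literals will return the literal column name once per row. A minimal sqlite3 sketch of that effect, using single-quoted strings to stand in for the misparsed identifiers (the table name and data mirror the question):

```python
import sqlite3

# In-memory stand-in for the Hive table from the question.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE test (key TEXT, value TEXT)")
con.executemany("INSERT INTO test VALUES (?, ?)",
                [("1235", "test4"), ("123", "test")])

# Spark's JDBC source emits something like: SELECT "key","value" FROM test
# If the driver treats the quoted names as string literals rather than
# identifiers, the query is effectively:
rows = con.execute("SELECT 'key', 'value' FROM test").fetchall()
print(rows)  # one literal ('key', 'value') tuple per row, not the data
```

If that is the cause here, the usual ways around it are to register a JDBC dialect that quotes identifiers with backticks, or to skip the JDBC path entirely and read the table through the shared Hive metastore with `spark.table("mydb4.test")` on a session built with `enableHiveSupport()`.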
