I have a Spark Thrift Server running on Apache Spark 3.1.2, where I created a table and inserted values using Beeline. It looks like this:

0: jdbc:hive2://localhost:10000/> select * from mydb4.test;
+-------+--------+
|  key  | value  |
+-------+--------+
| 1235  | test4  |
| 123   | test   |
+-------+--------+

However, when I try to fetch the same table using PySpark, every row contains the column names instead of the actual values:

database = "mydb4"
table = "test"
jdbcDF = spark.read.format("jdbc") \
    .option("url", f"jdbc:hive2://<URL>/mydb4") \
    .option("dbtable", table) \
    .load()

jdbcDF.show()

+---+-----+
|key|value|
+---+-----+
|key|value|
|key|value|
+---+-----+

Why am I only seeing the column names repeated in every row instead of the actual values?
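For context on the symptom (this is an illustration, not something confirmed in the question): Spark's generic JDBC reader wraps column names in double quotes when it builds its SELECT statement, and a driver whose dialect parses those quoted identifiers as string literals will return the literal column name once per row. A minimal sqlite3 sketch of that effect, using single-quoted strings to stand in for the misparsed identifiers (the table name and data mirror the question):

```python
import sqlite3

# In-memory stand-in for the Hive table from the question.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE test (key TEXT, value TEXT)")
con.executemany("INSERT INTO test VALUES (?, ?)",
                [("1235", "test4"), ("123", "test")])

# Spark's JDBC source emits something like: SELECT "key","value" FROM test
# If the driver treats the quoted names as string literals rather than
# identifiers, the query is effectively:
rows = con.execute("SELECT 'key', 'value' FROM test").fetchall()
print(rows)  # one literal ('key', 'value') tuple per row, not the data
```

If that is the cause here, the usual ways around it are to register a JDBC dialect that quotes identifiers with backticks, or to skip the JDBC path entirely and read the table through the shared Hive metastore with `spark.table("mydb4.test")` on a session built with `enableHiveSupport()`.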
