1

I am using the following code to get the row count of a specified table as an integer:

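from pyspark import SparkContext
from pyspark.sql import HiveContext

sc = SparkContext("local", "spar")
hive_context = HiveContext(sc)
hive_context.sql("use zs_trainings_trainings_db")
df = hive_context.sql("select count(*) from ldg_sales")  # df is a one-row DataFrame, not an int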
lmiguelvargasf

2 Answers

2

Either:

hive_context.table("ldg_sales").count()

or

hive_context.sql("select count(*) from ldg_sales").first()[0]
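
If you need this in more than one place, the second form can be wrapped in a small helper. This is only a sketch: `table_row_count` is a hypothetical name, not part of PySpark, and it assumes the object you pass in exposes `.sql()` and that the resulting DataFrame exposes `.first()`, as `HiveContext` and `DataFrame` do.

```python
def table_row_count(sql_context, table_name):
    """Return the number of rows in table_name as a plain Python int.

    Hypothetical helper: sql_context is assumed to be a HiveContext
    (or anything else with a compatible .sql() method).
    """
    # table_name is interpolated directly into the query, so it is
    # assumed to come from trusted code, never from user input.
    query = "select count(*) as cnt from {}".format(table_name)
    # .first() returns the single result row; index 0 is the count column.
    row = sql_context.sql(query).first()
    return int(row[0])
```

With the context from the question, `table_row_count(hive_context, "ldg_sales")` would then give the count as an integer rather than a DataFrame.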
1

Convert the DataFrame to an RDD so you can run a map over it and extract just the row values, like this:

df = hive_context.sql("select count(*) as cnt from ldg_sales")
count = df.rdd.map(lambda row: row.cnt).collect()[0]
Pushkr