I am using PySpark to connect to my Kudu database. I want to retrieve the minimum value of a column for rows matching a set of predicates, but I can't find an option for this in the API.

import kudu

# Connect to the Kudu master and open the table
client = kudu.connect(host="myhost", port=1234)
table = client.table("impala::mydb.mytable")

# Scan only the 'amount' column of rows where col1 == 'test'
scanner = table.scanner()
scanner.add_predicates([table['col1'] == 'test'])
scanner.set_project_column_names(['amount'])
myList = scanner.open().read_all_tuples()

The above retrieves a list of tuples, but I am not sure how to specify that I want only the MIN value of the amount column.

I tried:

scanner.set_project_column_names([MIN('amount')])

but that fails with a "MIN is not defined" error.


1 Answer

From your example it looks like you are already using Impala. You can use the MIN function in Impala SQL to get the minimum value, with the same predicate as in your scan. For example:

SELECT MIN(amount) FROM mydb.mytable WHERE col1 = 'test';
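
If you would rather stay in the Python client: as far as I know, a Kudu scanner only supports projections and predicates, not aggregates, so the minimum has to be computed on the client from the rows the scan returns. Below is a minimal sketch of that approach. The helper name min_of_column and the sample rows are made up for illustration; the sample stands in for the list of tuples that read_all_tuples() returns when only the amount column is projected.

```python
from typing import Iterable, Optional, Sequence


def min_of_column(rows: Iterable[Sequence], index: int = 0) -> Optional[object]:
    """Return the smallest non-NULL value at position `index` of each row.

    Returns None when there are no rows or every value is NULL (None).
    """
    values = [row[index] for row in rows if row[index] is not None]
    return min(values) if values else None


# Hypothetical sample data standing in for scanner.open().read_all_tuples()
# with a single projected column ('amount').
sample_rows = [(12.5,), (3.0,), (None,), (7.25,)]

print(min_of_column(sample_rows))  # prints 3.0
```

With the scanner from the question this would be called as min_of_column(scanner.open().read_all_tuples()). Note that every matching row is still transferred to the client, so for large tables the Impala query above is likely to be cheaper.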