5

When we fetch data using the DB API 2.0 cur.execute("select * from db.table") we get a cursor which seems like a generator object of list of tuples.

Whereas in pymongo, when we fetch we get it as list of dictionaries. I wanted to achieve something like this.

Instead of fetching list of tuples, I wanted list of dictionaries or a named tuple.

I believe from an efficiency point of view it makes sense, since the schema is already defined so no need to send it for every record.

Currently the workaround I am using is:

cur.execute("select * from db.table")
columns = cur.columns_with_types
data = cur.fetchall()
df = pd.DataFrame(data,columns=[tuple[0] for tuple in columns])
data_reqd = df.to_dict('records')

This methods fairs poorly when query returns a lot of data.

Workaround 1: Use fetchmany(size=block_size) but it doesn't seem like an elegant way to do things.

Workaround 2: This seems like a much better way to handle things.

cur.execute("select * from db.table")
columns = cur.columns_with_types
for tup in cur:
     row = dict(zip(columns, tup))
    # use row

Any good way to handle this? Any improvements to the question are appreciated.

1 Answers1

0

You can alternatively create a Client and call its query_dataframe method.

import clickhouse_driver as ch

ch_client = ch.Client(host='localhost')
df = ch_client.query_dataframe('select * from db.table')
records = df.to_dict('records')
chicxulub
  • 210
  • 2
  • 5