3

I'm trying to get my Clickhouse data in my Django project. I'm using clickhouse_driver and :

client.execute('SELECT * FROM myTable LIMIT 5 FORMAT JSON')

When I execute this command in my Clickhouse server SELECT * FROM myTable LIMIT 5 FORMAT JSON it outputs in JSON format. But in python, when I try it with clickhouse_driver it outputs only fields like:

[('2020213','qwerty','asdfg'),('2030103','qweasd','asdxv')]

But I want key-value json format..like

{"logdate":"2020213","host":"qwerty","cef":"asdfg"}

Any suggestions to resolve this problem? Or maybe I have to search for an alternate clickhouse_driver..

Thx.

jkdba
  • 2,378
  • 3
  • 23
  • 33

4 Answers4

4

clickhouse-driver ignores FORMAT-clause (see Selecting data).

It can be done either manually by combining column name with related value:

from clickhouse_driver import Client
from json import dumps

client = Client(host='localhost')

data = client.execute_iter('SELECT * FROM system.functions LIMIT 5', with_column_types=True)
columns = [column[0] for column in next(data)]

for row in data:
    json = dumps(dict(zip(columns, [value for value in row])))
    print(f'''{json}''')

# Result:
# {"name": "fromUnixTimestamp64Nano", "is_aggregate": 0, "case_insensitive": 0, "alias_to": ""}
# {"name": "toUnixTimestamp64Nano", "is_aggregate": 0, "case_insensitive": 0, "alias_to": ""}
# {"name": "toUnixTimestamp64Micro", "is_aggregate": 0, "case_insensitive": 0, "alias_to": ""}
# {"name": "sumburConsistentHash", "is_aggregate": 0, "case_insensitive": 0, "alias_to": ""}
# {"name": "yandexConsistentHash", "is_aggregate": 0, "case_insensitive": 0, "alias_to": ""}

or using pandas:

from clickhouse_driver import Client
import pandas as pd

client = Client(host='localhost')

data = client.execute_iter('SELECT * FROM system.functions LIMIT 5', with_column_types=True)
columns = [column[0] for column in next(data)]

df = pd.DataFrame.from_records(data, columns=columns)
print(df.to_json(orient='records'))
# Result 
# [{"name":"fromUnixTimestamp64Nano","is_aggregate":0,"case_insensitive":0,"alias_to":""},{"name":"toUnixTimestamp64Nano","is_aggregate":0,"case_insensitive":0,"alias_to":""},{"name":"toUnixTimestamp64Micro","is_aggregate":0,"case_insensitive":0,"alias_to":""},{"name":"sumburConsistentHash","is_aggregate":0,"case_insensitive":0,"alias_to":""},{"name":"yandexConsistentHash","is_aggregate":0,"case_insensitive":0,"alias_to":""}]
vladimir
  • 13,428
  • 2
  • 44
  • 70
3

I did not try Vladimir solution but here is the my solution :

client.execute commands gives us " with_column_types=True " parameters.. it gives us metadata for table. after :

result , columns = client.execute('SELECT * FROM myTbl LIMIT 5',with_column_types=True)
df=pandas.DataFrame(result,columns=[tuple[0] for tuple in columns])
dfJson=df.to_json(orient='records')

and this gives us our what we want.

Thx for suggestions :)

0
client.query_dataframe(sql).to_json(orient='records',default_handler=str)
Suraj Rao
  • 29,388
  • 11
  • 94
  • 103
  • Welcome to Stack Overflow! While this code may solve the question, [including an explanation](//meta.stackexchange.com/q/114762) of how and why this solves the problem would really help to improve the quality of your post. Remember that you are answering the question for readers in the future, not just the person asking now. Please [edit] your answer to add explanations and give an indication of what limitations and assumptions apply. –  Apr 27 '22 at 21:22
0

You can use clickhouse driver DBAPI and DictCursor. Following is minimum code for illustration.

>>> from clickhouse_driver.dbapi import connect
>>> from clickhouse_driver.dbapi.extras import DictCursor

>>> conn = connect(host='localhost', user='default', password='', database='')
>>> cursor = c.cursor(DictCursor)
>>> cursor.execute('create table test(i Int, s String) engine Memory')
>>> cursor.execute('insert into test values', [(1,'a'),(2,'b')])
>>> cursor.execute('select * from test')
>>> cursor.fetchall()
[{'i': 1, 's': 'a'}, {'i': 2, 's': 'b'}]
jayvynl
  • 66
  • 3