import duckdb
r1 = duckdb.query("""
SELECT f1 FROM parquet_scan('test.pq') WHERE f2 > 1
""")
This does not create a table called r1. It creates a relation, which is nothing more than an execution plan. Calling execute on it is what actually runs the query and scans the Parquet file:
result = r1.execute()
If you want to query it as a table, you basically have two options.
- You create a view from your relation
r1.create_view('table_name')
- You change your SQL query to create a duckdb table
conn = duckdb.connect()
conn.execute("create table t as SELECT f1 FROM parquet_scan('test.pq') where f2 > 1 ")
Note that with the second option (the table) you actually load the Parquet data into a DuckDB table, while with the first option (the view) the Parquet file is read again every time you query it.
Finally, if you just want to stack up filters, then you could do:
r2 = r1.filter("f1>10")
There is more info on the Python Relational API on DuckDB's website, specifically at:
https://duckdb.org/docs/api/python
https://github.com/duckdb/duckdb/blob/master/examples/python/duckdb-python.py
Hopefully, that was helpful! ;-)