3

In the following example, I would like to dynamically populate the values of the CTE in the query from a list, using the standard Python SQLite3 API:

-- test schema
CREATE TABLE t AS SELECT "" AS k FROM (VALUES (1),(3));

-- query to build
WITH cte(k) AS (VALUES
    (1),
    (2),
    (3)
) SELECT * FROM t INNER JOIN cte USING(k);

This doesn't work:

import sqlite3

SQL_INIT="""
CREATE TABLE t AS SELECT "" AS k FROM (VALUES (1),(3));
"""

SQL_QUERY="""
WITH cte(k) AS (VALUES
    (:values)
) SELECT * FROM t INNER JOIN cte USING(k);
"""
conn = sqlite3.connect(':memory:')
conn.executescript(SQL_INIT)

params = dict(
    values=(1,2,3)
)
for row in conn.execute(SQL_QUERY, params):
    print(row)

So is there a way to use placeholders to populate the CTE values from the Python code? If no, what would be a good workaround? Do I really have to resort to plain string manipulation and manual values sanitization?

For the sake of simplicity, the example above populates the CTE with integer values. But the solution should work with numbers and strings.

Sylvain Leroux
  • 50,096
  • 7
  • 103
  • 125
  • How about creating temporary table and use plain old INSERT. `CREATE TEMP TABLE cte(k INT); INSERT INTO cte(k) VALUES (:value); SELECT * FROM t JOIN cte ...` – Lukasz Szozda Mar 14 '20 at 11:21
  • Yes, I thought about that. But I don't like the idea because of potential issues with concurrent access (the db is shared between two processes). Unless TEMP TABLE are connection=specific? – Sylvain Leroux Mar 14 '20 at 11:23
  • 1
    Ok, then the other option is provide values as single stirng `1,2,3` and split them in SQLite. [String_split idea for SQL Server](https://stackoverflow.com/questions/337704/parameterize-an-sql-in-clause/36980115#36980115) – Lukasz Szozda Mar 14 '20 at 11:24
  • That could be an option Lukasz. With the assumption, the field separation is not in the data. In the example, for the sake of simplicity, I used numbers, but in the real app, the data may be strings. – Sylvain Leroux Mar 14 '20 at 11:28
  • Yes, I know it is a tricky one. There is no quoting mechanism so the separator must be distinctive – Lukasz Szozda Mar 14 '20 at 11:28
  • You can handle this completely from Python. Whether that is the best option for you is another story; there is something to be said for keeping all your data in the database. – Tim Biegeleisen Mar 14 '20 at 11:31

2 Answers2

1

I don't actually know if you can use CTE syntax this way from the Python API. Assuming you can, then the problem you are facing is that you are trying to bind a collection to a single placeholder in the query. Instead, you may try binding the collection to a prepared statement which has the correct number of ? placeholders.

vals = (1, 2, 3)
vals_clause = '(?)' + ',(?)'*(len(vals)-1)
sql = 'WITH cte(k) AS (VALUES ' + vals_clause + ') '
sql = sql + 'SELECT * FROM t INNER JOIN cte USING(k)'
for row in conn.execute(SQL_QUERY, vals):
    print(row)

The SQL query generated by the above script is:

WITH cte(k) AS (VALUES (?),(?),(?))
SELECT * FROM t INNER JOIN cte USING(k)

You can see that we generated however many ? placeholders are needed to cover the tuple you pass to the call to execute().

If you can't use the above solution, then it could be perhaps that the API you are using doesn't allow for CTE. We can still rephrase your query as this:

SELECT *
FROM t INNER JOIN
(
    SELECT 1 AS k UNION ALL
    SELECT 2 UNION ALL
    SELECT 3
) cte USING(k);

In this case, we are just using a formal subquery to inline the values in a makeshift table. The scripting approach you would use to generate the above for a dynamic tuple of literal values is identical to what I already presented above.

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
0

One solution, while not a very elegant one, would be to dynamically create the list of placeholders:

SQL_QUERY_HEAD="""
WITH cte(k) AS (VALUES
"""

SQL_QUERY_TAIL="""
) SELECT * FROM t INNER JOIN cte USING(k);
"""

[...]

values=(1,2,3)

SQL_QUERY=(
    SQL_QUERY_HEAD +
    ",".join(len(values)*("(?)",)) +
    SQL_QUERY_TAIL
)
Sylvain Leroux
  • 50,096
  • 7
  • 103
  • 125
  • 1
    It is an elegant solution, and we are always bound by the contract of the SQLite API, which probably doesn't support binding Python collections as you were trying above. – Tim Biegeleisen Mar 14 '20 at 11:44
  • Thanks @Tim. I wasn't very sure about the idea of dynamically re-creating the query string. But you're right. After all, it is not _that_ bad. – Sylvain Leroux Mar 14 '20 at 11:53
  • 1
    If it feels a bit hackish, is is because...it _is_ a bit hackish. SQL is not a language which was mainly created for generating new data, but rather for querying data that already exists. – Tim Biegeleisen Mar 14 '20 at 12:01