0

I am coding in Python. I have query like this:

SELECT device_os, count(*) FROM warehouse WHERE device_year <= 2016 GROUP BY device_os;

Now, I have some additional filters which are coming dynamically from the user. For example,

device_id IN (15, 85, 65) OR device_model in ('MAX', 'SHARP', 'AD')

I have these extra conditions which I want to apply to the query. So, the final query should become:

SELECT device_os, count(*) FROM warehouse WHERE device_year <= 2016 AND (device_id IN (15, 85, 65) OR device_model in ('MAX', 'SHARP', 'AD')) GROUP BY device_os;

I have searched about sqlparse but has not been successful. How can I make it look like what I want?

Muhammad Muaaz
  • 164
  • 1
  • 13
  • You need to clarify the roadblock here, it's not clear why you can't just use the final query? ex. input sanitization, string formatting, etc. – ᴓᴓᴓ Oct 12 '21 at 22:33
  • @talfreds because the basic query is already saved somewhere, and I need to add extra filters in where clause which are coming dynamically. It is assured that the user does not enter any malicious input. The user does not have a direct control to the database. – Muhammad Muaaz Oct 12 '21 at 22:38
  • I think you have the options of parsing the user text (or making it, `field` + `list of values` rather than an entire expression, then it should be easy), or using a limited user that only has `SELECT ON db.warehouse` privs (is that restricted enough?). – danblack Oct 12 '21 at 22:45

2 Answers2

1

I made a library called SQLGlot that makes this very easy.

import sqlglot
import sqlglot.expressions as exp

expression = sqlglot.parse_one("""
SELECT device_os, count(*) FROM warehouse WHERE device_year <= 2016 GROUP BY device_os
""") 

def transform(node):
    if isinstance(node, exp.Where):
        node.args["this"] = sqlglot.parse_one(f"({node.this.sql()}) AND device_model in (1,2,3)")
    return node

print(expression.transform(transform).sql(pretty=True))

# output
SELECT
  device_os,
  COUNT(*)
FROM warehouse
WHERE
  (device_year <= 2016)
  AND device_model IN (1, 2, 3)
GROUP BY
  device_os
Toby Mao
  • 374
  • 2
  • 6
0

Define the SQL as a template format string, then substutute the dynamically generated conditions into it.

sql_format = 'SELECT device_os, count(*) FROM warehouse WHERE device_year <= 2016 AND ({}) GROUP BY device_os;'
where_clause = 'device_id IN (%s, %s, %s) OR device_model in (%s, %s, %s)'
sql = sql_format.format(where_clause)
params = [15, 85, 65, 'MAX', 'SHARP', 'AD']
cursor.execute(sql, params)

If you don't have any additional conditions, you can use

where_clause = '1=1'

This is a condition that's always true, so it doesn't change the result.

Barmar
  • 741,623
  • 53
  • 500
  • 612