62

I have a list of ids of rows to fetch from database. I'm using python and psycopg2, and my problem is how to effectively pass those ids to SQL? I mean that if I know the length of that list, it is pretty easy because I can always manually or automatically add as many "%s" expressions into query string as needed, but here I don't know how much of them I need. It is important that I need to select that rows using sql "id IN (id1, id2, ...)" statement. I know that it is possible to check the length of the list and concatenate suitable number of "%s" into query string, but I'm afraid that it would be very slow and ugly. Does anyone have an idea on how to solve it? And please don't ask why I need to do it with "IN" statement - it is a benchmark which is a part of my class assignment. Thanks in advance!

jcollado
  • 39,419
  • 8
  • 102
  • 133
k_wisniewski
  • 2,439
  • 3
  • 24
  • 31
  • Are you opposed to a SQL answer? Using dynamic SQL, you can give a string of any length and have SQL correctly read it. – Twelfth Dec 29 '11 at 18:41

3 Answers3

96

Python tuples are converted to sql lists in psycopg2:

cur.mogrify("SELECT * FROM table WHERE column IN %s;", ((1,2,3),))

would output

'SELECT * FROM table WHERE column IN (1,2,3);'

For Python newcomers: It is unfortunately important to use a tuple, not a list here. Here's a second example:

cur.mogrify("SELECT * FROM table WHERE column IN %s;", 
    tuple([row[0] for row in rows]))
Andrew McDowell
  • 2,860
  • 1
  • 17
  • 31
philofinfinitejest
  • 3,987
  • 1
  • 24
  • 22
  • Thank you, I didn't notice that. The only issue with your solution is that you have forgotten comma in that tuple. – k_wisniewski Dec 29 '11 at 18:53
  • 34
    Note that Python LISTS will be converted to Postgres ARRAY types so I find that I frequently need to do something like (tuple(SOME_LIST),) in my cursor.execute(...) arguments. Note how we're wrapping the tuple() of the list in a single element literal tuple, so we have a tuple of tuples as shown in this example. – Jim Dennis Jul 05 '13 at 19:27
  • ```cur.mogrify("SELECT * FROM table WHERE column IN %s;", tuple([row[0] for row in rows]))``` this is unclear. It doesn't seem to relate to the previous example. – MrR Nov 15 '19 at 13:59
14

this question is old and maybe there is a newer one out there, but the answer my colleagues are going with right now is this:

sql = "SELECT * FROM table WHERE column = ANY(%(parameter_array)s)"
cur.execute(sql,{"parameter_array": [1, 2, 3]})
Brandon Henry
  • 3,632
  • 2
  • 25
  • 30
  • 1
    This worked for me for a special case where I had to use `$1` instead of `%s` while the accepted answer didn't. Thanks for posting this! – Seperman Mar 11 '21 at 07:16
12

Now sql module of psycopg2 (https://www.psycopg.org/docs/sql.html) can be used to safeguard against errors and injections, like e.g.:

import psycopg2
from psycopg2 import sql

params = config()
conn = psycopg2.connect(**params)
cur = conn.cursor()

ids = ['a','b','c']
sql_query = sql.SQL('SELECT * FROM {} WHERE id IN ({});').format(
                    sql.Identifier('table_name'),
                    sql.SQL(',').join(map(sql.Literal, ids))
                )
print (sql_query.as_string(cur)) # for debug
cur.execute(sql_query)

from configparser import ConfigParser
def config(filename='database.ini', section='postgresql'):
    # create a parser
    parser = ConfigParser()
    # read config file
    parser.read(filename)

    # get section, default to postgresql
    db = {}
    if parser.has_section(section):
        params = parser.items(section)
        for param in params:
            db[param[0]] = param[1]
    else:
        raise Exception('Section {0} not found in the {1} file'.format(section, filename))

    return db

Note: sql.Identifier will add quotes if needed so it will work if you use quoted identifiers in PostgreSQL also (they have to be used to allow e.g. case sensitive naming).

Example and structure of database.ini:

[postgresql]
host=localhost
port=5432
database=postgres
user=user
password=mypass
scribu
  • 2,958
  • 4
  • 34
  • 44
Alex Martian
  • 3,423
  • 7
  • 36
  • 71
  • +1 passing `sql.SQL(',').join(map(sql.Literal, your_data)` as the `format()` parameter seems the most query flexible & pythonic approach seen here so far. – eliangius Mar 21 '22 at 20:46