2

I have a wide table in a sqlite3 database, and I wish to dynamically query certain columns in a Python script. I know that it's bad to inject parameters by string concatenation, so I tried to use parameter substitution instead.

I find that, when I use parameter substitution to supply a column name, I get unexpected results. A minimal example:

import sqlite3 as lite

db = lite.connect("mre.sqlite")
c = db.cursor()

# Insert some dummy rows
c.execute("CREATE TABLE trouble (value real)")
c.execute("INSERT INTO trouble (value) VALUES (2)")
c.execute("INSERT INTO trouble (value) VALUES (4)")
db.commit()


for row in c.execute("SELECT AVG(value) FROM trouble"):
    print row   # Returns 3

for row in c.execute("SELECT AVG(:name) FROM trouble", {"name" : "value"}):
    print row   # Returns 0

db.close()

Is there a better way to accomplish this than simply injecting a column name into a string and running it?

David Bruce Borenstein
  • 1,655
  • 2
  • 19
  • 34

2 Answers2

1

As Rob just indicated in his comment, there was a related SO post that contains my answer. These substitution constructions are called "placeholders," which is why I did not find the answer on SO initially. There is no placeholder pattern for column names, because dynamically specifying columns is not a code safety issue:

It comes down to what "safe" means. The conventional wisdom is that using normal python string manipulation to put values into your queries is not "safe". This is because there are all sorts of things that can go wrong if you do that, and such data very often comes from the user and is not in your control. You need a 100% reliable way of escaping these values properly so that a user cannot inject SQL in a data value and have the database execute it. So the library writers do this job; you never should.

If, however, you're writing generic helper code to operate on things in databases, then these considerations don't apply as much. You are implicitly giving anyone who can call such code access to everything in the database; that's the point of the helper code. So now the safety concern is making sure that user-generated data can never be used in such code. This is a general security issue in coding, and is just the same problem as blindly execing a user-input string. It's a distinct issue from inserting values into your queries, because there you want to be able to safely handle user-input data.

So, the solution is that there is no problem in the first place: inject the values using string formatting, be happy, and move on with your life.

Community
  • 1
  • 1
David Bruce Borenstein
  • 1,655
  • 2
  • 19
  • 34
  • "This isn't a security hole" does not mean "there isn't a problem". Giving the caller an error message that isn't confusing is certainly a problem. As is eliminating accidentally correct syntax that results from typos. – user541686 Sep 07 '19 at 03:16
0

Why not use string formatting?

for row in c.execute("SELECT AVG({name}) FROM trouble".format(**{"name" : "value"})):
    print row # => (3.0,)
Goodies
  • 4,439
  • 3
  • 31
  • 57