2

The Background

The python documentation for the sqlite3 module (here) says:

Usually your SQL operations will need to use values from Python variables. You shouldn’t assemble your query using Python’s string operations because doing so is insecure; it makes your program vulnerable to an SQL injection attack (see http://xkcd.com/327/ for humorous example of what can go wrong).

Instead, you are supposed to let the sqlite3 library do the substitution for you, using question marks for placeholders, relying on that library's implementation of the DB-API's parameter substitution mechanism.

The documentation follows this up with an example:

# Never do this -- insecure!
symbol = 'RHAT'
c.execute("SELECT * FROM stocks WHERE symbol = '%s'" % symbol)

The Question

Using DB-API’s parameter substitution instead of building the query yourself using python string operations is:

a) inherently more secure in some way (please explain)

or

b) "merely" a really great idea because it uses an existing, well-tested library that will do the sanitization for me, saving me from countless hours of debugging and countless bugs and exploits?

Comments and thoughts on the question

I suspect b (not least because of answers like this one). If the answer is a, there is a subtle source of insecurity here that I do not understand, and this insecurity needs to be explained somewhere.

The confusing point is that there is absolutely nothing insecure about the example given. It very clearly reduces (100% of the time) to:

c.execute("SELECT * FROM stocks WHERE symbol = 'RHAT'")

which is perfectly good.

It is bad programming style, because a later user might so easily change symbol to come from an external source, and that would be insecure, like this:

symbol = raw_input("enter a stock symbol")
#symbol comes straight from the user!  what horrors it could contain...
c.execute("SELECT * FROM stocks WHERE symbol = '%s'" % symbol)

and even this could be argued to be insecure:

def get_stock(symbol):
    #goodness knows where symbol comes from or whether it has been 
    #sanitized properly yet...
    c.execute("SELECT * FROM stocks WHERE symbol = '%s'" % symbol)

I am looking for confirmation that the original example is not insecure, but is instead bad programming practice because it might easily become insecure. Those are two different things, and it is important to understand the difference between them.

Community
  • 1
  • 1
stochastic
  • 3,155
  • 5
  • 27
  • 42

3 Answers3

4

The cited example from the python docs is indeed and evidently harmless because you have full control over the symbol variable. So it is exactly the same as the following static query, as you say:

"SELECT * FROM stocks WHERE symbol = 'RHAT'"

But even if it is harmless you should not do it without an explicit comment saying why you do it and why it is secure. Because of at least 2 reasons:

  • you could later change the symbol variable to be initialized from an external source (user input) bringing the SQL injection threat
  • if you never do it in any place of your code, you are sure to be immune to SQL injection. Say differently: always use best practices coding even if in some places it looks useless ; coding standards aim to help to reduce error risks.
stochastic
  • 3,155
  • 5
  • 27
  • 42
Serge Ballesta
  • 143,923
  • 11
  • 122
  • 252
  • 1
    thanks for the confirmation. As you say, good programming practice is a great idea even if it does seem useless. The docs are confusing because they confuse bad programming practice with actual insecurity, leaving the reader to wonder what the actual insecurity is. – stochastic Dec 04 '15 at 13:55
0

I can tell you nothing more than xkcd said :)

symbol = raw_input("enter a stock symbol")

with input

'; DELETE FROM stocks; SELECT * FROM stocks WHERE symbol='

concatenated using

"SELECT * FROM stocks WHERE symbol = '%s'" % symbol

will create valid SQL query with unexpected result ;)

poko
  • 575
  • 2
  • 8
  • `sqlite3.Warning: You can only execute one statement at a time.` – CL. Dec 04 '15 at 13:30
  • Indeed, that xkcd comic does a masterful job :-) You have spelled out why my second-to-last example is insecure, but I already knew that, and that is not the question. The question (much abbreviated) is whether the original "symbol='RHAT'" version is insecure, and why? – stochastic Dec 04 '15 at 13:32
  • A better example: `"' AND FALSE UNION ALL SELECT * FROM users; --"` (adjust number of columns selected from users until they match the number of columns selected from stocks). Allows you to peek at any table in the database. – Colonel Thirty Two Dec 04 '15 at 15:00
0

That example indeed is secure. However, that is not what a real program would do, and is just meant to show that symbol has some value.

In practice, the value would come from outside the program, and this be insecure. And even if the program were using a constant value like this, it's likely that later additions and extensions would add dynamic symbol selection, and whoever would make that change would forget to look at the SQL.

CL.
  • 173,858
  • 17
  • 217
  • 259