Regex to check for semicolon or comment characters to prevent SQL injection

Question

I want to prevent SQL injections by using regex to check the input a user gives to ensure they do not do anything malicious.

I am doing this in a python program -- I specifically need help with finding the regex terms that check if there is a ;, /*, or */ anywhere in the input string. I am asking this because I am curious how to do it for my own knowledge, even if this may not be the best way to go about preventing SQL injections.

I iterate over a dictionary and check to see if any terms meet the 'malicious term' criteria. This was the best way I could think of at the moment to prevent injections; admittedly I am not well versed in SQL and know of only the basic types of injections. If anyone thinks of other possible injections to be aware of it would be greatly appreciated.

Example:

import re

regex_terms = ["(D|d)(R|r)(O|o)(P|p).*(T|t)(A|a)(B|b)(L|l)(E|e)", "(I|i)(N|n)(S|s)(E|e)(R|r)(T|t).*(I|i)(N|n)(T|t)(O|o)"]

user_input = {"first name": "Bob",
              "last name": "DROP TABLE",
              "email": "; malicious statement here",  # note the semicolon at the beginning -- I would want to be able to look for it anywhere in the string
              "example key": "/* */",
              "other example": "*/"}

for key in user_input:
    
    for regex_term in regex_terms:
        if re.search(regex_term, user_input):
            print(f"Malicious Input: {user_input}")

A couple of points. 1. Regex can handle case-insensitive search. For example try `re.search('INSERT', 'BLAH InSeRT BLAH', re.IGNORECASE)`. 2. Protecting against SQL injection is a solved problem. It's far better to use existing solutions rather than rolling your own. — Steven Rumbalski, Sep 13 '22 at 21:20
Security is too important to fix yourself, use existing solutions that have been tested, validated and are maintained. Unless this is a hobby project and you just want to learn about SQL injection, then have at it. — Swier, Sep 14 '22 at 08:03
Why nobody mentioned that this method of "ensuring that input do not do anything malicious" is **totally flawed?** — Your Common Sense, Sep 14 '22 at 08:08
Does this answer your question? [Protecting against SQL injection in python](https://stackoverflow.com/questions/10950362/protecting-against-sql-injection-in-python) — Your Common Sense, Sep 14 '22 at 08:10

score 2 · Answer 1 · answered Sep 15 '22 at 15:36

Using regular expressions to reject individual values is not the right solution to the problem of SQL injection. Companies such as GreenSQL and Oracle have tried this, and eventually given up.

The reason it's the wrong approach is that to cover all cases of potential SQL injection, you end up rejecting legitimate content.

For example, your own post above contains the string "DROP TABLE". How would Stack Overflow be able to insert this into the database if it used a regular expression to match those words, and rejected the content if it matched?

We've also seen software solutions that rejected any SQL query that used a boolean OR operator, because that operator is seen in some SQL injection attacks. So forget about using OR in any dynamic SQL query.

These sorts of regular expression checks are unnecessary if you use query parameters to separate data values from statements. That is, you write your SQL query with parameter placeholders:

INSERT INTO MyTable (first_name, last_name, email, ...)
VALUES (?, ?, ?, ...)

Prepare the query, which parses the SQL. Then execute the query, and bind values to the placeholders at that time. Binding parameters is not string substitution. The query has already been parsed, so even if the values contain expressions or semicolons or "DROP TABLE", there's no way it can be executed as SQL syntax. It will only be treated as a string value.

Regex to check for semicolon or comment characters to prevent SQL injection

1 Answers1