9

I’m building an app using Python and the clean architecture principles, with TDD.

Some unit tests require executing some raw SQL queries against an in-memory database.

I am trying to switch from sqlite to postgresql inmemory data, using pytest-postgres.

Problem

  • When using sqlite inmemory database, I can both insert and select data.
  • When using Postgresql inmemory database, I can only SELECT (raw INSERT fails).

Insert work in sqlite…

    s_tb_name = "tb_customer"
    ls_cols = ["first_name", "last_name", "email"]
    ls_vals = ['("John", "Doe", "john.doe@mail.net")',
               '("Jane", "Doe", "jane.doe@mail.net")',
               '("Eric", "Dal", "eric.d@home.com")']
    s_cols = ', '.join(ls_cols)
    s_vals = ', '.join(ls_vals)
    session.execute(f"INSERT INTO {s_tb_name} ({s_cols}) VALUES ({s_vals})")

…but fail in Postgres:

E       sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedColumn) column "John" does not exist
E       LINE 1: ..., email) VALUES (("John"....

From this psycopg documentation page, I understand this is due to pyscopg2.
It prevents injecting raw dynamic SQL, and it seems I should add this :

tb_sql_id = sql.Identifier(s_tb_name)
cols_sql_id = sql.SQL(' ,').join(map(sql.Identifier, ls_cols))
vals_sql_id = sql.SQL(' ,').join(map(sql.Literal, ls_vals))
psycopg2_query = sql.SQL(f"INSERT INTO {tb_sql_id} ({cols_sql_id}) VALUES ({vals_sql_id})")

but logically, sqlalchemy refuses to execute the psycopg2_query :

sqlalchemy.exc.ArgumentError: SQL expression object expected, got object of type <class 'psycopg2.sql.SQL'> instead

Question

Is there a way to execute raw dynamic insert queries in Postgres using SQL Alchemy?

Lionel Hamayon
  • 1,240
  • 15
  • 25
  • 1
    Im not aware of the syntax but, you should never use interpolation when doing queries because they open an window for sql injection vulnerability. In psycopg2 you can cursor.execute("select foo from bar where zar = %s and tar = %s", (zar, tar)). sqlalchemy seems to be the same but with session.execute(qry, parmstuple) instead – geckos Sep 17 '20 at 11:05

2 Answers2

5

I am compelled to warn you about SQL injection, but since this is for your tests that should not be a concern.

Two changes are needed:

  1. The values in ls_vals need to be enclosed in single, rather than double quotes
  2. The extra parens after VALUES need to be removed
    s_tb_name = "tb_customer"
    ls_cols = ["first_name", "last_name", "email"]
    ls_vals = ["('John', 'Doe', 'john.doe@mail.net')",
               "('Jane', 'Doe', 'jane.doe@mail.net')",
               "('Eric', 'Dal', 'eric.d@home.com')"]
    s_cols = ', '.join(ls_cols)
    s_vals = ', '.join(ls_vals)
    session.execute(f"INSERT INTO {s_tb_name} ({s_cols}) VALUES {s_vals}")
Mike Organek
  • 11,647
  • 3
  • 11
  • 26
5

As pointed by others, injecting SQL like this is to be avoided in most cases.

Here, the SQL is written in the unit test itself. There is no external input leaking to the SQL injection, which alleviates the security risk.

Mike Organek’s solution did not fully work for me, but it pointed me to the right direction : I just had to also remove the parens from ls_vals.

    s_tb_name = "tb_customer"
    ls_cols = ["first_name", "last_name", "email"]
    ls_vals = ["'John', 'Doe', 'john.doe@mail.net'",
               "'Jane', 'Doe', 'jane.doe@mail.net'",
               "'Eric', 'Dal', 'eric.d@home.com'"]
    s_cols = ', '.join(ls_cols)
    s_vals = '(' + '), ('.join(ls_vals) + ')'
    session.execute(f"INSERT INTO {s_tb_name} ({s_cols}) VALUES {s_vals}")

This made the insert test pass, both when using the sqlite engine and the postgres engine.

Lionel Hamayon
  • 1,240
  • 15
  • 25
  • 3
    Injection is not always intentional. Manually formatting values to SQL is all in all more error prone, compared to just using the library's placeholders. In case of PostgreSQL psycopg2 even offers tools for passing identifiers safely, not to mention SQLAlchemy Core. – Ilja Everilä Sep 17 '20 at 16:24
  • Thank you for your feedback. I failed to mention in my quiestion that I first tried to use SQL Alchemy placeholder, in the form :variable_name, but that gave me the same error as the modified version that I posted. – Lionel Hamayon Sep 17 '20 at 18:55