1

I'm trying to copy a couple of tables from one database ("db1", PostgreSQL) to another ("db2", SQL Server). Unfortunately, I face an issue due to the BOOLEAN type for some fields in the PostgreSQL database which is not recognized as a valid type for SQL Server.

Here is my code sample:

db2_engine = "postgresql+psycopg2://" + str(db2_user) + ":" + str(db2_password) + "@" + str(db2_host) + ":" + str(db2_port) + "/" + str(db2_database)
db2 = sqlalchemy.create_engine(db2_engine)

lst_tablename_totr = ["contract",
                     "subscription",
                     "contractdelivery",
                     "businesspartner"
                     ]

for table_name in lst_tablename_totr:
    table = Table(table_name, metadata, autoload=True, autoload_with=db2)
    table.create(bind=db1)
    query = """
    SELECT
        *
    FROM """ + str(table_name) + """
    """
    df_hg = pd.read_sql(query, db2_engine)
    df_hg.to_sql(table_name, db1, schema='dbo', index=False, if_exists='append')

For now, the issue is located to the table = Table(table_name, metadata, autoload=True, autoload_with=db_hgzl) table.create(bind=db1) part of the code.

Here is the error message:

ProgrammingError: (pyodbc.ProgrammingError) ('42000', '[42000] [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Column, parameter or variable #8\xa0: data type BOOLEAN not found. (2715) (SQLExecDirectW)')

I couldn't find any way to force the conversion between PostgreSQL Boolean type and SQL Server Bit type.

Mark Rotteveel
  • 100,966
  • 191
  • 140
  • 197
Nugg
  • 11
  • 4
  • I am somewhat confused as you say db2 is MSQL yet: `db2_engine = "postgresql+psycopg2://"` – Adrian Klaver Aug 06 '21 at 15:47
  • If you use the Postgres [ODBC](https://odbc.postgresql.org/) there is a config setting for this [Config](https://odbc.postgresql.org/docs/config.html). If you want to use `psycopg2` directly you can create and register a [type caster](https://www.psycopg.org/docs/advanced.html#type-casting-of-sql-types-into-python-objects) to cast to `int`. – Adrian Klaver Aug 06 '21 at 16:27

1 Answers1

1

You are seeing a difference between SQLAlchemy's dialect-specific BOOLEAN type and its generic Boolean type. For an existing PostgreSQL table

CREATE TABLE IF NOT EXISTS public.so68683260
(
    id character varying(5) COLLATE pg_catalog."default" NOT NULL,
    bool_col boolean NOT NULL,
    CONSTRAINT so68683260_pkey PRIMARY KEY (id)
)

if we reflect the table then the boolean columns are defined as BOOLEAN

tbl = sa.Table(table_name, sa.MetaData(), autoload_with=pg_engine)
print(type(tbl.columns["bool_col"].type))
# <class 'sqlalchemy.sql.sqltypes.BOOLEAN'>

and then if we try to create the table in SQL Server we end up doing the equivalent of

tbl = sa.Table(
    table_name,
    sa.MetaData(),
    sa.Column("id", sa.VARCHAR(5), primary_key=True),
    sa.Column("bool_col", sa.BOOLEAN, nullable=False),
)

tbl.drop(ms_engine, checkfirst=True)
tbl.create(ms_engine)

and that fails with the error you cite because the DDL rendered is

CREATE TABLE so68683260 (
    id VARCHAR(5) NOT NULL, 
    bool_col BOOLEAN NOT NULL, 
    PRIMARY KEY (id)
)

However, if we use the generic Boolean type

tbl = sa.Table(
    table_name,
    sa.MetaData(),
    sa.Column("id", sa.VARCHAR(5), primary_key=True),
    sa.Column("bool_col", sa.Boolean, nullable=False),
)

tbl.drop(ms_engine, checkfirst=True)
tbl.create(ms_engine)

we are successful because the DDL rendered is

CREATE TABLE so68683260 (
    id VARCHAR(5) NOT NULL, 
    bool_col BIT NOT NULL, 
    PRIMARY KEY (id)
)

and BIT is the valid corresponding column type in T-SQL.

Feel free to open a SQLAlchemy issue if you believe that this behaviour should be changed.

[Note also that the text column is VARCHAR(5) because the table uses the default encoding for my PostgreSQL test database (UTF8), but creating the table in SQL Server will create a VARCHAR (non-Unicode) column instead of a NVARCHAR (Unicode) column.]

Gord Thompson
  • 116,920
  • 32
  • 215
  • 418
  • I was with you until you got to the end and varchar. Postgres varchar(or other string types) is not always Unicode, see here [Character Sets](https://www.postgresql.org/docs/current/multibyte.html): "The character set support in PostgreSQL allows you to store text in a variety of character sets (also called encodings), including single-byte character sets such as the ISO 8859 series and multiple-byte character sets such as EUC (Extended Unix Code), UTF-8, and Mule internal code." Even if that where true I'm not understanding why it would be `VARCHAR` instead of `NVARCHAR`? – Adrian Klaver Aug 07 '21 at 16:24
  • @AdrianKlaver - Thanks for the correction. Unlike SQL Server, PostgreSQL does not have separate VARCHAR/NVARCHAR types and the column is reflected as VARCHAR. When SQL Server sees `VARCHAR` it creates a non-Unicode column. I suspect that the converse would be true: a SQL Server table could have a column reflected as NVARCHAR and that probably would cause PostgreSQL to throw an error like 'type "nvarchar" does not exist'. – Gord Thompson Aug 07 '21 at 17:12