1

I'm running a migration which inserts ~160k records into a PostgreSQL database. This is taking between 40-50 seconds to complete. Unfortunately, this breaks the build process in CircleCI and therefore makes every build after it fail. I also have no direct access to the database and am forced to do this in a migration.

Package versions, if relevant:

  • Python: 3.7.2
  • Flask: 1.1.2,
  • Flask-SQLAlchemy: 2.5.1
  • psycopg2-binary: 2.8.6
  • SQLAlchemy: 1.3.23

I've been reading about the executemany='values' flag on create_engine (engine = create_engine("postgresql+psycopg2://scott:tiger@host/dbname", xecutemany_mode='values_plus_batch'), for example) and I'd like to use it for this migration, since it seems it would help, but if I do so, the process simply hangs (inspecting PostgreSQL's process list hints at the session being locked on the first insert statement, but I have no idea how to close it):

from alembic import op
from sqlalchemy import create_engine
from sqlalchemy.orm.session import Session


def upgrade():
    list_of_objects_to_insert = get_and_process_objects_from_csv_file()  # This returns a list of SQLAlchemy ORM objects, ready to insert.
    new_engine = create_engine(op.get_bind().engine.url, executemany_mode='values')
    session = Session(new_engine)
    session.bulk_save_objects(list_of_objects_to_insert)
    session.commit()
    print('Inserted all records from bulk list') # This never gets printed

Am I going about this the wrong way? Thanks in advance.

kenshin23
  • 129
  • 13

0 Answers0