I'm running a migration which inserts ~160k records into a PostgreSQL database. This is taking between 40-50 seconds to complete. Unfortunately, this breaks the build process in CircleCI and therefore makes every build after it fail. I also have no direct access to the database and am forced to do this in a migration.
Package versions, if relevant:
- Python: 3.7.2
- Flask: 1.1.2,
- Flask-SQLAlchemy: 2.5.1
- psycopg2-binary: 2.8.6
- SQLAlchemy: 1.3.23
I've been reading about the executemany='values'
flag on create_engine
(engine = create_engine("postgresql+psycopg2://scott:tiger@host/dbname", xecutemany_mode='values_plus_batch')
, for example) and I'd like to use it for this migration, since it seems it would help, but if I do so, the process simply hangs (inspecting PostgreSQL's process list hints at the session being locked on the first insert
statement, but I have no idea how to close it):
from alembic import op
from sqlalchemy import create_engine
from sqlalchemy.orm.session import Session
def upgrade():
list_of_objects_to_insert = get_and_process_objects_from_csv_file() # This returns a list of SQLAlchemy ORM objects, ready to insert.
new_engine = create_engine(op.get_bind().engine.url, executemany_mode='values')
session = Session(new_engine)
session.bulk_save_objects(list_of_objects_to_insert)
session.commit()
print('Inserted all records from bulk list') # This never gets printed
Am I going about this the wrong way? Thanks in advance.