2

I'm currently trying to update all rows in a table, by setting the value of one column images based on the value of another column source_images in the same row.

for row in Cars.query.all():
    # Generate a list of strings
    images = [s for s in row.source_images if 'example.com/' in s]

    # Column 'images' being updated is a sqlalchemy.dialects.postgresql.ARRAY(Text())
    Cars.query.filter(Cars.id == row.id).update({'images': images}) 

db_session.commit()

Problem: This appears to be really slow, especially when applying to 100k+ rows. Is there a more efficient way of updating the rows?

Similar questions:

#1: This question involves updating all the rows by incrementing the value.

Model Class Definition: cars.py

from sqlalchemy import *
from sqlalchemy.dialects import postgresql
from ..Base import Base

class Car(Base):
    __tablename__ = 'cars'
    id = Column(Integer, primary_key=True)
    images = Column(postgresql.ARRAY(Text))
    source_images = Column(postgresql.ARRAY(Text))
Nyxynyx
  • 61,411
  • 155
  • 482
  • 830

1 Answers1

1

You could take the operation to the DB, instead of fetching and updating each row separately:

from sqlalchemy import select, column, func

source_images = select([column('i')]).\
    select_from(func.unnest(Car.source_images).alias('i')).\
    where(column('i').contains('example.com/'))

source_images = func.array(source_images)

Car.query.update({Car.images: source_images},
                 synchronize_session=False)

The correlated subquery unnests the source images, selects those matching the criteria, and the ARRAY() constructor forms the new image array.

Alternatively you could use array_agg():

source_images = select([func.array_agg(column('i'))]).\
    select_from(func.unnest(Car.source_images).alias('i')).\
    where(column('i').contains('example.com/'))
Ilja Everilä
  • 50,538
  • 7
  • 126
  • 127
  • On this machine, using a local DB, it took ~5s to update 500,000 rows containing random source image arrays that contained "example.com/" with P >= 10%. – Ilja Everilä Oct 23 '19 at 17:52