
I am using the Python code below to update a Postgres DB column value based on id. The loop has to run for thousands of records and it is taking a long time.

Is there a way where I can pass array of dataframe values instead of looping each row?

    # One UPDATE statement and one commit per row
    for i in range(len(df)):
        QUERY = """UPDATE "Table" SET "value" = '%s' WHERE "Table"."id" = '%s'""" % (
            df['value'][i], df['id'][i])
        cur.execute(QUERY)
        conn.commit()

1 Answer


It depends on the library you use to communicate with PostgreSQL, but bulk inserts are usually much faster via the COPY FROM command.

If you use psycopg2, it is as simple as the following:

cursor.copy_from(io.StringIO(string_variable), "destination_table", columns=('id', 'value'))

where string_variable is a tab- and newline-delimited dataset such as 1\tvalue1\n2\tvalue2\n.
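With the DataFrame from the question, that buffer can be built directly with pandas. A minimal sketch, assuming the DataFrame has id and value columns and that the values contain no tabs or newlines:

    import io

    # Serialize the two columns as tab-separated rows, one line per record
    buffer = io.StringIO()
    df[['id', 'value']].to_csv(buffer, sep='\t', header=False, index=False)
    buffer.seek(0)  # rewind so copy_from reads from the beginning

    cur.copy_from(buffer, 'destination_table', columns=('id', 'value'))
    conn.commit()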

To achieve a performant bulk update, I would:

  1. Create a temporary table with the columns to be updated: CREATE TEMPORARY TABLE tmp_table (id integer, value text); (match the column types to the destination table);

  2. Insert records with copy_from;

  3. Update the destination table in a single statement: UPDATE destination_table SET value = t.value FROM tmp_table t WHERE destination_table.id = t.id, or any other preferred syntax, as sketched below.
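Putting the three steps together for the "Table" in the question, a sketch could look like this (not tested against your schema; the temporary table name and the column types are assumptions, so adjust them to the real destination table):

    import io

    # 1. Temporary table holding the new values; ON COMMIT DROP cleans it up
    #    automatically when the transaction commits (column types are assumed)
    cur.execute("""
        CREATE TEMPORARY TABLE tmp_table (
            id    integer,
            value text
        ) ON COMMIT DROP
    """)

    # 2. Bulk-load the DataFrame into the temporary table via COPY
    buffer = io.StringIO()
    df[['id', 'value']].to_csv(buffer, sep='\t', header=False, index=False)
    buffer.seek(0)
    cur.copy_from(buffer, 'tmp_table', columns=('id', 'value'))

    # 3. One set-based UPDATE joining the destination table to the temp table
    cur.execute("""
        UPDATE "Table" AS d
        SET "value" = t.value
        FROM tmp_table t
        WHERE d."id" = t.id
    """)

    conn.commit()

This replaces thousands of round trips (one execute and commit per row) with a single COPY and a single UPDATE, which is where the speed-up comes from.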
