18

I want to change a primary key and all table rows which reference to this value.

# table master
master_id|name
===============
foo|bar

# table detail
detail_id|master_id|name
========================
1234|foo|blu

If I give a script or function

 table=master, value-old=foo, value-new=abc

I want to create a SQL snippet that executes updates on all tables which refere to table "master":

update detail set master_id=value-new where master_id=value-new;
.....

With the help of introspection, this should be possible.

I use postgres.

Update

The problem is, that there are many tables which have a foreign-key to the table "master". I want a way to automatically update all tables which have a foreign-key to master table.

guettli
  • 25,042
  • 81
  • 346
  • 663
  • 1
    Why do you want to do this? As a general principle the value of an `id` should "mean" nothing at all. It should simply be a unique identifier. Needing to alter a unique identifier after it has been allocated is an indication that you're working in a pattern contrary to general SQL / Relational Database practices *(a code-smell)*. Understanding the underlying ***functional*** requirement may uncover a more appropriate ***technical*** solution. – MatBailie Mar 14 '18 at 17:05

4 Answers4

25

The easiest way to deal with primary key changes - by far - is to ALTER your referring foreign key constraints to be ON UPDATE CASCADE.

You are then free to update the primary key values, and the changes will cascade to child tables. It can be a very slow process due to all the random I/O, but it will work.

You do need to watch out not to violate uniqueness constraints on the primary key column during the process.

A fiddlier but faster way is to add a new UNIQUE column for the new PK, populate it, add new columns to all the referring tables that point to the new PK, drop the old FK constraints and columns, then finally drop the old PK.

Craig Ringer
  • 307,061
  • 76
  • 688
  • 778
  • No, sorry. this would work. But I can't alter the table schema (in my environment). I can only alter the table data .... – guettli Aug 16 '13 at 11:09
  • 6
    @guettli Well, that's the kind of thing you want to mention to start with. Those rules are also, frankly, stupid; they draw an artificial and foolish distinction between "schema" and "data" where in fact it's blurrier than that. Are values in a static look-up table "schema" or "data"? Nonetheless, if that's the restriction you're stuck with, um, good luck, and get friendly with the `information_schema` documentation. – Craig Ringer Aug 16 '13 at 11:14
  • 1
    This answer contains text for a human. I am searching for something which can get executed by a computer. – guettli Mar 13 '18 at 12:02
13

If you need to change PK you could use DEFFERED CONSTRAINTS:

SET CONSTRAINTS sets the behavior of constraint checking within the current transaction. IMMEDIATE constraints are checked at the end of each statement. DEFERRED constraints are not checked until transaction commit. Each constraint has its own IMMEDIATE or DEFERRED mode.

Data preparation:

CREATE TABLE master(master_id VARCHAR(10) PRIMARY KEY, name VARCHAR(10));
INSERT INTO master(master_id, name) VALUES ('foo', 'bar');

CREATE TABLE detail(detail_id INT PRIMARY KEY, master_id VARCHAR(10)
   ,name VARCHAR(10)
   ,CONSTRAINT  fk_det_mas FOREIGN KEY (master_id) REFERENCES master(master_id));

INSERT INTO detail(detail_id, master_id, name) VALUES (1234,'foo','blu');

In normal situtation if you try to change master detail you will end up with error:

update detail set master_id='foo2' where master_id='foo';
-- ERROR:  insert or update on table "detail" violates foreign key 
-- constraint "fk_det_mas"
-- DETAIL:  Key (master_id)=(foo2) is not present in table "master"

update master set master_id='foo2' where master_id='foo';
-- ERROR:  update or delete on table "master" violates foreign key
-- constraint "fk_det_mas" on table "detail"
-- DETAIL:  Key (master_id)=(foo) is still referenced from table "detail".

But if you change FK resolution to deffered, there is no problem:

ALTER TABLE detail DROP CONSTRAINT fk_det_mas ;
ALTER TABLE detail ADD CONSTRAINT fk_det_mas FOREIGN KEY (master_id) 
REFERENCES master(master_id) DEFERRABLE;

BEGIN TRANSACTION;
SET CONSTRAINTS ALL DEFERRED;
UPDATE master set master_id='foo2' where master_id = 'foo';
UPDATE detail set master_id='foo2' where master_id = 'foo';
COMMIT;

DBFiddle Demo

Please note that you could do many things inside transaction, but during COMMIT all referential integrity checks have to hold.

EDIT

If you want to automate this process you could use dynamic SQL and metadata tables. Here Proof of Concept for one FK column:

CREATE TABLE master(master_id VARCHAR(10) PRIMARY KEY, name VARCHAR(10));
INSERT INTO master(master_id, name)
VALUES ('foo', 'bar');

CREATE TABLE detail(detail_id INT PRIMARY KEY, master_id VARCHAR(10),
   name VARCHAR(10)
  ,CONSTRAINT  fk_det_mas FOREIGN KEY (master_id) 
   REFERENCES master(master_id)DEFERRABLE ) ;
INSERT INTO detail(detail_id, master_id, name) VALUES (1234,'foo','blu');

CREATE TABLE detail_second(detail_id INT PRIMARY KEY, name VARCHAR(10),
   master_id_second_name VARCHAR(10)
  ,CONSTRAINT  fk_det_mas_2 FOREIGN KEY (master_id_second_name) 
   REFERENCES master(master_id)DEFERRABLE ) ;
INSERT INTO detail_second(detail_id, master_id_second_name, name) 
VALUES (1234,'foo','blu');

And code:

BEGIN TRANSACTION;
SET CONSTRAINTS ALL DEFERRED;
DO $$
DECLARE
   old_pk TEXT = 'foo';
   new_pk TEXT = 'foo2';
   table_name TEXT = 'master';
BEGIN
-- update childs
EXECUTE (select 
         string_agg(FORMAT('UPDATE %s SET %s = ''%s'' WHERE %s =''%s'' ;'
            ,c.relname,pa.attname, new_pk,pa.attname, old_pk),CHR(13)) AS sql
         from  pg_constraint pc
         join pg_class c on pc.conrelid = c.oid
         join pg_attribute pa ON pc.conkey[1] = pa.attnum 
          and pa.attrelid = pc.conrelid
         join pg_attribute pa2 ON pc.confkey[1] = pa2.attnum 
          and pa2.attrelid = table_name::regclass
         where pc.contype = 'f');

-- update parent        
EXECUTE ( SELECT FORMAT('UPDATE %s SET %s = ''%s'' WHERE %s =''%s'';'
         ,c.relname,pa.attname, new_pk,pa.attname, old_pk)
 FROM pg_constraint pc
 join pg_class c on pc.conrelid = c.oid
 join pg_attribute pa ON pc.conkey[1] = pa.attnum 
  and pa.attrelid = pc.conrelid
 WHERE pc.contype IN ('p','u')
   AND conrelid = table_name::regclass
);       
         
END
$$;
COMMIT;

DBFiddle Demo 2

EDIT 2:

I tried it, but it does not work. It would be nice, if the script could show the SQL. This is enough. After looking at the generated SQL I can execute it if psql -f

have you tried it? It did not work for me.

Yes, I have tried it. Just check above live demo links. I prepare the same demo with more debug info:

  • values before
  • executed SQL
  • values after

Please make sure that FKs are defined as DEFFERED.

DBFiddle 2 with debug info

LAST EDIT

Then I wanted to see the sql instead of executing it. I removed "perform" from your fiddle, but then I get an error. See: http://dbfiddle.uk/?rdbms=postgres_10&fiddle=b9431c8608e54b4c42b5dbd145aa1458

If you only want to get SQL code you could create function:

CREATE FUNCTION generate_update_sql(table_name VARCHAR(100), old_pk VARCHAR(100), new_pk VARCHAR(100))
RETURNS TEXT 
AS 
$$
BEGIN
RETURN 
-- update childs
(SELECT 
         string_agg(FORMAT('UPDATE %s SET %s = ''%s'' WHERE %s =''%s'' ;',  c.relname,pa.attname, new_pk,pa.attname, old_pk),CHR(13)) AS sql
         FROM  pg_constraint pc
         JOIN pg_class c on pc.conrelid = c.oid
         JOIN pg_attribute pa ON pc.conkey[1] = pa.attnum and pa.attrelid = pc.conrelid
         JOIN pg_attribute pa2 ON pc.confkey[1] = pa2.attnum and pa2.attrelid = table_name::regclass
         WHERE pc.contype = 'f') || CHR(13) ||
-- update parent        
(SELECT FORMAT('UPDATE %s SET %s = ''%s'' WHERE %s =''%s'';',  c.relname,pa.attname, new_pk,pa.attname, old_pk)
 FROM pg_constraint pc
 JOIN pg_class c on pc.conrelid = c.oid
 JOIN pg_attribute pa ON pc.conkey[1] = pa.attnum and pa.attrelid = pc.conrelid
 WHERE pc.contype IN ('p','u')
   AND conrelid = table_name::regclass)
;       
END
$$ LANGUAGE  plpgsql;

And execution:

SELECT generate_update_sql('master', 'foo', 'foo');

UPDATE detail SET master_id = 'foo' WHERE master_id ='foo' ;
UPDATE detail_second SET master_id_second_name = 'foo' 
 WHERE master_id_second_name ='foo' ; 
UPDATE master SET master_id = 'foo' WHERE master_id ='foo';

DBFiddle Function Demo

Of course there is a place for improvement for example handling identifiers like "table with space in name" and so on.

Community
  • 1
  • 1
Lukasz Szozda
  • 162,964
  • 23
  • 234
  • 275
  • I think your answer is correct, but it does not match to the question. I updated the question to underline what I want. – guettli Mar 14 '18 at 09:05
  • @guettli Ok, I see your point. I updated my answer. This is Proof of Concept but I hope you will like the idea. You could enclose it with stored procedure/function and pass pk_value as parameter instead of hardcoded values. – Lukasz Szozda Mar 14 '18 at 16:19
  • I tried it, but it does not work. It would be nice, if the script could show the SQL. This is enough. After looking at the generated SQL I can execute it if `psql -f`. – guettli Mar 15 '18 at 07:56
  • have you tried it? It did not work for me. I get: `ERROR: value too long for type character varying(3)`. And it looks like there are inserts into wrong columns. – guettli Mar 15 '18 at 08:01
  • @guettli `"have you tried it?"` Have you checked my demo link [DBFiddle Demo 2](http://dbfiddle.uk/?rdbms=postgres_10&fiddle=43dc445f7f9f6cbfd8408a2378f72332)? As I said it is PoC but it is working, otherwise I wouldn't posted it. I suggest to start playing with code. – Lukasz Szozda Mar 15 '18 at 15:13
  • Of course I tried it. I tried it on my schema and it failed. Then I wanted to see the sql instead of executing it. I removed "perform" from your fiddle, but then I get an error. See: http://dbfiddle.uk/?rdbms=postgres_10&fiddle=b9431c8608e54b4c42b5dbd145aa1458 – guettli Mar 16 '18 at 09:35
  • I give you the bounty, because time has elapsed. If I could see the sql (instead of executing it), then I can debug why it does not work on my schema. – guettli Mar 16 '18 at 09:37
  • @LukaszSzozda can we call generate_update_sql in an sql update request ? – m3asmi Jun 07 '20 at 16:01
  • 1
    @m3asmi I am not sure if I get your question. The function generate a SQL query to be executed. You could use the second approach with `EXECUTE (...)` – Lukasz Szozda Jun 08 '20 at 19:19
3

I found a dirty solution: in psql the command \d master_table show the relevant information. With some text magic, it is possible to extract the needed information:

echo "UPDATE master_table SET id='NEW' WHERE id='OLD';" > tmp/foreign-keys.txt

psql -c '\d master_table' | grep -P 'TABLE.*CONSTRAINT.*FOREIGN KEY'  \
                                 >> tmp/foreign-keys.txt

reprec '.*TABLE ("[^"]*") CONSTRAINT[^(]*\(([^)]*)\).*' \
        "UPDATE \1 set \2='NEW' WHERE \2='OLD';" \
         tmp/foreign-keys.txt 

psql -1 -f tmp/foreign-keys.txt 

Result:

UPDATE "master_table" SET id='NEW' WHERE id='OLD';
UPDATE "other_table" SET master_id='NEW' WHERE master_id='OLD';
...

But better solutions are welcome.

guettli
  • 25,042
  • 81
  • 346
  • 663
  • You can get foreign key from `pg_constraint`, [`link`](http://www.postgresql.org/docs/current/static/catalog-pg-constraint.html) and then create some dynamic SQL. But it is still a dirty hack. – Ihor Romanchenko Aug 16 '13 at 12:57
2

I dont think you can update the Primary key. One possible work around is that you can remove the primary key constraint from the table column. Then update the column value.

Updating the primary key can lead you to some serious problems. But if you still want to do it.

Please refer this Thread.(kevchadders has given a solution.)

Community
  • 1
  • 1
Rahul Tripathi
  • 168,305
  • 31
  • 280
  • 331
  • I can update the primary key. I need to update the master table and all tables with a foreign key in one transaction. This works. But I want to automate this. I want to be sure that, I don't miss a table. – guettli Aug 16 '13 at 11:07
  • 4
    @guettli If you miss a table the update fails with an integrity error, rolling back the transaction, so you try again. Where's the problem? If you're doing this routinely then your database design is *insane* - the FKs should be `ON UPDATE CASCADE` if the PK can change, and if the PK changes routinely then your DB design probably uses the wrong thing as a PK. – Craig Ringer Aug 16 '13 at 11:15
  • My development environment does not have all tables filled like the production server. The migration could work in dev, but could fail in prod.... – guettli Aug 16 '13 at 11:50
  • 2
    @guettli If your dev and prod are different and something can "work in dev, but could fail in prod" - you are doing thing very, **very** wrong. – Ihor Romanchenko Aug 16 '13 at 12:54
  • @IgorRomanchenko Most dev systems only have a subset of the data of the prod system. That's why I don't want to guess all tables which refer to the master table. – guettli Aug 20 '13 at 12:26