Consistently update multiple rows in a table, isolating from concurrent interference

Question

I've got the task to synchronize two tables living in differents databases. So for every insert, update and delete that happens in the source table, these changes have to be replicated in the destination table. The destination table will be a clone of the source table. To implement this I decided to install triggers in the source table.

But I'm deeply concerned about the concurrent aspects of these updates, as multiple users are using the tables at the same time and sometimes the triggers have to update multiple rows in the destination table. From the point of view of the changes to be performed in the triggers I'm very convinced that the logic is right, but not about the isolation levels because I'm not an expert on this.

So, I'm going to show you the trigger that is reponsible for inserts and updates on the destination table and ask you to see if there's any problem regarding the concurrent aspect. But before that, let me show you the table and some use cases:

This is the source table (for simplicity assume that the destination table has the same structure):

CREATE TABLE SRC_DEPARTMENTS
(
   ID_DEPARTMENT INT NOT NULL PRIMARY KEY,
   NAME VARCHAR(80) NOT NULL,
   ID_PARENT_DEPARTMENT INT,
   HIERARCHY VARCHAR(50) NOT NULL,
   ACTIVE BOOLEAN NOT NULL DEFAULT TRUE,

   FOREIGN KEY (ID_PARENT_DEPARTMENT) REFERENCES SRC_DEPARTMENTS (ID_DEPARTMENT) ON DELETE CASCADE
);

Now suppose that I have these rows in the destination table:

ID_DEPARTMENT | ID_PARENT_DEPARTMENT | HIERARCHY
--------------+----------------------+----------
            1 |                      | 1
            2 |                    1 | 1.2
            3 |                    2 | 1.2.3
            4 |                    3 | 1.2.3.4
            5 |                      | 5
            6 |                    5 | 5.6

and I want to change the parent department of id 6 to point to id 4. After the change the rows should be:

ID_DEPARTMENT | ID_PARENT_DEPARTMENT | HIERARCHY
--------------+----------------------+----------
            1 |                      | 1
            2 |                    1 | 1.2
            3 |                    2 | 1.2.3
            4 |                    3 | 1.2.3.4
            6 |                    4 | 1.2.3.4.6
            5 |                      | 5

So, as you can see, only one row was affected by the update. Now suppose that I want change the parent id of id 1 (that is NULL) to point to id 6 (in the original set of rows). So, after the change you will have:

ID_DEPARTMENT | ID_PARENT_DEPARTMENT | HIERARCHY
--------------+----------------------+----------
            5 |                      | 5
            6 |                    5 | 5.6
            1 |                    6 | 5.6.1
            2 |                    1 | 5.6.1.2
            3 |                    2 | 5.6.1.2.3
            4 |                    3 | 5.6.1.2.3.4

So, in this case, I had to update multiple rows to correct the hierarchy.

So, I want change to multiple rows to be executed consistently and I suppose my trigger is not taking this into account. This is the trigger:

CREATE OR REPLACE FUNCTION insert_update_department() RETURNS trigger AS $$
DECLARE
    _id_parent_department INT;
    _id_parent_department_changed BOOLEAN := FALSE;
    _hierarchy VARCHAR(50);
    _current_hierarchy VARCHAR(50);
BEGIN
    IF TG_OP = 'UPDATE' AND (
        NEW.NAME IS NOT DISTINCT FROM OLD.NAME AND
        NEW.ID_PARENT_DEPARTMENT IS NOT DISTINCT FROM OLD.ID_PARENT_DEPARTMENT AND
        NEW.ACTIVE IS NOT DISTINCT FROM OLD.ACTIVE) THEN
        RETURN NULL;
    END IF;

    IF TG_OP = 'INSERT' OR NEW.ID_PARENT_DEPARTMENT IS DISTINCT FROM OLD.ID_PARENT_DEPARTMENT THEN
        IF NEW.ID_PARENT_DEPARTMENT IS NULL OR NEW.ID_PARENT_DEPARTMENT = NEW.ID_PARENT_DEPARTMENT THEN
            _id_parent_department := NULL;
        ELSE
            _id_parent_department := NEW.ID_PARENT_DEPARTMENT;
        END IF;

        IF _id_parent_department IS NULL THEN
            _hierarchy := '';
        ELSE
            SELECT HIERARCHY || '.'
            INTO _hierarchy
            FROM DST_DEPARTMENTS
            WHERE ID_DEPARTMENT = _id_parent_department;
        END IF;
        _hierarchy := _hierarchy || cast(NEW.ID_DEPARTMENT AS TEXT);

        IF TG_OP = 'UPDATE' THEN
            SELECT HIERARCHY || '.'
            INTO _current_hierarchy
            FROM DST_DEPARTMENTS
            WHERE ID_DEPARTMENT = NEW.ID_DEPARTMENT;

            UPDATE DST_DEPARTMENTS SET
                HIERARCHY = _hierarchy || '.' || substr(HIERARCHY, length(_current_hierarchy) + 1)
            WHERE HIERARCHY LIKE _current_hierarchy || '%';
        END IF;

        _id_parent_department_changed := TRUE;
    END IF;

    IF TG_OP = 'INSERT' THEN
        INSERT INTO DST_DEPARTMENTS VALUES (
            NEW.ID_DEPARTMENT,
            _name,
            _id_parent_department,
            _hierarchy,
            NEW.ACTIVE
        );
    ELSE
        UPDATE DST_DEPARTMENTS SET
            NAME = _name,
            ID_PARENT_DEPARTMENT = CASE WHEN _id_parent_department_changed THEN _id_parent_department ELSE ID_PARENT_DEPARTMENT END,
            HIERARCHY = CASE WHEN _id_parent_department_changed THEN _hierarchy ELSE HIERARCHY END,
            ACTIVE = NEW.ACTIVE
        WHERE ID_DEPARTMENT = NEW.ID_DEPARTMENT;
    END IF;

    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER z_insert_update_department
    AFTER INSERT OR UPDATE ON SRC_DEPARTMENTS
    FOR EACH ROW
    EXECUTE PROCEDURE insert_update_department();

Maybe changing these lines from this:

SELECT HIERARCHY || '.'
INTO _current_hierarchy
FROM DST_DEPARTMENTS
WHERE ID_DEPARTMENT = NEW.ID_DEPARTMENT;

to this:

SELECT HIERARCHY || '.'
INTO _current_hierarchy
FROM DST_DEPARTMENTS
WHERE ID_DEPARTMENT = NEW.ID_DEPARTMENT
FOR UPDATE;

will solve the problem for the current row but not for the other rows that need to be updated.

I'll very glad if someone tells me what's the right thing to do to correct the trigger to work correctly concurrently.

Thank you in advance.

Marcos

`two tables living in differents databases.` Different databases being part of **one** transaction? You'll at least need two phase commit (which actually are two synchronised transactions) — joop, Mar 24 '15 at 10:09
@joop Using _postgres_fdw_ [link] (http://www.postgresql.org/docs/9.3/static/postgres-fdw.html) you can make two tables living in different databases share the same transaction. That's what I'm using. The trigger is already running under a transaction. — Marcos, Mar 24 '15 at 12:45
@joop More specifically see the section _F.31.3. Transaction Management_ in (http://www.postgresql.org/docs/9.3/static/postgres-fdw.html). — Marcos, Mar 24 '15 at 13:18

score 0 · Answer 1 · answered Mar 24 '15 at 09:36

0

Maybe using transactions may be helpful... as far as I know, they block all modifications on the database until all actions are completed, so they commit the changes at once avoiding inconsistent updates. See transactions, it could be what you're looking for

answered Mar 24 '15 at 09:36

Dani

161
4

The trigger is already running under a transaction started by the application code, but transactions won't isolate you from problems that arise from concurrent issues, like _dirty read_, _nonrepeatable read_, etc. (http://www.postgresql.org/docs/9.3/static/transaction-iso.html). You have to code your queries and updates properly to avoid these issues. That's what I'm seeking here. If someone can point any of these flaws in the trigger above I would be very grateful. – Marcos Mar 24 '15 at 12:50

Consistently update multiple rows in a table, isolating from concurrent interference

1 Answers1