1

The basic problem is that we are managing a significant number of generated rows, and it is mission critical that this data is generated exactly once, and only if necessary. Suppose you have a data relation:

CREATE TABLE sometable (
    id SERIAL,
    refID INTEGER,
    ...
);

Now, in some PL/PGSQL function we have:

...
-- Advisory locks didn't help here? :(
IF NOT EXISTS (SELECT 1 FROM sometable WHERE refID = dataID) THEN
    -- Generate fixed number of new rows in sometable that reference dataID.
END IF;
...

In short, rows that must not be generated more than once sometimes are. As noted, advisory locks of the form PERFORM pg_advisory_lock(dataID) sadly did not prevent this. Is there any hope?

EDIT: Forgot to mention that I ran into the duplicate data issue when testing with pgbench.

EDIT 2: Fixed incorrect code, clarified the issue.

gdoug
  • you will get a better response if you post the question on http://dba.stackexchange.com – Nandakumar V May 30 '15 at 05:40
  • What exactly is your problem? Can you explain what triggers the function call, why you need a lock and what it is that the lock did not help with? – Patrick May 30 '15 at 06:37
  • Also: `PERFORM * FROM sometable WHERE refID = dataID; IF found THEN ... END IF;` – Patrick May 30 '15 at 06:38
  • Thanks for note Patrick, I've updated it to more clearly state the issue. Basically the rows that should (and must) only be generated once is being generated multiple times. – gdoug May 30 '15 at 06:45

1 Answer

1

Perhaps the simplest solution is to keep a separate processed_ids table with a unique constraint on the ID in question. Your function can attempt to insert the ID into that table; if that insert raises a unique-violation exception, the ID has already been processed.
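A minimal sketch of that pattern, using the question's sometable and dataID (the processed_ids table and column names here are illustrative, not from the original post):

```sql
CREATE TABLE processed_ids (
    refID INTEGER PRIMARY KEY  -- the unique constraint is what serializes the race
);

-- Inside the PL/pgSQL function body:
BEGIN
    INSERT INTO processed_ids (refID) VALUES (dataID);
    -- Only one concurrent caller gets here for a given dataID:
    -- generate the fixed number of new rows in sometable for dataID.
EXCEPTION WHEN unique_violation THEN
    -- Another session already claimed dataID; skip generation.
    NULL;
END;
```

The insert acts as an atomic "claim" on the ID, so the check-then-generate race in the original code disappears.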

Richard Huxton
  • Yup, that worked for me. I also had to add a while loop in the exception-catching code that waits for the correct number of rows to be generated by the function call that _does_ manage to insert into the `processed_ids` table. That might not work for others with a similar issue if it's not clear how many rows should be generated. But if you do know, then you are golden! – gdoug Jun 01 '15 at 22:26
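The wait loop the commenter describes might look like the following sketch, where expected_rows (the known number of rows the generator produces) is an assumed parameter not present in the original post:

```sql
EXCEPTION WHEN unique_violation THEN
    -- Another call is generating the rows; poll until they all appear.
    WHILE (SELECT count(*) FROM sometable WHERE refID = dataID) < expected_rows LOOP
        PERFORM pg_sleep(0.05);  -- brief pause before re-checking
    END LOOP;
```

Note this only works when the row count per ID is fixed and known in advance, as the comment itself cautions.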