4

I've the following table:

DROP SEQUENCE IF EXISTS CATEGORY_SEQ CASCADE;
CREATE SEQUENCE CATEGORY_SEQ START 1;

DROP TABLE IF EXISTS CATEGORY CASCADE;

CREATE TABLE CATEGORY (
  ID        BIGINT                 NOT NULL DEFAULT nextval('CATEGORY_SEQ'),
  NAME      CHARACTER VARYING(255) NOT NULL,
  PARENT_ID BIGINT
);

ALTER TABLE CATEGORY
  ADD CONSTRAINT CATEGORY_PK PRIMARY KEY (ID);
ALTER TABLE CATEGORY
  ADD CONSTRAINT CATEGORY_SELF_FK FOREIGN KEY (PARENT_ID) REFERENCES CATEGORY (ID);

Now I need to insert the data. So I start with parent:

INSERT INTO CATEGORY (NAME) VALUES ('PARENT_1');

And now I need the ID of the just inserted parent to add children to it:

INSERT INTO CATEGORY (NAME, PARENT_ID) VALUES ('CHILDREN_1_1', <what_goes_here>);
INSERT INTO CATEGORY (NAME, PARENT_ID) VALUES ('CHILDREN_1_2', <what_goes_here>);

How can I get and store the ID of the parent to later use it in the subsequent inserts?

Opal
  • 81,889
  • 28
  • 189
  • 210
  • @a_horse_with_no_name, it won't work this way. For both children I need the same value of `PARENT_ID`, both functions will give me the latest. – Opal Aug 23 '16 at 10:59

3 Answers3

5

You can use a data modifying CTE with the returning clause:

with parent_cat (parent_id) as (
   INSERT INTO CATEGORY (NAME) VALUES ('PARENT_1')
   returning id
)
INSERT INTO CATEGORY (NAME, PARENT_ID) 
VALUES 
  ('CHILDREN_1_1', (select parent_id from parent_cat) ), 
  ('CHILDREN_1_2', (select parent_id from parent_cat) );
3

The answer is to use RETURNING along with WITH

WITH inserted AS (
  INSERT INTO CATEGORY (NAME) VALUES ('PARENT_1')
  RETURNING id
) INSERT INTO CATEGORY (NAME, PARENT_ID) VALUES
  ('CHILD_1_1', (SELECT inserted.id FROM inserted)),
  ('CHILD_2_1', (SELECT inserted.id FROM inserted));
Opal
  • 81,889
  • 28
  • 189
  • 210
-1

( tl;dr : goto option 3: INSERT with RETURNING )

Recall that in postgresql there is no "id" concept for tables, just sequences (which are typically but not necessarily used as default values for surrogate primary keys, with the SERIAL pseudo-type).

If you are interested in getting the id of a newly inserted row, there are several ways:


Option 1: CURRVAL(<sequence name>);.

For example:

  INSERT INTO persons (lastname,firstname) VALUES ('Smith', 'John');
  SELECT currval('persons_id_seq');

The name of the sequence must be known, it's really arbitrary; in this example we assume that the table persons has an id column created with the SERIAL pseudo-type. To avoid relying on this and to feel more clean, you can use instead pg_get_serial_sequence:

  INSERT INTO persons (lastname,firstname) VALUES ('Smith', 'John');
  SELECT currval(pg_get_serial_sequence('persons','id'));

Caveat: currval() only works after an INSERT (which has executed nextval() ), in the same session.


Option 2: LASTVAL();

This is similar to the previous, only that you don't need to specify the sequence number: it looks for the most recent modified sequence (always inside your session, same caveat as above).


Both CURRVAL and LASTVAL are totally concurrent safe. The behaviour of sequence in PG is designed so that different session will not interfere, so there is no risk of race conditions (if another session inserts another row between my INSERT and my SELECT, I still get my correct value).

However they do have a subtle potential problem. If the database has some TRIGGER (or RULE) that, on insertion into persons table, makes some extra insertions in other tables... then LASTVAL will probably give us the wrong value. The problem can even happen with CURRVAL, if the extra insertions are done intto the same persons table (this is much less usual, but the risk still exists).


Option 3: INSERT with RETURNING

INSERT INTO persons (lastname,firstname) VALUES ('Smith', 'John') RETURNING id;

This is the most clean, efficient and safe way to get the id. It doesn't have any of the risks of the previous.

Drawbacks? Almost none: you might need to modify the way you call your INSERT statement (in the worst case, perhaps your API or DB layer does not expect an INSERT to return a value); it's not standard SQL (who cares); it's available since Postgresql 8.2 (Dec 2006...)


Conclusion: If you can, go for option 3. Elsewhere, prefer 1.

Note: all these methods are useless if you intend to get the last globally inserted id (not necessarily in your session). For this, you must resort to select max(id) from table (of course, this will not read uncommitted inserts from other transactions).

Ahmed Jehanzaib
  • 158
  • 1
  • 10
  • Option 3 can't be used in a plain SQL script –  Aug 23 '16 at 10:51
  • How can I store ID returned in option 3 and use it in subsequent inserts? – Opal Aug 23 '16 at 10:55
  • It also seems that both `currval` and `lastval` won't work. The value of the sequence will be modified after the second first child is inserted hence the value of parent id will be lost. – Opal Aug 23 '16 at 10:58
  • This will work for you. Get an id of parent then add it in children insertion: INSERT INTO CATEGORY (NAME) VALUES ('PARENT_1') returning id; – Ahmed Jehanzaib Aug 23 '16 at 11:07
  • @AhmedJehanzaib, yes I know that I need to grab the ID of the parent. The question is how? – Opal Aug 23 '16 at 11:17
  • @Opal Are you scripting it in python. Actually i do not know the exact scenario. If you are using python then after executing the query there is a command to get returned values cursor.fetchone(). I think it will help you. – Ahmed Jehanzaib Aug 23 '16 at 11:26
  • @AhmedJehanzaib, I script in plain SQL script. – Opal Aug 23 '16 at 11:27
  • @Opal One more question you know the parent name or not on insertion of children? – Ahmed Jehanzaib Aug 23 '16 at 11:40