How to use identity PK column on multi table inheritance (table-per-type inheritance)

Question

I'd like to have a database schema with a parent table and a few derived tables, such as:

CREATE TABLE Base(
    [Id] [int] Identity(1,1) primary key clustered,
    [field1] [varchar](80) NOT NULL,
    [field2] [datetime] NOT NULL,
    --...
)

CREATE TABLE DerivedA(
    [Id] [int] primary key clustered,
    [field1] [varchar](10) NOT NULL,
    --...
)

CREATE TABLE DerivedB(
    [Id] [int] primary key clustered,
    [field1] [varchar](50) NOT NULL,
    [field1] [varchar](50) NOT NULL,
    --...
)

CREATE TABLE DerivedC(
    [Id] [int] primary key clustered,
    [field1] [varbinary](100) NOT NULL,
    --...
)

-- and so on...

Naturally I'd want my derived tables to have a foreign key constraint on Id referencing the Id on the base table, but I'm unsure on how to do so when the base table PK is an identity field though, as the Id is generated when the row is inserted.

Why would it matter that the base table's `ID` is an identity? — Tab Alleman, Jun 11 '18 at 18:22
because it ensures the values will be unique an ever increasing, which as far as I know is what you want from a clustered index. I guess I could query the Db to find the current max Id value and fill the Ids of the rows being inserted myself, but that would cause issues if there are multiple sources inserting in the database — Janilson, Jun 11 '18 at 18:28
You have a terminology problem here. In SQL and relational terminology the term "*Derived Table*" does *not* refer to anything having to do with inheritance,rather it refers to an intermediate query expression such as a View, subquery or CTE (Common Table Expression). In other words, a Derived Table is *not* a Table, but appears as one for the purposes of a query, but it is not an actual storage object in the database. — RBarryYoung, Jun 11 '18 at 18:34
@RBarryYoung What would be the correct term in SQL terminology, so I can fix my post then? Inherited table? Child table? — Janilson, Jun 11 '18 at 18:39
SQL and relational theory doesn't really have inheritance as a built-in concept, so there's no real formal term for it. It's a *type* of child-relation, but parent-child relations are usually one-to-many and these are one-to-(zero or one), so calling it a child table would be a little misleading (though not incorrect). I personally call them "*extension* tables". but that's just my term. You could call them "RelatedTable1,.. etc." which covers pretty much anything. — RBarryYoung, Jun 11 '18 at 18:46
How you will insert the data depends on what form that data comes in, which you haven't told us. — Tab Alleman, Jun 11 '18 at 18:47
See this answer: https://stackoverflow.com/questions/5558582/sql-server-output-clause-into-a-scalar-variable — RBarryYoung, Jun 11 '18 at 19:15
@RBarryYoung I've never used the output clause before but it looks promising, I'll run some tests with it later, thanks. You should post it as an answer so I can accept it. — Janilson, Jun 11 '18 at 19:53

score 2 · Accepted Answer · answered Jun 11 '18 at 20:54

The right way to do this depends a lot on the specific details of how you are batch inputting your data and then how you are going to add the data in the related tables, but usually, some form of the OUTPUT clause can do the job as demonstrated below:

-- base table to store the data
CREATE TABLE Base(
    ID      int Identity(1,1) primary key clustered,
    ExtGuid UNIQUEIDENTIFIER,
    Other1  varchar(99)
);
go


-- temp table to represent the input data/stream
Create Table #tmp(ExtGuid UNIQUEIDENTIFIER, OtherData varchar(99));
GO

-- table variable to hold the output
Declare @CreatedIds table (ID int, ExtGuid UNIQUEIDENTIFIER)

-- insert the input data/stream and capture the IDs created
INSERT INTO Base(ExtGuid, Other1)
OUTPUT inserted.ID, inserted.ExtGuid INTO @CreatedIds
SELECT t.ExtGuid, t.OtherData 
FROM #tmp t
;

-- one way to return the results
Select * from @CreatedIds

score 1 · Answer 2 · answered Jun 11 '18 at 18:22

1

When you insert the parent row, SQL Server gives you the value of the Identity PK Identity just inserted using:

select scope_identity()

You can use this value to insert the related rows.

answered Jun 11 '18 at 18:22

The Impaler

45,731
9
39
76

How would I use this when inserting many rows in a batch (as I said in the post, the tables are likely to be insert intensive)? Could I run into issues if a different source inserted a row between my insert and the `select scope_identity()` – Janilson Jun 11 '18 at 18:31
1

@Lucas: it depends on how you are "*inserting in a batch*". Show us your code for this and we can better answer that. – RBarryYoung Jun 11 '18 at 18:50
@RBarryYoung The data comes from an outside source that uses non sequential Guids as an Id, which I've been using as a nonclustered PK, and inserting either with a stored procedure with TVP input or inserting into a temp table then using merge. As I receive the data in batches of hundreds to thousands of rows, I insert them in batches of hundred to thousands of rows as well. I was considering switching to an int clustered PK to avoid RID lookups on my select queries. – Janilson Jun 11 '18 at 19:08
@Lucas then you probably need an OUTPUT clause to another temp table. – RBarryYoung Jun 11 '18 at 19:11

How to use identity PK column on multi table inheritance (table-per-type inheritance)

2 Answers2