1

I am following the technique described in Jeff Smith's "Implementing Table Inheritance in SQL Server" (which seems to be the de facto approach for implementing this kind of structure). The People base table has a 1 : 0..1 relationship with its three subtype tables Students, Teachers, Parents. This is traditionally done by defining the primary key of the subtype tables as a foreign key to the base table.

To enforce exclusivity among the subtypes (preventing the same person from being simultaneously a student and a teacher), the author recommends adding PersonTypeID as a persisted computed column to each of the subtype tables, and including it in the foreign key constraint with the base table.

CREATE TABLE PersonType 
(
    PersonTypeID INT PRIMARY KEY,
    Description  VARCHAR(10)
);

INSERT INTO PersonType 
VALUES (1, 'Student'),
       (2, 'Teacher'),
       (3, 'Parent');

CREATE TABLE People 
( 
    PersonID     INT PRIMARY KEY,
    PersonTypeID INT REFERENCES PersonType (PersonTypeID), 
    Name         VARCHAR(10),
    UNIQUE (PersonID, PersonTypeID)
)

CREATE TABLE Students 
( 
    PersonID INT PRIMARY KEY, 
    PersonTypeID AS 1 PERSISTED, -- student 
    EnrollmentDate DATETIME, 
    FOREIGN KEY (PersonID, PersonTypeID) REFERENCES People (PersonID, PersonTypeID) 
) 

CREATE TABLE Teachers 
( 
    PersonID INT PRIMARY KEY, 
    PersonTypeID AS 2 PERSISTED, -- teacher 
    HireDate DATETIME, 
    FOREIGN KEY (PersonID, PersonTypeID) REFERENCES People (PersonID, PersonTypeID) 
) 

CREATE TABLE Parents 
( 
    PersonID INT PRIMARY KEY, 
    PersonTypeID AS 3 PERSISTED, -- parents 
    DifficultyScore INT, 
    FOREIGN KEY (PersonID, PersonTypeID) REFERENCES People (PersonID, PersonTypeID) 
) 

However, this approach suffers from a number of issues:

  1. It wastes an additional column of space on each of the subtype tables.
  2. It requires an additional unique constraint on the base table. This wastes more space (since it will be implemented as a unique index) and slows down updates to the base table.
  3. The foreign key constraint involves a check on two columns (instead of one), slowing down updates to the subtype tables.

My hypothesis is that it would be better to enforce uniqueness using check constraints through a scalar function instead. This would eliminate the wasted storage for the extra column and unique index, speed up updates to the base table, and hopefully achieve the same performance for updates to the subtype tables as the composite foreign key would.

CREATE TABLE People 
( 
    PersonID     INT PRIMARY KEY,
    PersonTypeID INT REFERENCES PersonType (PersonTypeID), 
    Name         VARCHAR(10)
)

CREATE FUNCTION GetPersonTypeID (@PersonID INT)
RETURNS INT
AS
BEGIN
    RETURN 
    (
        SELECT PersonTypeID 
        FROM People
        WHERE PersonID = @PersonID
    )
END;

CREATE TABLE Students 
( 
    PersonID INT PRIMARY KEY REFERENCES People (PersonID) 
        CHECK (dbo.GetPersonTypeID(PersonID) = 1),
    EnrollmentDate DATETIME
) 

CREATE TABLE Teachers 
( 
    PersonID INT PRIMARY KEY REFERENCES People (PersonID)
        CHECK (dbo.GetPersonTypeID(PersonID) = 2),
    HireDate DATETIME
) 

CREATE TABLE Parents 
( 
    PersonID INT PRIMARY KEY REFERENCES People (PersonID)
        CHECK (dbo.GetPersonTypeID(PersonID) = 3),
    DifficultyScore INT
) 

Is there any reason why this approach should not be used?

Douglas
  • 53,759
  • 13
  • 140
  • 188
  • *"I feel that this approach suffers from a number of issues"* How do you propose to test your feelings? – Mike Sherrill 'Cat Recall' Jun 03 '14 at 21:35
  • @MikeSherrill'CatRecall': Wording fixed. – Douglas Jun 03 '14 at 21:37
  • Ok. How much waste in performance and disk space did you measure? – Mike Sherrill 'Cat Recall' Jun 03 '14 at 21:45
  • Person type can feasibly be a tinyint for 3 person types, so you have 1 extra byte per person. Even assuming you have the biggest school in the world I don't think the extra storage is going to a problem. On top of that using scalar valued function is hugely inefficient compared to a foreign key check, so all your writes are going to impact performance way more than an extra byte (or even 4 for an int) per row. – GarethD Jun 03 '14 at 21:50
  • @GarethD: Thanks, I suspected that scalar functions' poor performance might have been the reason behind their avoidance. – Douglas Jun 03 '14 at 21:53
  • http://sqlblog.com/blogs/alexander_kuznetsov/archive/2009/06/25/scalar-udfs-wrapped-in-check-constraints-are-very-slow-and-may-fail-for-multirow-updates.aspx This should provide a better answer than I could here. – GarethD Jun 03 '14 at 21:53
  • @GarethD: That's just what I was looking for. Please post your comment and link as an answer, and I'll accept it. – Douglas Jun 03 '14 at 21:55

1 Answers1

1

The additional storage is actually minimal, if your person type column is a tinyint (so up to 255 types of people) you are still only using a single byte extra per person. So this should not be a massive factor in the decision, the main problem is that scalar udfs perform significantly worse than foreign key constraints. This has been tested and the results shown in in the article Scalar UDFs wrapped in CHECK constraints are very slow and may fail for multirow updates .

The testing is also included in this SO answer

Community
  • 1
  • 1
GarethD
  • 68,045
  • 10
  • 83
  • 123
  • Although the answer linked actually answers this question, the questions are different so I don't think this qualifies as a duplicate, I could be wrong, but a single vote to close from me will close the question and I am not confident enough that it is a duplicate to use my single close vote superpower in this case. – GarethD Jun 03 '14 at 22:05
  • They're related, but I wouldn't say it's a duplicate; the other question asks "How?", whilst mine compares the techniques and asks "Why?". Out of curiosity, how did you get the single-vote close privilege? – Douglas Jun 03 '14 at 22:15
  • [Users with a gold badge in a particular tag can mark a question as a duplicate with a single vote](http://meta.stackoverflow.com/a/254590/1048425), since this was tagged with sql, it would only take a single vote from me to mark it as a duplicate. – GarethD Jun 03 '14 at 22:18
  • Thanks again, I wasn't aware of that privilege for gold badges. Cheers! – Douglas Jun 03 '14 at 22:25