What's the best way to update a closure table?

Question

I have a table relating records using the adjacency list method (table A), and another table relating the same records using a closure table (table B). They both capture the same graph, so they both need to be kept in sync.

The question is, what's the best way to update the closure table?

As I see it, there are three alternatives:

Triggers. On INSERT/UPDATE/DELETE at A, run a sproc that calculates the new closures. Cons: changes to A result in a long, synchronous (locking?) operation; possible deadlocks (?).
Application code. Narrow down changes in A to Add/Update/Delete methods (eg. a repository pattern), and overload them with calls to a sproc that calculates the new closures. Cons: extra round trip to the DB; possible integrity issue if another thead modifies A or B in a contrary way at the same time; possible integrity issue if, say, another application decides to modify A and not B.
Background updater. Write a secondary process that continually looks for updates to A and makes the corresponding updates to the closure table. Cons: complex (extra service to write and manage); windows without synchronization.

Even if there is no "best" option, any thoughts on the trade-offs would be most appreciated!

score 4 · Accepted Answer · answered Apr 08 '11 at 23:25

4

If your hierarchies are anything like as mostly static as most I've dealt with, I would probably go with the trigger. Really depends on the update frequency and the read load.

answered Apr 08 '11 at 23:25

Cade Roux

88,164
40
182
265

Thanks for the comment! I'd love to hear more about how update frequency and load should impact my choice, especially with regard to the first two options. FWIW, I expect table *A* to change fairly frequently: say a handful of times per second, with an even distribution on change depths (ie. some near leaves, some near roots, some in the middle). – ladenedge Apr 09 '11 at 06:35
@ladenedge That isn't very static. What kind of hierarchy is this and are most all reads through the closure table and writes to the adjacency table? – Cade Roux Apr 09 '11 at 13:24
@Cade Roux: this is a basic hierarchical filesystem. Writes to the closure table *only* happen via whatever sync method we choose, but both tables are read pretty heavily with respect to the tree. If it would help, we could potentially move tree-related reads from the adjacency table to the closure table..? – ladenedge Apr 11 '11 at 15:28
2

@ladenedge The more I'm thinking about it, I think you just need to optimize your closure editing in the trigger, because if the closure is changing, you do want readers to be blocked. If you could allow the readers to be a little out of sync, you could do snapshot builds of the entire closure and readers who started reading an old revision of the closure would not be affected while the new one is being assembled. Old revisions in the closure table could then be removed in a separate process. – Cade Roux Apr 11 '11 at 15:43
@Cade Roux: optimizing the trigger seems like a good start. In the meantime, perhaps I'll read more about snapshot updating. Thank you for your time! – ladenedge Apr 11 '11 at 16:05
2

@ladenedge Naively, like append with 'INSERT INTO Closure (RevisionId, ...)', select with 'SELECT * FROM Closure WHERE RevisionId = (SELECT MAX(RevisionId) FROM Closure)', purge hourly, say, with 'DELETE FROM Closure WHERE RevisionId < (SELECT MAX(RevisionId) FROM Closure)' – Cade Roux Apr 11 '11 at 16:46

What's the best way to update a closure table?

1 Answers1