
I am looking for a solution for my project's database.

We have a database of around 70 tables, and 5-7 of them have more than 30 million rows each.

One of these huge tables is a parent table with an NVARCHAR primary key column that is referenced by the other tables. The scenario looks like this:

Table1 - Id NVARCHAR PK (30+ M rows)

Table2 - Id NVARCHAR PK referencing Table1.Id (30+ M rows)

Table3 - Id NVARCHAR PK referencing Table1.Id (30+ M rows)

Table4 - Id NVARCHAR PK referencing Table1.Id (30+ M rows)

Table5 - Id NVARCHAR referencing Table1.Id (30+ M rows)

There are also a few more tables referencing Table1.Id; they have fewer rows, but none below 100,000.

The situation: for security reasons, we need to generate new values for Table1.Id and update them.

I am thinking of making the FK references ON UPDATE CASCADE so that when an UPDATE statement runs on Table1, the values in all related tables will be updated automatically.
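Something like the following is what I have in mind (just a sketch assuming SQL Server; the constraint name is made up):

    -- Recreate each child FK with ON UPDATE CASCADE
    -- (FK_Table2_Table1 is a hypothetical constraint name):
    ALTER TABLE Table2 DROP CONSTRAINT FK_Table2_Table1;

    ALTER TABLE Table2 ADD CONSTRAINT FK_Table2_Table1
        FOREIGN KEY (Id) REFERENCES Table1 (Id)
        ON UPDATE CASCADE;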

But I am not sure what the performance impact will be. If it slows down the database, and eventually the application, in production, that will not be acceptable at all.

So I need guidance here about the performance impact of cascading updates on such huge tables. It would also be great if someone could suggest a better way to achieve this.

Nizam

1 Answer


The first thing you should probably do is get the query plan for a cascading update. See this question, which illustrates this; it should give you a good ball-park estimate of the complexity of cascading updates.
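For example, in SQL Server (which your NVARCHAR keys suggest you are using), you can capture the estimated plan without actually executing the statement; the key values below are placeholders:

    -- Return the estimated XML plan instead of running the UPDATE:
    SET SHOWPLAN_XML ON;
    GO
    UPDATE Table1
    SET Id = N'NEW-KEY'        -- placeholder values
    WHERE Id = N'OLD-KEY';
    GO
    SET SHOWPLAN_XML OFF;
    GO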

Of course a cascading update will necessarily be slower than a non-cascading update. But will it be slower than the equivalent code you would have to write to preserve referential integrity? Probably not, unless you have some higher-level understanding of the data than the database could be expected to have.

For example: if Tables 1-n were part of an OLAP database, and you could extract the relationships between the primary and related tables, you could perform update statements with a high degree of locality of reference by loading the updates into Table1, then Table2, then Table3, and so on, especially if the record order in your updates matches the clustered key order on each table.
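As a rough sketch of that kind of manual path (IdMap is a hypothetical staging table of old-to-new key pairs, and the constraint name is made up):

    -- Staging table of old -> new key pairs:
    CREATE TABLE IdMap (
        OldId NVARCHAR(50) NOT NULL PRIMARY KEY,   -- length is an assumption
        NewId NVARCHAR(50) NOT NULL
    );

    -- Temporarily stop enforcing the child FK so each table can be
    -- updated independently, in clustered-key order:
    ALTER TABLE Table2 NOCHECK CONSTRAINT FK_Table2_Table1;

    UPDATE t SET t.Id = m.NewId
    FROM Table1 AS t
    JOIN IdMap AS m ON t.Id = m.OldId;

    UPDATE t SET t.Id = m.NewId
    FROM Table2 AS t
    JOIN IdMap AS m ON t.Id = m.OldId;
    -- ...repeat for Table3, Table4, Table5...

    -- Re-enable and re-validate the FK:
    ALTER TABLE Table2 WITH CHECK CHECK CONSTRAINT FK_Table2_Table1;

Loading IdMap pre-sorted to match each table's clustered key keeps the updates mostly sequential.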

It sounds like you may be doing bulk updates to Table1; in that case, again, you might be able to find a faster strategy than cascading updates by taking advantage of a similar update path.

Performance is very strongly environmental, and you can't really reproduce a performance review without duplicating enough of the environment to make instrumenting the process useful. Set requirements for how fast updates must occur, and measure your environment to test compliance.

Gary Walker