
I have inherited a system with 3 databases running on a single server (SQL Server 2012):

  • StagingDB (250GB)
  • ReportingDB (350GB)
  • AppDB (500GB)

Every day the following process happens:

  1. Several GB of plain-text CSV data is deposited on the server by a third-party process.
  2. SQL Server Agent picks up the data and imports it into StagingDB (a sketch of such a load follows this list).
  3. SQL Server Agent runs several jobs that rebuild ReportingDB from scratch based on the current contents of StagingDB. This build takes 5-6 hours.
  4. SQL Server Agent runs jobs that aggregate summary KPIs from ReportingDB and store them in AppDB. This process takes 1-2 hours.
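
For context, the import in step 2 is a plain bulk load of the CSV files. A minimal sketch of what such an Agent job step might run is below; the file path, table name, and CSV layout are assumptions for illustration only:

    -- Hypothetical daily CSV load into StagingDB (all names are made up).
    -- Assumes a simple comma-delimited file with a header row and no quoted fields.
    BULK INSERT StagingDB.dbo.DailyActivity
    FROM '\\fileserver\drop\activity.csv'
    WITH (
        FIELDTERMINATOR = ',',
        ROWTERMINATOR   = '\n',
        FIRSTROW        = 2,  -- skip the header row
        TABLOCK               -- allows minimal logging when the usual preconditions are met
    );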

PROBLEMS

  1. Availability: While (3) above is running, we have to close off a section of the application from end-user access, as that section displays grids of activity data from ReportingDB. While (4) is running, we have to close the application completely.
  2. Performance: As all the databases are on a single server, any stored procedure that touches ReportingDB or AppDB while the build process runs delivers a terrible experience for end users.

OBJECTIVES

We're exploring solutions to this problem. That is, we want to:

  1. Serve previously built data continuously until ALL of the latest data is available.
  2. Ensure performance during build time by separating the build onto another physical server.

APPROACH

We're exploring these potential solutions:

  1. Use Merge Replication. Build all data on the Publisher and then sync it to the Subscriber by invoking the Merge Agent using on-demand replication.
  2. Build all data on a Build server instance and then manually copy the tables across to the Production server instance.
  3. Don't copy any of the daily build data at all. Have two SQL Server instances, with a flag in a control table that specifies the active database. When the daily build completes on one instance, copy only the user transactional data across to it from the active instance, then mark the newly built instance active and the other inactive. The applications check for the active server on each load (a minimal sketch follows this list).
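
To make option (3) concrete, here is a minimal sketch of the control table and the lookup the applications would run at load time; every name in it is hypothetical:

    -- Hypothetical control table, kept somewhere both instances and the apps can reach.
    CREATE TABLE dbo.ActiveBuild (
        Id             int          NOT NULL CONSTRAINT PK_ActiveBuild PRIMARY KEY,
        ActiveInstance sysname      NOT NULL,  -- e.g. N'SQLBUILD01'
        SwitchedAtUtc  datetime2(0) NOT NULL
    );

    -- After a successful build, flip the flag in one atomic statement.
    UPDATE dbo.ActiveBuild
    SET ActiveInstance = N'SQLBUILD02',
        SwitchedAtUtc  = SYSUTCDATETIME()
    WHERE Id = 1;

    -- Applications read the flag on each load and connect accordingly.
    SELECT ActiveInstance FROM dbo.ActiveBuild WHERE Id = 1;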

(1) above does not seem to work well for us, as it does not appear possible to perform the merge inside a snapshot isolation transaction. (2) above seems quite cumbersome, as it involves copying several hundred GB of bulk data inside one massive transaction every day.

What are the best practices for achieving our objectives? We would love some advice from the community!

Adam

1 Answer

A few things that came to my mind / that aren't clear in the post:

  • Why are you rebuilding the whole database from scratch? Is there no way to rebuild only the part that has changed, or does all of the data change?

  • Is the reason you have to shut the application down partly / completely simply that the build and the application access the same tables?

  • If blocking and / or access to the same tables weren't a problem, would performance otherwise be acceptable? Could you just build it on the same server in separate tables, perhaps adjusting MAXDOP so the build isn't as heavy on the resources?

  • If you just have one database, could you build the data into a separate table, and then just replace the old data with the new data?

    • If you're using Enterprise edition, partition switching might provide an easy trick.
    • Otherwise I can come up with smaller or bigger hacks depending on your environment: installing new versions of the views / aliases / procedures that access the tables, sp_rename, a different schema, etc. (see the sketch after this list).
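
As one example of the schema-swap hack, a common pattern is to build the new copy into a 'shadow' schema while users keep reading the live table, then exchange the two with ALTER SCHEMA ... TRANSFER inside a short transaction. All schema and table names below are made up for illustration:

    -- Assumes the shadow and retired schemas already exist
    -- (CREATE SCHEMA shadow; CREATE SCHEMA retired;).
    -- Users read dbo.Activity while the build populates shadow.Activity.
    BEGIN TRAN;
        ALTER SCHEMA retired TRANSFER dbo.Activity;     -- move the old copy out
        ALTER SCHEMA dbo     TRANSFER shadow.Activity;  -- move the new copy in
    COMMIT;

    -- The retired copy can be truncated and reused for the next build.
    TRUNCATE TABLE retired.Activity;

The swap itself only needs brief schema-modification locks, so readers see either the old data or the new, never a half-built table.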

This question is quite broad and isn't that closely related to programming; the DBA Stack Exchange site might be better suited for it.

James Z
  • Thanks for your thoughts on this. We have denormalised data that contains fields from multiple source tables. Any one of those data items could change, making it difficult to do a delta/net-change build. "If you just have one database, could you build the data into a separate table, and then just replace the old data with the new data?" - this is pretty much the same as our approach (2), except we would use a different server to reduce the performance burden. Enterprise edition isn't available. Performance of the production tables during build time is a key priority here. – Adam Oct 30 '15 at 16:06