How to regularly download Geo-Replicated Azure Database (PaaS) data to On-Premise database

Question

We have a geo-replicated database in Azure SQL (Platform as a Service). This is a master/slave type arrangement, so the geo-replicated database is read-only.

We want to download data regularly from this Azure SQL database to a SQL Server database on-premise that has the same schema, without it impacting performance too much (the Azure Database is the main database used by the application).

We originally looked at Azure SQL Data Sync, to hopefully read data from the geo-replicated data and pull that down to on-premise, but it needs to create triggers + tracking tables. I don't feel overly comfortable with this, because it won't be possible to run this against the read-only slave database, and so it must be setup on the transactional master database (impacting application performance), which in turn will re-create these extra data-sync artifacts on the geo-replicated database. It seems messy, with bloated data (we have a large number of tables and data, and Azure PaaS databases are limited in size as it is) and we also use Redgate database lifecycle management, which can potentially blow these schema objects and tracking tables away every time we perform a release, as they're not created by us and are not in our source control.

What other viable options are there (other then moving away from PaaS and making a clustered IaaS VM environment across on-prem and cloud, with SQL Server installed, patched, etc). Please keep in mind, we are resource stretched in terms of staff, which is why PaaS was an ideal place for our database originally.

I should mention, we want the On-Premise database to be 'relatively' in sync with the Azure database, but the data on-premise can be up to an hour old data.

Off the top of my head, some options may be SSIS packages? Or somehow regularly downloading a Bacpac of the database and restoring it on-premise every 30 mins (but it's a very large database).

Note, it only needs to be one-directional at this stage (Azure down to on-premise).

Can the on-premise be read only? Availability groups will do this, but I can't find an example of Azure SQL > On Prem availability group. Not even sure if it's possible. What's the purpose of the on prem? For a SSIS based replication solution you could try enabling change tracking, but I don't think you can do this on a read only replica. Both change tracking and CDC are ways to track changes (for replication) that don't require creating special objects — Nick.Mc, Feb 07 '18 at 05:59
Thanks for the comment. The on-premise database could be read-only. In fact, it could even be just a flat file or backup file somewhere, that we only restore when needed. It's main use will predominately be when we can't access the internet for some reason, and we want to gain access to recent data (kind of like a fail over option). — Stefan Zvonar, Feb 07 '18 at 06:20

score 1 · Answer 1 · answered Feb 07 '18 at 13:14

1

You can give it a try to Azure Data Factory since it allows you to append data to a destination table or invoke a stored procedure with custom logic during copy when SQL Server is used as a "sink". You can learn more here.

Azure Data Factory allows you to incrementally load data (delta) after an initial full data load by using a watermark column that has the last updated time stamp or an incrementing key. The delta loading solution loads the changed data between an old watermark and a new watermark. You can learn more how to do that with Azure Data Factory on this article.

Hope this helps.

answered Feb 07 '18 at 13:14

Alberto Morillo

13,893
2
24
30

This looks promising. Every table does have a rowversion/timestamp. I will have to review this further (can't right now). Would you happen to know if you can use Azure SQL Database as source and On Premise SQL Server as a sink? – Stefan Zvonar Feb 08 '18 at 01:21
Nevermind, I can see on their data stores page that both are supported: https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-overview#supported-data-stores-and-formats – Stefan Zvonar Feb 08 '18 at 02:03
Anything else just let me know. – Alberto Morillo Feb 08 '18 at 02:59

How to regularly download Geo-Replicated Azure Database (PaaS) data to On-Premise database

1 Answers1