0

We are creating a Federated Datawarehouse using snowflake i,e i will have dedicated DWH on each specific region say 3 regions. I will have one global DWH in a separate region for which we need to take data from tables from the other region DWH for reporting. What would be the best approach to accomplish the same?

I read and understood that, you can unload data from DWH in a region into AWS S3 or AZURE Blob on the same region. This i have to do it for all 3 regions. Then we have an option in AWS S3 cross-region replication which i can enable and then load it into Global DWH.

This was my approach and seems bit long and might cost extra for cross region data transfer which anyway is required. Mainly i will not be able to create a flow end to end. Since all are in different region, i need to run a separate job to unload it to s3 in that region, need to validate and need to start loading once all 3 unload completes. Workflow or orchestration also a problem. I considered AWS Batch and Step function but both regional services.

Appreciate if someone can through some light and options? Thank you!

2 Answers2

0

I wouldn't advise doing it with S3 and loading into each database.

You have two options with Snowflake which are much more suited to your use case. One which Rich has already mentioned (Database Replication). You also have the option to use Data Sharing via the Private Data Exchange (not available in all regions yet) or by using Data Sharing

Simon D
  • 5,730
  • 2
  • 17
  • 31
  • Hi Simon, thank you for your comment. Data Sharing works only within that region. Replication will replicate all the data which add up more cost. – Muthukumar Apr 29 '20 at 07:28
  • You're right about data sharing, that wouldn't suit you. You could try to create a new database with the objects you want to replicate and set up data replication only on that database? Every time you need to refresh the data to the other region just run an `insert overwrite` statement to the tables in the master replicate? – Simon D Apr 29 '20 at 08:57
  • Thanks Simon, yes same i was thinking to have a separate DB to get replicated. Looks like this will be the most suited and easy one. Thanks again! :) – Muthukumar Apr 29 '20 at 09:22
-1

I suggest you review Snowflake's documentation on Database Replication, and ask your account executive or sales engineer to discuss it with you, it seems like a perfect fit for your use case.

https://docs.snowflake.com/en/user-guide/database-replication-failover.html

I hope this helps...Rich Murnane

Rich Murnane
  • 2,697
  • 1
  • 11
  • 23
  • Hi Rich, Thank you for your comment. It is not a DR fail over or we dont need to replicate all data. Only the required data / table from each region DWH need to be there on the Global DWH for reporting. This will add up more storage cost. – Muthukumar Apr 29 '20 at 07:27