2

I want to create some generic pipeline where I can pass table name or custom SQL as input and load required data from BigQuery to SQL Server. This pipeline should handle a load of daily incremental data and the initial historical load (around 100 GB).

I am trying to create it through Apache Beam (Dataflow) where I am facing some challenges with coding but before deep dive into Dataflow development, I want to understand the best way to extract data from BigQuery and load it into any database either Oracle, SQL Server, Postgres, etc.? Is there any way other than dataflow that is the best-optimized way?

Nikhil Suthar
  • 2,289
  • 1
  • 6
  • 24
  • Where is the SQL server hosted? on cloud i.e. within GCP/ AWS / Azure OR on premise data centre with networked by vpn/dedicated connected. Given your use case is about 100gb of data hence reliability , scalability is important and hence above questions. – Bihag Kashikar Mar 31 '23 at 10:24
  • 1
    Does my article help? https://medium.com/google-cloud/replicate-data-from-bigquery-to-cloud-sql-2b23a08c52b1 – guillaume blaquiere Mar 31 '23 at 12:07
  • no my SQL Server is hosted at On-Prem @BihagKashikar – Nikhil Suthar Apr 03 '23 at 08:06
  • I'd say given the hosting is onpremise, you will be better off with leaner solution in GCP and all heavy lifting in onpremise sql server. in GCP, use scheduled queries to run EXPORT command intial and delta and then use https://www.cdata.com/kb/tech/googlecloudstorage-ssis-task-import-2008.rst to load data back in sql server. – Bihag Kashikar Apr 03 '23 at 10:26
  • Sorry but I have an existing SQL Server that can't move to GCP – Nikhil Suthar Apr 03 '23 at 10:43

1 Answers1

0

You can try Dataflow Templates (https://cloud.google.com/dataflow/docs/guides/templates/provided-templates), which provides some pre-defined Beam pipeline for BigQuery as the souce.

XQ Hu
  • 141
  • 4
  • it does not contain jdbc write from bigQuery – Nikhil Suthar Apr 03 '23 at 08:07
  • These templates are open-sourced. For example, https://github.com/GoogleCloudPlatform/DataflowTemplates/blob/main/v1/src/main/java/com/google/cloud/teleport/templates/JdbcToBigQuery.java. You could create one from BQ to Jdbc. – XQ Hu Apr 04 '23 at 12:59