0

Is it possible to dynamically copy all datasets from a BigQuery Project to Azure Synapse Analytics, then dynamically copy all tables within each dataset? I know we can dynamically copy all tables within a BigQuery dataset reference to this answered question Loop over of table names ADFv2, but is there a way to do it at the project level with the lookup activity to loop through all datasets? Is there a way to do a SELECT * to the datasets?

SELECT
*
FROM
gcp_project_name.dataset_name.INFORMATION_SCHEMA.TABLES
WHERE table_type = 'BASE TABLE'

According to Microsoft's Lookup activity in Azure Data Factory and Azure Synapse Analytics, this only reaches the dataset level. I also tried just putting in the GCP's project name into the Lookup activity's query, but it did not work, ref Understanding the "Not found: Dataset ### was not found in location US" error

Aswin
  • 4,090
  • 2
  • 4
  • 16
dwssc2023
  • 21
  • 6

1 Answers1

0
  • This can be done using two-level pipeline. I tried to repro this and below is the approach.

  • Take a lookup activity and take the Google big query as source dataset. In Query text box, enter the below query.

SELECT schema_name
FROM  `project_name`.INFORMATION_SCHEMA.SCHEMATA

This query will list the datasets in the project.

enter image description here

  • Add a for-each activity next to the lookup activity. In for-each settings' item , type @activity('Lookup1').output.value as a dynamic content.

enter image description here

  • Then inside for-each activity, take another lookup activity with same big Query dataset as source dataset. Type the below query as a dynamic content.
SELECT
*
FROM
gcp_project_name.dataset_name.@{item().schema_name}.TABLES
WHERE table_type = 'BASE TABLE'

This will give list of all tables within each dataset.

Since you cannot nest a for-each inside for-each in ADF, you can design a two-level pipeline where the outer pipeline with the outer ForEach loop iterates over an inner pipeline with the nested loop.

Refer the NiharikaMoola-MT's answer on this SO thread for Nested foreach in ADF.

Aswin
  • 4,090
  • 2
  • 4
  • 16