
We have been using Synapse for some time, primarily the Serverless Pool with Parquet, external tables, and views. I have a few large views that read multiple external tables and take a long time to run. I am hoping I can schedule a job that generates these as Parquet files, which another external table could then read.

I am wondering whether there is a way to generate a Parquet file via a SQL query on the Serverless Pool.

I believe I see a way to do this with a Spark Pool, but I am curious whether there is a way to use the Serverless Pool, since I think that would be the cheaper option in the long run.

Matt
  • CETAS can create Parquet files: https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/develop-tables-cetas - but it also creates an External Table reference in a database, which you may not want or need. And automating that query will be more difficult than automating a Notebook that can also easily create Parquet files. – Joel Cochran Nov 07 '22 at 17:39
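To make the CETAS route concrete, here is a minimal sketch; the data source, file format, view, and folder names below are all placeholders, and access to the storage account is assumed to be configured already:

    -- Minimal CETAS sketch for the serverless pool. All object names here
    -- (output_ds, parquet_ff, dbo.MyLargeView, the paths) are hypothetical.
    CREATE EXTERNAL DATA SOURCE output_ds
        WITH (LOCATION = 'https://mystorageaccount.dfs.core.windows.net/output');

    CREATE EXTERNAL FILE FORMAT parquet_ff
        WITH (FORMAT_TYPE = PARQUET);

    -- CETAS writes the query result to Parquet files under LOCATION and
    -- registers an external table over them in a single statement.
    CREATE EXTERNAL TABLE dbo.MyLargeViewSnapshot
        WITH (
            LOCATION = 'snapshots/mylargeview/',
            DATA_SOURCE = output_ds,
            FILE_FORMAT = parquet_ff
        )
    AS
    SELECT * FROM dbo.MyLargeView;

Note that CETAS fails if the external table already exists or the target folder is not empty, so a scheduled rerun has to drop the table and delete the old files first, which is part of why automating it is harder than automating a notebook.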

1 Answer


You can create a Data Factory pipeline with a Copy activity. The Copy activity's source dataset needs to point to the Serverless Pool database, along with the query that needs to run. The Copy activity's sink can be an ADLS Gen2 folder in Parquet format. You can schedule the pipeline through triggers as needed.
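As a sketch, the query in the Copy activity's source could simply select from the slow view on the serverless database; dbo.MyLargeView below is a hypothetical name:

    -- Hypothetical source query for the Copy activity; it runs against the
    -- serverless pool database, and the sink then writes the result set as
    -- Parquet files in the ADLS Gen2 folder.
    SELECT *
    FROM dbo.MyLargeView;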


Satya V