
I'm currently tasked with loading about 3,600 small CSV files from Azure Data Lake into a single table in Azure Synapse. Each file is under 10 KB. I used PolyBase to create an external table pointing at a folder on the Data Lake (with some wildcards) and inserted records from the external table into a staging table. The whole process took about 4 hours to complete. Is there any way to see whether any parallelism is taking place? Is the time taken normal?
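For reference, the setup looks roughly like this (all names are placeholders, and the external data source and file format are assumed to already exist). The second statement shows the CTAS variant rather than `INSERT ... SELECT`, since CTAS is the fully parallelized load path in a dedicated SQL pool:

```sql
-- Hypothetical external table over the folder of CSV files
CREATE EXTERNAL TABLE ext.StagingFiles (
    Col1 INT,
    Col2 NVARCHAR(100)
)
WITH (
    LOCATION = '/myfolder/',        -- folder, so all files underneath are read
    DATA_SOURCE = MyDataLake,       -- assumed pre-existing external data source
    FILE_FORMAT = MyCsvFormat       -- assumed pre-existing CSV file format
);

-- CTAS instead of INSERT ... SELECT: creates and loads the table in one
-- parallel operation across all distributions
CREATE TABLE dbo.Staging
WITH (DISTRIBUTION = ROUND_ROBIN, HEAP)
AS
SELECT * FROM ext.StagingFiles;
```

`ROUND_ROBIN` and `HEAP` are typical choices for a staging table; the final table would usually get a hash distribution and a clustered columnstore index.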

Victor Ng
  • What DWU are you running at? What resource class is associated with the account you are using to load the data? How did you insert the records, e.g. INSERT or CTAS? CTAS would probably be recommended. – wBob Nov 06 '20 at 10:04
  • You might try the new COPY INTO command. Are the CSV files compressed? Also ensure that you aren't running the queries with an account that is a dbo, as this causes the operation to run in a small resource class, which may be suboptimal. – Jason Horner Nov 10 '20 at 01:55
  • The data warehouse is currently at DW200c and I'm using an INSERT statement. The CSV files are not compressed and are left as plain text on the data lake. How do I check whether the account is a dbo? – Victor Ng Nov 12 '20 at 03:25
  • @VictorNg, were you able to solve this? – Nick May 20 '21 at 09:07
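The COPY INTO approach suggested in the comments could be sketched as follows (a minimal example; the storage URL, container, and column layout of `dbo.Staging` are assumptions, and authentication options are omitted):

```sql
-- COPY is the newer, simpler bulk-load path in Synapse dedicated SQL pools;
-- the wildcard pulls in every CSV under the folder in one parallel load
COPY INTO dbo.Staging
FROM 'https://<account>.dfs.core.windows.net/<container>/myfolder/*.csv'
WITH (
    FILE_TYPE = 'CSV',
    FIELDTERMINATOR = ',',
    FIRSTROW = 2          -- assumes each file has a header row; use 1 if not
);
```

Unlike PolyBase, COPY does not require creating external tables, data sources, or file formats up front, which makes it easier to compare load times against the existing setup.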

0 Answers