0

I have been trying to create an external table in Azure SQL Data Warehouse by considering only some folders instead of all from the azure Datalake location. But I'm not able to do so. Request for help in this regard.

Below is the example for this scenario.

In Azure Datalake, the location consists of the folders with data split into multiple folders based on years criteria. The folders exist from the year 1996 to 2020. My scenario is to create an external table only for the years 2018 and above.

Please suggest the best approach to accomplish this.

Thank you!!!

1 Answers1

0

You have two main options:

1. Create one external table per year then union together selects from several external tables

For example you could do something conceptually like:

CREATE EXTERNAL TABLE FILES2018...
CREATE EXTERNAL TABLE FILES2019...
CREATE EXTERNAL TABLE FILES2020...

SELECT * FROM FILES2018
UNION ALL
SELECT * FROM FILES2019
UNION ALL
SELECT * FROM FILES2020

2. Use the new COPY command

This example shows how to do multiple folders.

COPY INTO t1
FROM 
    'https://myaccount.blob.core.windows.net/myblobcontainer/folder2018', 
    'https://myaccount.blob.core.windows.net/myblobcontainer/folder2019', 
    'https://myaccount.blob.core.windows.net/myblobcontainer/folder2020'
WITH...
GregGalloway
  • 11,355
  • 3
  • 16
  • 47
  • Hi Greg, Thanks for your response. I have tried the second approach already and was getting "Index was out of range. Must be non-negative and less than the size of the collection. Parameter name: startIndex" error. My source files are in parquet format. Not sure how to debug this error. Any suggestions are highly appreciated. – Deekshit Apr 13 '20 at 12:09
  • COPY INTO TBL_NAME FROM 'DATALAKE FILES PATH' WITH ( FILE_FORMAT = parquetFileFormat (Created this fileformat based on documentation), CREDENTIAL=(IDENTITY = '*******', SECRET = '***************') ) – Deekshit Apr 13 '20 at 13:38
  • @Deekshit I would start with a path pointing at one file. And if that doesn’t work I would try creating a regular external table for one file to confirm you have the right columns, right column order and column count. – GregGalloway Apr 13 '20 at 14:02