Bulk Insert Parquet Files in Azure SQL

Question

I am trying to load a Parquet file into Azure SQL Database based on an example here on SO, but I'm getting a syntax error. I can't find much documentation on the Microsoft website or elsewhere, so I'm seeking help from the experts here. FYI: I have already created the DATA_SOURCE.

Creating External Data Source:

CREATE EXTERNAL DATA SOURCE [my_azure_blob_storage]
WITH (
        LOCATION = N'abfss://xxxxxxx.dfs.core.windows.net', 
        CREDENTIAL = [myblobStorage] ,
        TYPE = BLOB_STORAGE
);

Doing BULK INSERT:

BULK INSERT [dbo].[Employees]
FROM 'gold/employees'
WITH
    (
        DATA_SOURCE = 'my_azure_blob_storage',
        FORMAT = 'PARQUET',
        FIRSTROW = 2
    );

And the error I am getting is:

Msg 102, Level 15, State 1, Line 6 Incorrect syntax near 'FORMAT'.

score 3 · Answer 1 · answered Sep 20 '21 at 18:01

Currently the only FORMAT supported in BULK INSERT or OPENROWSET is CSV.

You can use Azure Data Factory or Spark to bulk load SQL Server from a parquet file, or to prepare a CSV file for BULK INSERT or OPENROWSET.
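As a rough illustration of the Spark route, here is a minimal PySpark sketch that reads a Parquet folder and appends it to a table over JDBC. It is not an official recipe: the function names `build_jdbc_url` and `load_parquet_to_sql`, the SQL-authentication login, and the assumption that the Microsoft SQL Server JDBC driver is available on the cluster are all illustrative choices, and the path, server, and credentials are placeholders you would supply.

```python
def build_jdbc_url(server: str, database: str) -> str:
    """Build a JDBC URL for an Azure SQL logical server (hypothetical helper)."""
    return (
        f"jdbc:sqlserver://{server}.database.windows.net:1433;"
        f"database={database};encrypt=true;loginTimeout=30"
    )


def load_parquet_to_sql(parquet_path, server, database, table, user, password):
    """Read Parquet with Spark and append the rows to an existing SQL table."""
    # Imported lazily so the URL helper above works without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("parquet-to-azure-sql").getOrCreate()

    # Parquet carries its own schema, so no header or FIRSTROW handling is needed.
    df = spark.read.parquet(parquet_path)

    (
        df.write.format("jdbc")
        .option("url", build_jdbc_url(server, database))
        .option("dbtable", table)
        .option("user", user)          # assumption: SQL authentication
        .option("password", password)
        # Assumption: the Microsoft JDBC driver jar is on the Spark classpath.
        .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
        .mode("append")                # keep existing rows in the target table
        .save()
    )
```

A call would look something like `load_parquet_to_sql("<path to gold/employees>", "<server>", "<database>", "dbo.Employees", "<user>", "<password>")`. The `append` mode is chosen here because the question targets an existing table; the column names and types in the Parquet files would need to match it.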

David Browne - Microsoft


David - thanks for the response. Is MSFT considering adding this feature to Azure SQL in the near future? Also, when you say Spark, are you referring to Databricks? – Julaayi Sep 20 '21 at 18:09

Increasing support for Parquet across the ecosystem is something MSFT is working on, but there have been no roadmap announcements on this AFAIK. Azure Databricks and Synapse Spark (https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-overview) are the two managed Spark services on Azure. That said, Spark is an Apache project, so you can install it wherever you want. – David Browne - Microsoft Sep 20 '21 at 18:24
