I'm using external tables to load data from csv stored in a blob to a table in Azure SQL Data Warehouse. The csv uses a string delimiter (double quote), empty strings are represented as 2 double quotes ("").
I want the empty columns to be treated as NULL in the table. The external file format I use is set up with USE_TYPE_DEFAULT = FALSE, but this does not seem to work since empty columns are imported as empty strings. And this only tends to happen when the columns are strings, numeric columns are correctly converted to NULL.
I'm also importing a different csv which does not have a string delimiter using a different external file format and those empty columns are imported as NULL. So it looks like it has something to do with the STRING_DELIMITER option.
The csv:
col1;col2;col3;col4;col5;col6
"a";"b";"c";"1";"2";"3"
"d";"";"f";"4";"";"6"
The code of the external file format:
CREATE EXTERNAL FILE FORMAT eff_string_del
WITH (
FORMAT_TYPE = DELIMITEDTEXT
,FORMAT_OPTIONS(
FIELD_TERMINATOR = ';'
,STRING_DELIMITER = '0x22'
,FIRST_ROW = 2
,USE_TYPE_DEFAULT = False)
)
Code of the table using the external file format:
CREATE EXTERNAL TABLE dbo.test (
col1 varchar(1) null
,col2 varchar(1) null
,col3 varchar(1) null
,col4 int null
,col5 int null
,col6 int null
)
WITH (
DATA_SOURCE = [EDS]
,LOCATION = N'test.csv'
,FILE_FORMAT = eff_string_del
,REJECT_TYPE = VALUE
,REJECT_VALUE = 0
)
The result when querying the external table:
SELECT *
FROM [dbo].[test]
col1 col2 col3 col4 col5 col6
---- ---- ---- ----------- ----------- -----------
a b c 1 2 3
d f 4 NULL 6
Can someone please help me explain what is happening or what I'm doing wrong?