9

I have a pipeline that retrieves an FTP hosted CSV file. It is comma delimited with double quote identifiers. The issue exists where a string is encapsulated in double quotes, but the string itself contains double quotes.

string example: "Spring Sale" this year.

How it looks in the csv (followed and lead by two null columns):

"","""Spring Sale"" this year",""

SSIS handles this fine, but Data Factory wants to transform it into an extra column that isn't separated by a comma. I have removed the extra quotes on this line and it works fine.

Is there a way around this besides altering the source?

Kranthi Pakala
  • 1,288
  • 5
  • 10
Dan Peter
  • 101
  • 1
  • 2
  • 8

1 Answers1

16

I got this to work using Escape character set as quote (") with the Azure Data Factory Copy Task. Screen shot:

ADF Copy Task

This was based on a file as per your spec:

"","""Spring Sale"" this year",""

and also worked as in insert into an Azure SQL Database table. The sample JSON:

{
    "name": "DelimitedText1",
    "properties": {
        "linkedServiceName": {
            "referenceName": "linkedService2",
            "type": "LinkedServiceReference"
        },
        "type": "DelimitedText",
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "fileName": "quotes.txt",
                "container": "someContainer"
            },
            "columnDelimiter": ",",
            "escapeChar": "\"",
            "quoteChar": "\""
        },
        "schema": [
            {
                "name": "Prop_0",
                "type": "String"
            },
            {
                "name": "Prop_1",
                "type": "String"
            },
            {
                "name": "Prop_2",
                "type": "String"
            }
        ]
    }
}

Maybe the example file is too simple but it did work for me in this configuration.

Alternately, just use SSIS and host it in Data Factory.

wBob
  • 13,710
  • 3
  • 20
  • 37
  • 1
    I have since torn down the environment and can't validate but i dont recall that i tried this. We abandoned ADF due to lack of features and haven't researched SSIS runtime yet. I will remember this if we revisit it. Thanks for your input. – Dan Peter Jun 25 '19 at 20:30
  • 2
    This does work, and properly supports the CSV RFC https://tools.ietf.org/html/rfc4180 Azure Data Factory should have this enabled by default to " and not \ – Toby Sep 26 '19 at 18:47
  • Does not work for me either :( I have no control about incoming values, they just can and do contain quotes. Any ideas? – Petr Aug 04 '21 at 12:18
  • 1
    Thank you very much for posting this solution. It helped me a lot. You have many beers from me :))) – Iliyan Popov Feb 15 '22 at 12:41