I'm creating a Data Factory pipeline to copy data from blob storage to table storage. The table may or may not exist in table storage and needs to be created if it doesn't exist. The authentication method for my table storage linked service must be a SAS token (whose value I'm grabbing from Key Vault).

Everything about my linked service configuration and pipeline works fine, except that the pipeline fails if the table sink doesn't exist. I've tried a similar configuration using Account Key authentication that works, but I'm looking for a way to do this with SAS token authentication.

Current Configuration

This is my Linked Service configuration for the table storage account:

{
    "name": "Table Storage Account",
    "type": "Microsoft.DataFactory/factories/linkedservices",
    "properties": {
        "parameters": {
            "StorageAccountName": {
                "type": "string"
            },
            "TableName": {
                "type": "string"
            }
        },
        "annotations": [],
        "type": "AzureTableStorage",
        "typeProperties": {
            "sasUri": "https://@{linkedService().StorageAccountName}.table.core.windows.net/@{linkedService().TableName}",
            "sasToken": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "Key Vault",
                    "type": "LinkedServiceReference"
                },
                "secretName": {
                    "value": "@{linkedService().StorageAccountName}-sas",
                    "type": "Expression"
                }
            }
        }
    }
}

These are the SAS Token settings:

sv=2021-12-02&ss=t&srt=sco&sp=rwlacu&se=2023-05-02T03:00:00Z&st=2023-04-19T16:09:39Z&spr=https&sig=[REDACTED]
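
To read these settings: sv is the service version, ss=t scopes the token to the Table service, srt=sco allows service-, container-, and object-level operations, and sp=rwlacu is the permission set, which notably includes c (Create), the permission that table creation requires. A quick decoding sketch, Python standard library only, with the signature as a placeholder:

from urllib.parse import parse_qs

# The SAS query string from above (signature redacted/placeholder).
sas = ("sv=2021-12-02&ss=t&srt=sco&sp=rwlacu"
       "&se=2023-05-02T03:00:00Z&st=2023-04-19T16:09:39Z&spr=https&sig=REDACTED")

# Account-SAS permission letters and their meanings.
permission_names = {"r": "Read", "w": "Write", "d": "Delete", "l": "List",
                    "a": "Add", "c": "Create", "u": "Update", "p": "Process"}

params = {key: values[0] for key, values in parse_qs(sas).items()}
print([permission_names.get(flag, flag) for flag in params["sp"]])
# ['Read', 'Write', 'List', 'Add', 'Create', 'Update']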

This is the Data Set configuration used by the Copy Data activity:

{
    "name": "StorageTable",
    "properties": {
        "description": "Dataset for the azure table account.",
        "linkedServiceName": {
            "referenceName": "Table Storage Account",
            "type": "LinkedServiceReference",
            "parameters": {
                "StorageAccountName": {
                    "value": "@dataset().StorageAccountName",
                    "type": "Expression"
                },
                "TableName": {
                    "value": "@dataset().TableName",
                    "type": "Expression"
                }
            }
        },
        "parameters": {
            "StorageAccountName": {
                "type": "string"
            },
            "TableName": {
                "type": "string"
            }
        },
        "annotations": [],
        "type": "AzureTable",
        "schema": [],
        "typeProperties": {
            "tableName": {
                "value": "@dataset().TableName",
                "type": "Expression"
            }
        }
    },
    "type": "Microsoft.DataFactory/factories/datasets"
}

This is the sink configuration for the Copy Data activity (though nothing seems relevant here):

"sink": {
    "type": "AzureTableSink",
    "azureTableInsertType": "replace",
    "azureTablePartitionKeyName": {
        "value": "PartitionKey",
        "type": "Expression"
    },
    "azureTableRowKeyName": {
        "value": "RowKey",
        "type": "Expression"
    },
    "writeBatchSize": 10000
}
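
For context on these sink settings (they turned out not to be the cause): "azureTableInsertType": "replace" roughly corresponds to the Table service's Insert Or Replace Entity operation, an unconditional upsert keyed on PartitionKey and RowKey. An illustrative sketch using the requests package and placeholder names:

import requests

account, table, sas = "mystorageaccount", "MyTable", "sv=..."  # placeholders

# Insert Or Replace Entity: a PUT to the entity URI without an If-Match
# header inserts the entity, or replaces it wholesale if it already exists.
resp = requests.put(
    f"https://{account}.table.core.windows.net/{table}"
    "(PartitionKey='pk1',RowKey='rk1')?" + sas,
    json={"PartitionKey": "pk1", "RowKey": "rk1", "SomeColumn": "value"},
    headers={"Accept": "application/json;odata=nometadata"},
)
print(resp.status_code)  # 204 No Content on success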

With this configuration, all connections can be validated successfully in the portal, but ADF won't create a table if it doesn't exist.

Example Error Message

ErrorCode=FailedStorageOperation,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=A storage operation failed with the following error '0:The table specified does not exist. RequestId:0c4fc844-d002-0001-0d87-738fd1000000 Time:2023-04-20T12:58:38.1032528Z'.,Source=,''Type=Microsoft.WindowsAzure.Storage.StorageException,Message=0:The table specified does not exist. RequestId:0c4fc844-d002-0001-0d87-738fd1000000 Time:2023-04-20T12:58:38.1032528Z,Source=Microsoft.WindowsAzure.Storage,StorageExtendedMessage=0:The table specified does not exist. RequestId:0c4fc844-d002-0001-0d87-738fd1000000 Time:2023-04-20T12:58:38.1032528Z,,'

Other Attempts

Update SAS URI

I changed the "sasUri" to https://@{linkedService().StorageAccountName}.table.core.windows.net (removing the table name), hoping that with the base account URI, data factory would figure out which REST URL paths to use depending on the operation.

This change broke the connection validation in the portal (which I ignored for testing purposes), but the pipelines still worked fine (probably because the Table Name was still provided in the Data Set). Unfortunately, it still did not create tables that do not exist.
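
For what it's worth, the two operations do live at different REST paths, which is presumably what the base account URI was meant to let Data Factory choose between. A sketch of both calls (requests package, placeholder names; the first is the same Create Table operation tested in Postman, per the comments below):

import requests

account, sas = "mystorageaccount", "sv=..."  # placeholders
base = f"https://{account}.table.core.windows.net"
headers = {"Accept": "application/json;odata=nometadata"}

# Creating a table targets the service-level /Tables resource...
resp = requests.post(f"{base}/Tables?{sas}",
                     json={"TableName": "sampleblob5"}, headers=headers)
print(resp.status_code)  # 201 Created, or 409 Conflict if it already exists

# ...while inserting an entity targets the table's own path.
resp = requests.post(f"{base}/sampleblob5?{sas}",
                     json={"PartitionKey": "pk1", "RowKey": "rk1"},
                     headers=headers)
print(resp.status_code)  # 201 Created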

humbleice
  • Can you share the error you are facing when trying with the SAS URL? – Pratik Lad Apr 20 '23 at 11:05
  • @PratikLad, it was too long to add as a comment, but I've updated the description with an example error message. – humbleice Apr 20 '23 at 13:03
  • I tried the same thing as you and am not getting any error with the SAS URL. Please double-check that your SAS token has the appropriate permissions. – Pratik Lad Apr 20 '23 at 15:01
  • @PratikLad Here are the SAS token permissions: "ss=t&srt=sco&sp=rwlacu". Is there something I'm missing? I tested creating a table with the SAS token in Postman and it worked fine. In your test, did the Copy Data activity create a table that didn't exist when sinking to table storage? – humbleice Apr 20 '23 at 17:09
  • Yes, in my test it created a table that did not exist in table storage while running the pipeline. – Pratik Lad Apr 20 '23 at 17:17

1 Answer

How can I configure Azure Data Factory to create Table Storage tables during a copy activity (using SAS token)?

Follow the process below to create Table Storage tables during a copy activity using a SAS token (and check whether you missed any of these steps):

  • First, generate the SAS token from the storage account under Security + networking >> Shared access signature, using the settings shown in the question (allowed services: Table; resource types: service, container, object; permissions including Create), then copy it.

  • Create a key vault to store this SAS token as a secret. In the key vault, create a secret named storage_account_name-sas with the SAS token as its value.

  • Now create the linked service with the above values and the linked service parameters. Since the linked service defines parameters for the storage account name and table name, I provided the SAS URL as https://@{linkedService().StorageAccountName}.table.core.windows.net/@{linkedService().TableName}


  • Now create a dataset for Azure Table storage using the above linked service, and create dataset parameters for the storage account name and table name. Then map these parameters to the respective linked service properties and to the table name.

  • Now create the pipeline, add a Copy Data activity to it, and select the blob file as the source. Then pass values for the dataset properties: the storage account name, and sampleblob5 as the table name, which does not exist in table storage (a sketch of triggering such a run programmatically follows this list).
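
For completeness, a rough sketch of kicking off such a run through the ARM REST API. Everything here is an assumption for illustration: a pipeline named CopyBlobToTable that exposes StorageAccountName and TableName as pipeline parameters, placeholder Azure IDs, and a bearer token acquired separately:

import requests

token = "<bearer-token>"  # assumed already acquired, e.g. via azure-identity
sub, rg, factory = "<sub-id>", "<resource-group>", "<data-factory>"

url = (f"https://management.azure.com/subscriptions/{sub}/resourceGroups/{rg}"
       f"/providers/Microsoft.DataFactory/factories/{factory}"
       "/pipelines/CopyBlobToTable/createRun?api-version=2018-06-01")

# The request body carries the pipeline parameter values for this run.
resp = requests.post(url,
                     json={"StorageAccountName": "mystorageaccount",
                           "TableName": "sampleblob5"},
                     headers={"Authorization": f"Bearer {token}"})
print(resp.json())  # contains the runId of the new pipeline run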

The pipeline ran successfully.

Output:

Before running the pipeline, the sampleblob5 table did not exist. After running the pipeline, the copy activity had created the sampleblob5 table.

Pratik Lad
  • Thanks for proving this works as expected. The setup is basically identical to mine, so it demonstrates the SAS token isn't the issue. It turns out the storage account I'm trying to copy into is behind a private network, and it's likely I need to set up a managed private endpoint for the account for things to work as expected. The other accounts I tested were on public networks, and my Postman requests were behind a VPN, so everything looked good except for the ADF activity. I'm still waiting for my DevOps guy to add the managed PE to verify this is the fix. – humbleice Apr 21 '23 at 14:42
  • I will update the answer accordingly; if your storage account is behind a firewall, you need to create managed private endpoints to connect to it. – Pratik Lad Apr 21 '23 at 14:53
  • @PratikLad this is what the issue was. After creating the managed private endpoint and using the managed virtual network integration runtime in my linked service, I was able to connect to the account via the SAS token. Thanks! – humbleice Apr 21 '23 at 21:13
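
For readers who hit the same wall: per the comments above, the actual fix was a managed private endpoint from the Data Factory managed virtual network to the storage account's Table sub-resource, with the linked service switched to the managed VNet integration runtime. A rough sketch of creating such an endpoint via the ARM REST API, with placeholder IDs, an assumed endpoint name of mpe-table, and a pre-acquired bearer token (the endpoint also needs to be approved on the storage account side):

import requests

token = "<bearer-token>"  # assumed already acquired
sub, rg = "<sub-id>", "<resource-group>"
factory, account = "<data-factory>", "<storage-account>"

storage_id = (f"/subscriptions/{sub}/resourceGroups/{rg}"
              f"/providers/Microsoft.Storage/storageAccounts/{account}")

url = (f"https://management.azure.com/subscriptions/{sub}/resourceGroups/{rg}"
       f"/providers/Microsoft.DataFactory/factories/{factory}"
       "/managedVirtualNetworks/default/managedPrivateEndpoints/mpe-table"
       "?api-version=2018-06-01")

# groupId "table" targets the Table service sub-resource of the account.
body = {"properties": {"privateLinkResourceId": storage_id, "groupId": "table"}}
resp = requests.put(url, json=body,
                    headers={"Authorization": f"Bearer {token}"})
print(resp.status_code, resp.json())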