
I am getting the following error while running a U-SQL activity in a pipeline in ADF:

Error in Activity:

{"errorId":"E_CSC_USER_SYNTAXERROR","severity":"Error","component":"CSC",
    "source":"USER","message":"syntax error.
     Final statement did not end with a semicolon","details":"at token 'txt', line 3\r\nnear the ###:\r\n**************\r\nDECLARE @in string = \"/demo/SearchLog.txt\";\nDECLARE @out string = \"/scripts/Result.txt\";\nSearchLogProcessing.txt ### \n",
    "description":"Invalid syntax found in the script.",
    "resolution":"Correct the script syntax, using expected token(s) as a guide.","helpLink":"","filePath":"","lineNumber":3,
    "startOffset":109,"endOffset":112}].

Here is the code of the output dataset, the pipeline, and the U-SQL script which I am trying to execute in the pipeline.

OutputDataset:

{
"name": "OutputDataLakeTable",
"properties": {
    "published": false,
    "type": "AzureDataLakeStore",
    "linkedServiceName": "LinkedServiceDestination",
    "typeProperties": {
        "folderPath": "scripts/"
    },
    "availability": {
        "frequency": "Hour",
        "interval": 1
    }
    }
}

Pipeline:

{
    "name": "ComputeEventsByRegionPipeline",
    "properties": {
        "description": "This is a pipeline to compute events for en-gb locale and date less than 2012/02/19.",
        "activities": [
            {
                "type": "DataLakeAnalyticsU-SQL",
                "typeProperties": {
                    "script": "SearchLogProcessing.txt",
                    "scriptPath": "scripts\\",
                    "degreeOfParallelism": 3,
                    "priority": 100,
                    "parameters": {
                        "in": "/demo/SearchLog.txt",
                        "out": "/scripts/Result.txt"
                    }
                },
                "inputs": [
                    {
                        "name": "InputDataLakeTable"
                    }
                ],
                "outputs": [
                    {
                        "name": "OutputDataLakeTable"
                    }
                ],
                "policy": {
                    "timeout": "06:00:00",
                    "concurrency": 1,
                    "executionPriorityOrder": "NewestFirst",
                    "retry": 1
                },
                "scheduler": {
                    "frequency": "Minute",
                    "interval": 15
                },
                "name": "CopybyU-SQL",
                "linkedServiceName": "AzureDataLakeAnalyticsLinkedService"
            }
        ],
        "start": "2017-01-03T12:01:05.53Z",
        "end": "2017-01-03T13:01:05.53Z",
        "isPaused": false,
        "hubName": "denojaidbfactory_hub",
        "pipelineMode": "Scheduled"
    }
}

Here is my U-SQL script which I am trying to execute using the "DataLakeAnalyticsU-SQL" activity type.

@searchlog =
    EXTRACT UserId          int,
            Start           DateTime,
            Region          string,
            Query           string,
            Duration        int?,
            Urls            string,
            ClickedUrls     string
    FROM @in
    USING Extractors.Text(delimiter:'|');

@rs1 =
    SELECT Start, Region, Duration
    FROM @searchlog
    WHERE Region == "kota";


OUTPUT @rs1
    TO @out
    USING Outputters.Text(delimiter:'|');

Please suggest how I can resolve this issue.

Jai

3 Answers


Your activity definition is missing the scriptLinkedService attribute, so ADF is submitting the content of the script attribute (the literal text "SearchLogProcessing.txt") as the U-SQL script itself, which is exactly what the compiler error shows. You also (currently) need to place the U-SQL script in Azure Blob Storage to run it successfully, so you also need an AzureStorage linked service, for example:

{
    "name": "StorageLinkedService",
    "properties": {
        "description": "",
        "type": "AzureStorage",
        "typeProperties": {
            "connectionString": "DefaultEndpointsProtocol=https;AccountName=myAzureBlobStorageAccount;AccountKey=**********"
        }
    }
}

Create this linked service, replacing myAzureBlobStorageAccount with your Blob Storage account name, then place the U-SQL script (SearchLogProcessing.txt) in a container there and try again. In my example pipeline below, I have a container called adlascripts in my Blob store and the script is in there.

Make sure the scriptPath is complete, as Alexandre mentioned. Start of the pipeline:

{
    "name": "ComputeEventsByRegionPipeline",
    "properties": {
        "description": "This is a pipeline to compute events for en-gb locale and date less than 2012/02/19.",
        "activities": [
            {
                "type": "DataLakeAnalyticsU-SQL",
                "typeProperties": {
                    "scriptPath": "adlascripts\\SearchLogProcessing.txt",
                    "scriptLinkedService": "StorageLinkedService",
                    "degreeOfParallelism": 3,
                    "priority": 100,
                    "parameters": {
                        "in": "/input/SearchLog.tsv",
                        "out": "/output/Result.tsv"
                    }
                },
...

The input and output .tsv files can live in the data lake and use the AzureDataLakeStoreLinkedService linked service.
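
For reference, here is a minimal sketch of what such a Data Lake Store linked service can look like in the ADF v1 JSON format (the account name and service principal values are placeholders, not taken from the question):

{
    "name": "AzureDataLakeStoreLinkedService",
    "properties": {
        "type": "AzureDataLakeStore",
        "typeProperties": {
            "dataLakeStoreUri": "https://<myadlsaccount>.azuredatalakestore.net/webhdfs/v1",
            "servicePrincipalId": "<service principal id>",
            "servicePrincipalKey": "<service principal key>",
            "tenant": "<tenant id>"
        }
    }
}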

I can see you are trying to follow the demo at https://learn.microsoft.com/en-us/azure/data-factory/data-factory-usql-activity#script-definition. It is not the most intuitive demo and there seem to be some gaps: where is the definition for StorageLinkedService? Where is SearchLogProcessing.txt? I eventually found them by googling, but there should be links on the page. I got it to work, but felt a bit like Harry Potter in the Half-Blood Prince.

wBob
  • Thanks wBob. It worked fine for me, but we can only use Azure Storage for the script linked service, not Azure Data Lake Store. – Jai Jan 04 '17 at 06:05
  • Yes, I got a very specific (and unusually helpful) error when trying to store the U-SQL script in ADLS: "scriptLinkedService `AzureDataStoreLinkedService` is not supported. Currently scriptLinkedService can only accept an Azure Storage linked service. Please use an Azure Storage linked service and put your script in the blob instead." – wBob Jan 04 '17 at 09:40
  • Great investigation, Harry-Bob :) – Alexandre Gattiker Jan 05 '17 at 09:07
  • In case you were thinking about using a Shared Access Signature to access the container with the scripts so that you wouldn't have to deal with account keys, let me save you some time by pointing out that this doesn't work either: same error message as above. The SAS URL works fine in the browser, though! – Gavin Campbell May 04 '17 at 18:00

Remove the script attribute in your U-SQL activity definition and provide the complete path to your script (including filename) in the scriptPath attribute.

Reference: https://learn.microsoft.com/en-us/azure/data-factory/data-factory-usql-activity
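
As a sketch, combining this with the scriptLinkedService from the accepted answer, the relevant part of the activity would look something like this (the container and file names follow the question and are illustrative only):

"typeProperties": {
    "scriptPath": "scripts/SearchLogProcessing.txt",
    "scriptLinkedService": "StorageLinkedService",
    "degreeOfParallelism": 3,
    "priority": 100,
    "parameters": {
        "in": "/demo/SearchLog.txt",
        "out": "/scripts/Result.txt"
    }
}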

  • If I do not use the script attribute, it gives me an error that the U-SQL script is not found and to use script or scriptLinkedService. So I have to use the script attribute there. – Jai Jan 03 '17 at 13:03
  • I have used "scripts\\SearchLogProcessing.txt" and the same file was present in ADLS. Still, I was getting the error that the U-SQL script was not found. – Jai Jan 03 '17 at 13:30
  • Use `scripts/SearchLogProcessing.txt` (forward slash) – Alexandre Gattiker Jan 03 '17 at 14:28

I had a similar issue, where Azure Data Factory would not recognize my script files. A way to avoid the whole issue, while not having to paste a lot of code, is to register a stored procedure. You can do it like this:

DROP PROCEDURE IF EXISTS master.dbo.sp_test;
CREATE PROCEDURE master.dbo.sp_test() 
AS  
BEGIN 

@searchlog =
    EXTRACT UserId          int,
            Start           DateTime,
            Region          string,
            Query           string,
            Duration        int?,
            Urls            string,
            ClickedUrls     string
    FROM @in
    USING Extractors.Text(delimiter:'|');

@rs1 =
    SELECT Start, Region, Duration
    FROM @searchlog
    WHERE Region == "kota";


OUTPUT @rs1
    TO @out
    USING Outputters.Text(delimiter:'|');
END;

After running this, you can use

"script": "master.dbo.sp_test()"

in your JSON pipeline definition. Whenever you update the U-SQL script, simply re-run the definition of the procedure. Then there will be no need to copy script files to Blob Storage.
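
For illustration, a minimal sketch of the activity's typeProperties with this approach, assuming the same in/out parameters as in the question:

"typeProperties": {
    "script": "master.dbo.sp_test()",
    "degreeOfParallelism": 3,
    "priority": 100,
    "parameters": {
        "in": "/demo/SearchLog.txt",
        "out": "/scripts/Result.txt"
    }
}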

Øystein S