
I am trying to run a U-SQL script on Azure from C# code. After the code executes, everything is created on Azure (the data factory, linked services, pipelines, and datasets), but ADF does not execute the U-SQL script. I think there is an issue with the start time and end time configured in the pipeline code.

I followed this article to build the console application: Create, monitor, and manage Azure data factories using Data Factory .NET SDK

Here is the URL of my complete C# code project for download. https://1drv.ms/u/s!AltdTyVEmoG2ijOupx-EjCM-8Zk4

Could someone please help me find my mistake?

C# code to configure pipeline:

        DateTime PipelineActivePeriodStartTime = new DateTime(2017, 1, 12, 0, 0, 0, 0, DateTimeKind.Utc);
        DateTime PipelineActivePeriodEndTime = PipelineActivePeriodStartTime.AddMinutes(60);
        string PipelineName = "ComputeEventsByRegionPipeline";

        var usqlparams = new Dictionary<string, string>();
        usqlparams.Add("in", "/Samples/Data/SearchLog.tsv");
        usqlparams.Add("out", "/Output/testdemo1.tsv");

        client.Pipelines.CreateOrUpdate(resourceGroupName, dataFactoryName,
        new PipelineCreateOrUpdateParameters()
        {
            Pipeline = new Pipeline()
            {
                Name = PipelineName,
                Properties = new PipelineProperties()
                {
                    Description = "This is a demo pipe line.",

                    // Initial value for pipeline's active period. With this, you won't need to set slice status
                    Start = PipelineActivePeriodStartTime,
                    End = PipelineActivePeriodEndTime,
                    IsPaused = false,

                    Activities = new List<Activity>()
                    {
                        new Activity()
                        {
                            TypeProperties = new DataLakeAnalyticsUSQLActivity("@searchlog = EXTRACT UserId int, Start DateTime, Region string, Query string, Duration int?, Urls string, ClickedUrls string FROM @in USING Extractors.Tsv(nullEscape:\"#NULL#\"); @rs1 = SELECT Start, Region, Duration FROM @searchlog; OUTPUT @rs1 TO @out USING Outputters.Tsv(quoting:false);")
                            {
                                DegreeOfParallelism = 3,
                                Priority = 100,
                                Parameters = usqlparams
                            },
                            Inputs = new List<ActivityInput>()
                            {
                                new ActivityInput(Dataset_Source)
                            },
                            Outputs = new List<ActivityOutput>()
                            {
                                new ActivityOutput(Dataset_Destination)
                            },
                            Policy = new ActivityPolicy()
                            {
                                Timeout = new TimeSpan(6,0,0),
                                Concurrency = 1,
                                ExecutionPriorityOrder = ExecutionPriorityOrder.NewestFirst,
                                Retry = 1
                            },
                            Scheduler = new Scheduler()
                            {
                                Frequency = "Day",
                                Interval = 1
                            },
                            Name = "EventsByRegion",
                            LinkedServiceName = "AzureDataLakeAnalyticsLinkedService"
                        }
                    }
                }
            }
        });

I just noticed something in the Azure Data Factory Monitor and Manage view: the status of the pipeline is Waiting: DatasetDependencies. Do I need to modify something in the code for this?

Kishan Gupta
  • I am sorry, but if you cannot post the relevant portion of the code here, then it is hard or very time-consuming to help. – Peter Bons Jan 16 '17 at 13:00
  • Hi, didn't you post the same question last week? What happened to that one? – wBob Jan 16 '17 at 13:05
  • @wBob yes, but this time I have shared the complete C# console project so that anyone can download and look into the whole code. – Kishan Gupta Jan 16 '17 at 14:44
  • What do you mean more precisely by "your U-SQL script is not executed"? Do you get an error, or do you mean the activity never executes? Can you share your pipeline definition? Have you looked at your scheduled activity windows in https://learn.microsoft.com/en-us/azure/data-factory/data-factory-monitor-manage-pipelines? – Alexandre Gattiker Jan 16 '17 at 14:50
  • I mean that the activity configured inside the pipeline is not being executed on Azure. The pipeline code has been added to the question. – Kishan Gupta Jan 16 '17 at 14:55
  • I am expecting the U-SQL query (activity) to run automatically and write the output file to the output folder in Data Lake Store after the C# code executes, but that is not happening. – Kishan Gupta Jan 16 '17 at 15:01

1 Answer


If no other activity in the data factory produces your source dataset, you need to mark that dataset as external by adding this attribute to it:

"external": true
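
For context, here is a minimal sketch of where that attribute sits in a Data Factory (version 1) dataset definition. It assumes the source is an Azure Data Lake Store dataset with a daily availability matching the activity's scheduler. The dataset and linked service names in angle brackets are placeholders, not values from the question; the folder and file come from the `in` parameter in the question's code.

```json
{
    "name": "<your-source-dataset-name>",
    "properties": {
        "type": "AzureDataLakeStore",
        "linkedServiceName": "<your-data-lake-store-linked-service>",
        "typeProperties": {
            "folderPath": "Samples/Data/",
            "fileName": "SearchLog.tsv"
        },
        "external": true,
        "availability": {
            "frequency": "Day",
            "interval": 1
        }
    }
}
```

If the dataset is created through the .NET SDK rather than JSON, the equivalent should be setting `External = true` on the dataset's `DatasetProperties`. I have not verified that against the linked project, so treat it as an assumption.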

https://learn.microsoft.com/en-us/azure/data-factory/data-factory-faq

https://learn.microsoft.com/en-us/azure/data-factory/data-factory-create-datasets