I am struggling a lot with this task. I have to download files from SFTP and then parse them. I am using Durable functions like this
[FunctionName("MainOrch")]
public async Task<List<string>> RunOrchestrator(
[OrchestrationTrigger] IDurableOrchestrationContext context, ILogger log)
{
try
{
var filesDownloaded = new List<string>();
var filesUploaded = new List<string>();
var files = await context.CallActivityAsync<List<string>>("SFTPGetListOfFiles", null);
log.LogInformation("!!!!FilesFound*******!!!!!" + files.Count);
if (files.Count > 0)
{
foreach (var fileName in files)
{
filesDownloaded.Add(await context.CallActivityAsync<string>("SFTPDownload", fileName));
}
var parsingTasks = new List<Task<string>>(filesDownloaded.Count);
foreach (var downlaoded in filesDownloaded)
{
var parsingTask = context.CallActivityAsync<string>("PBARParsing", downlaoded);
parsingTasks.Add(parsingTask);
}
await Task.WhenAll(parsingTasks);
}
return filesDownloaded;
}
catch (Exception ex)
{
throw;
}
}
SFTPGetListOfFiles: This functions connects to SFTP and gets the list of files in a folder and return.
SFTPDownload: This function is suppose to connect to SFTP and download each file in Azure Function's Tempt Storage. and return the download path. (each file is from 10 to 60 MB)
[FunctionName("SFTPDownload")]
public async Task<string> SFTPDownload([ActivityTrigger] string name, ILogger log, Microsoft.Azure.WebJobs.ExecutionContext context)
{
var downloadPath = "";
try
{
using (var session = new Session())
{
try
{
session.ExecutablePath = Path.Combine(context.FunctionAppDirectory, "winscp.exe");
session.Open(GetOptions(context));
log.LogInformation("!!!!!!!!!!!!!!Connected For Download!!!!!!!!!!!!!!!");
TransferOptions transferOptions = new TransferOptions();
transferOptions.TransferMode = TransferMode.Binary;
downloadPath = Path.Combine(Path.GetTempPath(), name);
log.LogInformation("Downloading " + name);
var transferResult = session.GetFiles("/Receive/" + name, downloadPath, false, transferOptions);
log.LogInformation("Downloaded " + name);
// Throw on any error
transferResult.Check();
log.LogInformation("!!!!!!!!!!!!!!Completed Download !!!!!!!!!!!!!!!!");
}
catch (Exception ex)
{
log.LogError(ex.Message);
}
finally
{
session.Close();
}
}
}
catch (Exception ex)
{
log.LogError(ex.Message);
_traceService.TraceException(ex);
}
return downloadPath;
}
PBARParsing: function has to get the stream of that file and process it (processing a 60 MB file might take few minutes on Scale up of S2 and Scale out with 10 instances.)
[FunctionName("PBARParsing")]
public async Task PBARParsing([ActivityTrigger] string pathOfFile,
ILogger log)
{
var theSplit = pathOfFile.Split("\\");
var name = theSplit[theSplit.Length - 1];
try
{
log.LogInformation("**********Starting" + name);
Stream stream = File.OpenRead(pathOfFile);
i want the download of all files to be completed using SFTPDownload thats why "await" is in a loop. and then i want parsing to run in parallel.
Question 1: Does the code in MainOrch function seems correct for doing these 3 things 1)getting the names of files, 2) downloading them one by one and not starting the parsing function until all files are downloaded. and then 3)parsing the files in parallel. ?
I observed that what i mentioned in Question 1 is working as expected.
Question 2: 30% of the files are parsed and for the 80% i see errors that "Could not find file 'D:\local\Temp\fileName'" is azure function removing the files after i place them ? is there any other approach i can take? If i change the path to "D:\home" i might see "File is being used by another process" error. but i haven't tried it yet. out the 68 files on SFTP weirdly last 20 ran and first 40 files were not found at that path and this is in sequence.
Question3: I also see this error " Singleton lock renewal failed for blob 'func-eres-integration-dev/host' with error code 409: LeaseIdMismatchWithLeaseOperation. The last successful renewal completed at 2020-08-08T17:57:10.494Z (46005 milliseconds ago) with a duration of 155 milliseconds. The lease period was 15000 milliseconds." does it tells something ? it came just once though.
update after using "D:\home" i am not getting file not found errors