I have an FTP location and want to use a Beam pipeline to read the files there and process them. Essentially, I want to read the file list from the FTP location once a minute and process the files. Do you have any thoughts on how to do this?

I have already written the processing part of the pipeline; I am just struggling with reading the FTP location every minute.

Any help would be appreciated.

turingMan

1 Answer

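You can do this with Beam's GenerateSequence transform, which can emit one element per minute and so act as a timer for the rest of the pipeline. ListAllFilesInFtpFn and DownloadFilesFromFtpFn are DoFns you would write yourself. It would look something like this: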

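// Duration here is org.joda.time.Duration.
// Emit one element per minute, then list and download the files on each tick.
pipeline.apply(GenerateSequence.from(0).withRate(1, Duration.standardMinutes(1)))
    .apply(ParDo.of(new ListAllFilesInFtpFn(serverAddress)))
    .apply(ParDo.of(new DownloadFilesFromFtpFn(serverAddress)));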
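
One thing to watch: every tick lists the whole directory again, so the listing step will probably want to skip files it has already emitted. Here is a minimal plain-Java sketch of that filtering. The class name SeenFileFilter and its onlyNew method are hypothetical, not part of Beam, and I am assuming a file is identified by its name alone.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper (not a Beam class): remembers file names it has already returned. */
class SeenFileFilter {
    private final Set<String> seen = new HashSet<>();

    /** Returns the names in the listing that no earlier call returned, in listing order. */
    List<String> onlyNew(List<String> listing) {
        List<String> fresh = new ArrayList<>();
        for (String name : listing) {
            // Set.add returns true only the first time a given name is inserted.
            if (seen.add(name)) {
                fresh.add(name);
            }
        }
        return fresh;
    }
}
```

You could call something like onlyNew from inside the listing DoFn, but the set lives in memory in a single instance, so it is lost on restart and is not shared across workers. Treat it as a starting point rather than a guarantee.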

Does this make sense?

Pablo