Azure Batch on its own doesn't have MapReduce APIs like Spark. I am wondering if any of the Runners provided by Apache Beam can be used to start a MapReduce job on the pool of VMs created under Azure Batch.

Has anyone got Apache Beam APIs to work against Azure Batch?

By the way, yes, I've looked at this (https://github.com/Azure-Samples/azure-batch-samples/blob/master/CSharp/TextSearch/readme.md), but it is not exactly MapReduce. It simply splits the file into many smaller files; it doesn't process each line.
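To make "process each line" concrete, here is a minimal sketch of the map, shuffle, and reduce steps I have in mind, written in plain Python on a single machine. It does not use any Apache Beam or Azure Batch API, and the function names (`map_line`, `shuffle`, `reduce_counts`, `word_count`) are hypothetical ones chosen just for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_line(line):
    """Map step: emit a (word, 1) pair for every word in a single line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce step: collapse each key's list of values into one total."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines):
    """Run map -> shuffle -> reduce over an iterable of text lines."""
    mapped = chain.from_iterable(map_line(line) for line in lines)
    return reduce_counts(shuffle(mapped))


if __name__ == "__main__":
    print(word_count(["a b a", "b c"]))
```

In a distributed setting, the map and reduce steps would run on different VMs and the shuffle would move data between them, which is the part the TextSearch sample does not cover.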

DilTeam
  • I do not think this is possible at the moment, but that would be great to have for Beam / Azure Batch. Feel free to request it at dev@beam.apache.org, or via Azure's channels :) – Pablo Sep 07 '18 at 22:52
  • The problem seems to be that Azure Batch simply gives us a bunch of VMs, but these don't talk to each other. There's no resource manager such as YARN or Mesos, so I am not sure whether Apache Beam can do anything here. Still, I would be curious to find out. – DilTeam Sep 12 '18 at 05:55

0 Answers