I have 1000k blobs. I need to remove the empty lines from each blob's content and overwrite the blob. To achieve this, I plan to use one of:

  1. Azure WebJobs
  2. Azure Data Factory
  3. Azure Batch

Could you give me some advice on which service is the best fit for this case?

jeb
duy

2 Answers

Use Azure Functions.

Easily build the apps you need using simple, serverless functions that scale to meet demand. Use the programming language of your choice, and don’t worry about servers or infrastructure.

Data Factory is a data integration service meant to create, schedule, and manage your data integration. This is not the tool for removing empty lines in files.

Azure Batch is a high-performance computing solution that can spin up lots of VMs. This is very much overkill for removing empty lines from files.
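
For a sense of scale, the per-blob work is only a few lines. Here is a minimal Python sketch, assuming the blobs are UTF-8 text and that whitespace-only lines also count as empty (both are assumptions, not stated in the question). `remove_empty_lines` and `clean_blob_bytes` are hypothetical helper names, not part of any Azure SDK:

```python
def remove_empty_lines(text: str) -> str:
    """Drop empty lines, keeping the original line endings of the rest.

    Assumption: a line containing only whitespace counts as empty.
    Note: str.splitlines also splits on less common boundaries such as
    form feed, not only on \n and \r\n.
    """
    kept = [line for line in text.splitlines(keepends=True) if line.strip()]
    return "".join(kept)


def clean_blob_bytes(data: bytes, encoding: str = "utf-8") -> bytes:
    """Decode a blob's raw content, remove empty lines, and re-encode it.

    Assumption: the blob is text in the given encoding (UTF-8 by default).
    """
    return remove_empty_lines(data.decode(encoding)).encode(encoding)
```

Whichever service hosts it, the code only has to download the blob, pass its content through something like `clean_blob_bytes`, and upload the result over the original.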

rickvdbosch

Someone already suggested Azure Functions; I would add to that. Specifically, use Azure Durable Functions and parallelize the work with the fan-out/fan-in pattern.

All you need is to track which blobs you have already processed. It would definitely be cheaper and quicker. You can find more at the link below; the example there specifically covers parallelizing tasks for blobs.

https://learn.microsoft.com/en-us/azure/azure-functions/durable/durable-functions-cloud-backup
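
The shape of fan-out/fan-in can be sketched locally with plain Python. This is not the Durable Functions API: a thread pool stands in for the orchestrator, a dict stands in for the blob container, and a set tracks which blobs were already processed. All names here (`process_blob`, `orchestrate`, `store`) are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor


def remove_empty_lines(text: str) -> str:
    # Assumption: whitespace-only lines count as empty.
    return "".join(l for l in text.splitlines(keepends=True) if l.strip())


def process_blob(store: dict, name: str) -> str:
    """Activity step: clean one blob, overwrite it, and return its name."""
    store[name] = remove_empty_lines(store[name])
    return name


def orchestrate(store: dict, already_processed: set, max_workers: int = 8) -> list:
    """Fan out one task per unprocessed blob, then fan in the results."""
    pending = [n for n in store if n not in already_processed]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Fan-out: tasks run in parallel. Fan-in: list() waits for all of them.
        done = list(pool.map(lambda n: process_blob(store, n), pending))
    # Record progress so a rerun skips blobs that were already cleaned.
    already_processed.update(done)
    return done


# In-memory stand-in for a container; "b.txt" is marked as already handled.
store = {"a.txt": "x\n\ny\n", "b.txt": "\n\nz"}
processed = {"b.txt"}
done = orchestrate(store, processed)
```

In a real Durable Functions app the orchestrator would schedule one activity per blob and wait for all of them, and the processed set would have to live somewhere durable; the linked article shows the actual API.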

Imran Arshad