Business Logic

We have the following business logic to accomplish:

Repeat 1 million times:

  1. download the file at index i
  2. unzip the file
  3. extract some info from the file
  4. delete the file

Current Akka solution

Our current Akka solution creates 1 million actors, each responsible for downloading one file. Once a download finishes, that actor creates another actor to take care of steps 2, 3, and 4.

The problem

When we ran the process, Akka appeared to give top priority to the download actors, while the rest of the actors were starved.

We know this because the machine's disk keeps filling up: the download actors are constantly downloading, but the other actors don't get a chance to scan and delete the files.

Questions

  1. Is there a way to force Akka not to starve the actors further down the actor chain?
  2. Is there a way to tell a download actor to wait until it gets a notification that it can continue (e.g. no more than 1000 files on disk at the same time)?

Thanks.

t-rex-50
  • As for question 2: Use only 1000 actors, which are responsible for all the steps -> 1000 files at max on disk, no possible starvation, since one actor is responsible for the whole step chain. You can use a queue with all the files to be downloaded, so that the 1000 actors can ask for a new file, once they are done with theirs. – thwiegan Jun 21 '17 at 08:35
  • 1000000 actors downloading 1000000 files simultaneously is nonsense. A single thread would do this work much faster. Steps 2-4 should be combined in a single job and submitted to a thread pool of size availableProcessors+1. – Alexei Kaigorodov Jun 21 '17 at 10:12
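
The bounded-worker idea from the first comment can be sketched roughly as follows. This is a minimal illustration, not the commenter's code: the names `Coordinator`, `Worker`, `RequestWork`, and `Work` are hypothetical, and the `process` function stands in for the whole download, unzip, extract, delete chain. Because each worker asks for a new index only after finishing its current file, at most `workerCount` files should exist on disk at any time.

```scala
import akka.actor.{Actor, ActorRef, ActorSystem, Props}

// Hypothetical message protocol for pulling work.
case object RequestWork
final case class Work(index: Int)

// Hands out file indices one at a time and shuts the system down when all workers are idle.
class Coordinator(total: Int, workerCount: Int, process: Int => Unit) extends Actor {
  private var next = 0 // next file index to hand out
  private var idle = 0 // workers that asked for work after the indices ran out

  override def preStart(): Unit =
    (1 to workerCount).foreach(i => context.actorOf(Worker.props(self, process), s"worker-$i"))

  def receive: Receive = {
    case RequestWork if next < total =>
      sender() ! Work(next)
      next += 1
    case RequestWork =>
      // No reply is sent, so each worker reaches this branch exactly once.
      idle += 1
      if (idle == workerCount) context.system.terminate()
  }
}

object Coordinator {
  def props(total: Int, workerCount: Int, process: Int => Unit): Props =
    Props(new Coordinator(total, workerCount, process))
}

// Runs the full step chain for one file, then pulls the next index.
class Worker(coordinator: ActorRef, process: Int => Unit) extends Actor {
  override def preStart(): Unit = coordinator ! RequestWork

  def receive: Receive = {
    case Work(index) =>
      process(index) // download, unzip, extract, delete (placeholder)
      coordinator ! RequestWork
  }
}

object Worker {
  def props(coordinator: ActorRef, process: Int => Unit): Props =
    Props(new Worker(coordinator, process))
}

object BoundedWorkers {
  def main(args: Array[String]): Unit = {
    val system = ActorSystem("bounded-workers")
    // Illustrative sizes: 20 files, 4 workers.
    system.actorOf(Coordinator.props(20, 4, i => println(s"processed file $i")), "coordinator")
  }
}
```

Since `process` would block in a real run, the workers would probably also need their own dispatcher so they do not occupy the default one.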

1 Answer

Use different dispatchers for the two types of actor:

In your config you can define a separate dispatcher as (for example):

my-dispatcher {
  type = Dispatcher
  executor = "thread-pool-executor"
  thread-pool-executor {
    fixed-pool-size = 32
  }
  throughput = 100
}

And then you can assign that to a specific actor at creation:

val myActor = context.actorOf(Props[MyActor].withDispatcher("my-dispatcher"), "myactor1")

Dispatchers are, effectively, thread pools. Giving each type of actor its own dispatcher ensures that the slow, blocking download operations cannot take the threads the other actors need. This approach is generally referred to as bulkheading: the idea is that if one part of the app fails or stalls, the rest remains responsive.
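
Applied to the two actor types in the question, the setup might look something like the sketch below. The actor classes, dispatcher names, pool sizes, and the file path are all assumptions made for illustration; the message handlers are placeholders for the real download and processing code.

```scala
import akka.actor.{Actor, ActorSystem, Props}
import com.typesafe.config.{Config, ConfigFactory}

// Hypothetical actors; the message handlers are placeholders for the real work.
class DownloadActor extends Actor {
  def receive: Receive = {
    case index: Int => () // blocking download of file `index` would go here
  }
}

class ProcessActor extends Actor {
  def receive: Receive = {
    case path: String => () // unzip, extract info, and delete would go here
  }
}

object DispatcherSketch {
  // One thread pool per actor type; pool sizes are illustrative, not tuned.
  val config: Config = ConfigFactory.parseString(
    """
      |download-dispatcher {
      |  type = Dispatcher
      |  executor = "thread-pool-executor"
      |  thread-pool-executor { fixed-pool-size = 16 }
      |}
      |process-dispatcher {
      |  type = Dispatcher
      |  executor = "thread-pool-executor"
      |  thread-pool-executor { fixed-pool-size = 16 }
      |}
      |""".stripMargin).withFallback(ConfigFactory.load())

  def main(args: Array[String]): Unit = {
    val system = ActorSystem("files", config)
    // Each actor type runs on its own pool, so downloads cannot take the processing threads.
    val downloader = system.actorOf(Props(new DownloadActor).withDispatcher("download-dispatcher"), "downloader")
    val processor = system.actorOf(Props(new ProcessActor).withDispatcher("process-dispatcher"), "processor")
    downloader ! 0
    processor ! "/tmp/file-0.zip" // hypothetical path
    system.terminate()
  }
}
```

Separate pools keep the processing actors scheduled, but they do not by themselves limit how many files sit on disk; that would still need some form of back-pressure on the download side.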

For more info, see the Akka documentation on dispatchers.

Diego Martinoia