
The idea is to run a parallel job on a 96-core machine, using a work-stealing ForkJoinPool.

Below is the code I'm using so far:

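import scala.collection.parallel.{ForkJoinTaskSupport, ParSeq}
import scala.concurrent.forkjoin.ForkJoinPool

// Convert the items to a parallel collection backed by a dedicated ForkJoinPool.
val sequence: ParSeq[Item] = getItems().par
sequence.tasksupport = new ForkJoinTaskSupport(new ForkJoinPool())

// Process every item in parallel and collect the results.
val results = for {
  item <- sequence
  res   = doSomethingWith(item)
} yield res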

Here, sequence has about 20,000 items. Most items take 2-8 seconds to process, and only about 200 of them take longer, around 40 seconds.

The problem:

Everything runs fine, but the work-stealing aspect doesn't seem to work well. Here is the expected total CPU load (black) compared to the actual load (blue) over time:

Expected vs Actual work loads

Looking at the CPU activity, it's very clear that fewer and fewer cores get used as the job progresses towards the end. During the last 10 minutes or so, only 2 or 3 cores are still busy, processing dozens of items sequentially, one after the other.

How come the items still in the queue don't get stolen by the other free cores, even when using a ForkJoinPool, which is supposed to be work-stealing?

https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ForkJoinPool.html

Jivan
  • Sooo... there's an accepted answer to _why_ it's not work stealing. Is there something that can be done about it? – Stephen Feb 16 '22 at 08:40

1 Answer

Each worker thread has its own internal task queue, which is protected from work stealing by other threads in order to limit interactions between workers.

This probably explains the behavior you're seeing, especially if the long-running tasks are not randomly distributed across your item set.
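To illustrate the point about task granularity, here is a minimal sketch that submits one task per item, so that no worker is handed a batch of items up front and any idle worker can pick up the next pending task. It uses `java.util.concurrent.ForkJoinPool` directly rather than parallel collections. `PerItemTasks`, `processAll`, and the trivial `doSomethingWith` body are hypothetical stand-ins, not the code from the question.

```scala
import java.util.concurrent.{Callable, ForkJoinPool}

object PerItemTasks {
  // Hypothetical stand-in for the question's doSomethingWith(item).
  def doSomethingWith(item: Int): Int = item * 2

  def processAll(items: Seq[Int], parallelism: Int): Seq[Int] = {
    val pool = new ForkJoinPool(parallelism)
    try {
      // One task per item: work is not pre-assigned to workers in batches.
      // toVector forces eager submission even if `items` is a lazy sequence.
      val futures = items.toVector.map { item =>
        pool.submit(new Callable[Int] { def call(): Int = doSomethingWith(item) })
      }
      // join() blocks until each result is ready; order follows `items`.
      futures.map(_.join())
    } finally {
      pool.shutdown()
    }
  }

  def main(args: Array[String]): Unit = {
    val results = processAll(1 to 20, parallelism = 4)
    println(results.sum) // prints 420
  }
}
```

Whether this evens out the load for the real workload depends on how long each item takes relative to the per-task overhead, so it is worth measuring before adopting it.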

C4stor
  • Doesn't this contradict the documentation? _A ForkJoinPool differs from other kinds of ExecutorService mainly by virtue of employing work-stealing: all threads in the pool attempt to find and execute tasks submitted to the pool and/or created by other active tasks_ – Jivan Feb 01 '18 at 17:05
  • As you write, stealable tasks are the ones that are either still in the common pool or explicitly created by other active tasks. So I suppose that "doSomethingWith(item)" doesn't actually create subtasks (but you tell me!), and at that point neither of these conditions is met, so no tasks are stealable. – C4stor Feb 01 '18 at 17:14
  • Digging a bit, I found this article: http://www.h-online.com/developer/features/The-fork-join-framework-in-Java-7-1762357.html which is a decent explanation of what's going on in a ForkJoinPool (at least better than what I could write :D). Also, there's a getStealCount() method on your pool, which you could use to monitor whether any stealing is actually happening (my guess is none :D). – C4stor Feb 01 '18 at 17:17
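The `getStealCount()` suggestion can be tried with a small sketch like the one below. It assumes a toy `RecursiveTask` that explicitly forks subtasks, which is the situation where stealing is possible. `StealCountDemo`, `SumTask`, and the threshold of 1000 are illustrative names and values, not part of the original question. The reported steal count is an estimate and varies between runs and JVM versions, so no particular value should be expected.

```scala
import java.util.concurrent.{ForkJoinPool, RecursiveTask}

object StealCountDemo {
  // Sums the integers in [from, until) by splitting the range in two
  // until it is small. Each fork() pushes a subtask onto the current
  // worker's queue, where an idle worker may steal it.
  class SumTask(from: Int, until: Int) extends RecursiveTask[Long] {
    override def compute(): Long =
      if (until - from <= 1000) {
        var acc = 0L
        var i = from
        while (i < until) { acc += i; i += 1 }
        acc
      } else {
        val mid = from + (until - from) / 2
        val left = new SumTask(from, mid)
        left.fork()                                    // make the left half stealable
        val right = new SumTask(mid, until).compute()  // do the right half here
        right + left.join()
      }
  }

  // Returns the computed sum and the pool's steal-count estimate.
  def run(n: Int, parallelism: Int): (Long, Long) = {
    val pool = new ForkJoinPool(parallelism)
    try {
      val total = pool.invoke(new SumTask(0, n))
      (total, pool.getStealCount)
    } finally {
      pool.shutdown()
    }
  }

  def main(args: Array[String]): Unit = {
    val (total, steals) = run(1000000, parallelism = 4)
    println(s"total=$total, steals=$steals")
  }
}
```

Calling `getStealCount` on the pool passed to `ForkJoinTaskSupport` in the question, before and after the job, would show in the same way whether any stealing takes place there.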