
I have a BlockingQueue of Runnable - I can simply execute all tasks using one of the TaskExecutor implementations, and they will all run in parallel. However, some Runnables depend on others: they need to wait for another Runnable to finish before they can be executed.

The rule is quite simple: every Runnable has a code. Two Runnables with the same code cannot run simultaneously, but if the codes differ they should run in parallel. In other words, all currently running Runnables must have different codes; all "duplicates" have to wait.

The problem is that there's no event/method/whatsoever fired when a thread ends. I could build such a notification into every Runnable, but I don't like this approach, because it would fire just before the thread ends, not after it has ended.

java.util.concurrent.ThreadPoolExecutor has an afterExecute method, but it is a no-op hook that needs to be overridden - Spring uses only the default implementation, so this method is effectively ignored.
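
For clarity, "implementing" that hook would mean subclassing ThreadPoolExecutor, roughly like this sketch (the class name is just an illustration):

import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class NotifyingExecutor extends ThreadPoolExecutor {
    public NotifyingExecutor(int poolSize) {
        super(poolSize, poolSize, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        // runs on the worker thread after r has completed (t holds the
        // exception, if any); a "task finished" signal could be raised here
    }
}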

Even if I did that, it would get complicated, because I would need to track two additional collections: one with the Runnables currently executing (no implementation gives access to this information) and one with those postponed because their code is a duplicate.

I like the BlockingQueue approach because there's no polling; a thread simply wakes up when something new is in the queue. But maybe there's a better way to manage such dependencies between Runnables, so should I give up on the BlockingQueue and use a different strategy?

M. Deinum
Marx

2 Answers


If the number of distinct codes is not that large, the approach with a separate single-thread executor for each possible code, offered by BarrySW19, is fine. If the total number of threads becomes unacceptable, then, instead of single-thread executors, we can use actors (from Akka or a similar library):

import akka.actor.UntypedActor;

public class WorkerActor extends UntypedActor {
  public void onReceive(Object message) {
    if (message instanceof Runnable) {
      // the actor's mailbox serializes messages, so jobs routed to the
      // same actor (i.e. with the same code) never run concurrently
      Runnable work = (Runnable) message;
      work.run();
    } else {
      // report an error
      unhandled(message);
    }
  }
}

As in the original solution, the ActorRefs for the WorkerActors are collected in a HashMap. Once the ActorRef workerActorRef corresponding to the given code has been obtained (retrieved or created), the Runnable job is submitted for execution with workerActorRef.tell(job).
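
For illustration, the lookup-and-tell step could be sketched like this (assuming Akka's classic untyped-actor API; ActorDispatcher and the actor names are hypothetical, and a ConcurrentHashMap is used so workers can be created safely from several threads):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.actor.Props;

public class ActorDispatcher {
  ActorSystem system = ActorSystem.create("workers");
  ConcurrentMap<String, ActorRef> workers = new ConcurrentHashMap<>();

  public void submit(String code, Runnable job) {
    // one actor per code; an actor processes its mailbox sequentially,
    // so two jobs with the same code never run at the same time
    // (this assumes code is a valid actor name)
    ActorRef workerActorRef = workers.computeIfAbsent(
        code, (k) -> system.actorOf(Props.create(WorkerActor.class), "worker-" + k));
    workerActorRef.tell(job, ActorRef.noSender());
  }
}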

If you don't want a dependency on an actor library, you can program WorkerActor from scratch:

import java.util.concurrent.Executor;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.LinkedBlockingQueue;

public class WorkerActor implements Runnable, Executor {
  Executor executor = ForkJoinPool.commonPool(); // or can be assigned in a constructor
  LinkedBlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
  boolean running = false;

  public synchronized void execute(Runnable job) {
    queue.add(job); // the queue is unbounded, so add() never blocks
    if (!running) {
      executor.execute(this); // execute this worker, not the job!
      running = true;
    }
  }

  public void run() {
    for (;;) {
      Runnable work;
      synchronized (this) {
        work = queue.poll();
        if (work == null) {
          running = false; // queue drained; the next execute() restarts the worker
          return;
        }
      }
      work.run();
    }
  }
}

When the WorkerActor worker corresponding to the given code has been obtained (retrieved or created), the Runnable job is submitted for execution with worker.execute(job).
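
A hypothetical dispatcher for this hand-rolled version might look like the following sketch (WorkerDispatcher is an illustrative name):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class WorkerDispatcher {
  ConcurrentMap<String, WorkerActor> workers = new ConcurrentHashMap<>();

  public void submit(String code, Runnable job) {
    // retrieve or create the worker that serializes all jobs with this code
    WorkerActor worker = workers.computeIfAbsent(code, (k) -> new WorkerActor());
    worker.execute(job);
  }
}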

Alexei Kaigorodov
  • Wouldn't you get an issue here that, if you submitted many jobs with the same code in a row, then most common pool threads would simply be looping waiting for the first one to get out of the way, and jobs submitted later for other codes would be blocked? – BarrySW19 Mar 11 '17 at 00:26
  • @BarrySW19 each user job is queued to the WorkerActor with the corresponding code and so does not occupy any pool thread. The actor itself can occupy at most one thread. So if at first many jobs with the same code were submitted, then only one pool thread is occupied with a WorkerActor, which processes those jobs sequentially, one by one. – Alexei Kaigorodov Mar 11 '17 at 09:21
  • Ah, I get it - every time a job is submitted it causes the Actor to loop until its queue is emptied - any more jobs submitted while it's working obviously get picked up as part of that process. This avoids the less efficient scanning used to select the next task in my second suggestion. – BarrySW19 Mar 11 '17 at 10:55

One alternate strategy which springs to mind is to have a separate single-thread executor for each possible code. Then, when you want to submit a new Runnable, you simply look up the correct executor for its code and submit the job.
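
In its simplest form, without any cap on overall concurrency, that could be sketched as follows (PerCodeExecutor is a hypothetical name):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PerCodeExecutor {
    private final ConcurrentMap<String, ExecutorService> executors = new ConcurrentHashMap<>();

    public void submit(String code, Runnable job) {
        // each code gets its own single-thread executor, which serializes its jobs
        executors.computeIfAbsent(code, (k) -> Executors.newSingleThreadExecutor())
                 .submit(job);
    }
}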

This may or may not be a good solution, depending on how many different codes you have. The main thing to consider is that the number of concurrently running threads could be as high as the number of different codes; if you have many different codes, that could be a problem.

Of course, you could use a Semaphore to restrict the number of concurrently running jobs; you would still create one thread per code, but only a limited number could actually execute at the same time. For example, this would serialise jobs by code, allowing up to three different codes to run concurrently:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

public class MultiPoolExecutor {
    // at most three jobs (necessarily of different codes) may run at once
    private final Semaphore semaphore = new Semaphore(3);

    private final ConcurrentMap<String, ExecutorService> serviceMap
            = new ConcurrentHashMap<>();

    public void submit(String code, Runnable job) {
        // one single-thread executor per code serializes same-code jobs
        ExecutorService executorService = serviceMap.computeIfAbsent(
                code, (k) -> Executors.newSingleThreadExecutor());
        executorService.submit(() -> {
            semaphore.acquireUninterruptibly();
            try {
                job.run();
            } finally {
                semaphore.release();
            }
        });
    }
}

Another approach would be to modify the Runnable to release a lock and check for jobs which could be run upon completion (thus avoiding polling) - something like the example below, which keeps all the jobs in a list until they can be submitted. The boolean latch ensures that only one job per code has been submitted to the thread pool at any one time. Whenever a new job arrives or a running one completes, the code checks again for new jobs which can be submitted (CodedRunnable is simply an extension of Runnable which has a code property).
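
CodedRunnable is not defined in the answer; a minimal definition might look like:

public interface CodedRunnable extends Runnable {
    // two CodedRunnables returning the same code must never run concurrently
    String getCode();
}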

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

public class SubmissionService {
    private final ExecutorService executorService = Executors.newFixedThreadPool(5);
    private final ConcurrentMap<String, AtomicBoolean> locks = new ConcurrentHashMap<>();
    private final List<CodedRunnable> jobs = new ArrayList<>();

    public void submit(CodedRunnable codedRunnable) {
        synchronized (jobs) {
            jobs.add(codedRunnable);
        }
        submitWaitingJobs();
    }

    private void submitWaitingJobs() {
        synchronized (jobs) {
            for (Iterator<CodedRunnable> iter = jobs.iterator(); iter.hasNext(); ) {
                CodedRunnable nextJob = iter.next();
                AtomicBoolean latch = locks.computeIfAbsent(
                        nextJob.getCode(), (k) -> new AtomicBoolean(false));
                // the latch flips to true only if no job with this code is running
                if (latch.compareAndSet(false, true)) {
                    iter.remove();
                    executorService.submit(() -> {
                        try {
                            nextJob.run();
                        } finally {
                            latch.set(false);
                            submitWaitingJobs(); // re-check for jobs unblocked by this one
                        }
                    });
                }
            }
        }
    }
}

The downside of this approach is that the code needs to scan through the entire list of waiting jobs after each task completes. Of course, you could make this more efficient - a completing task would actually only need to check for other jobs with the same code, so the jobs could be stored in a Map<String, List<Runnable>> structure instead to allow for faster processing.
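
A sketch of that optimization, keeping a per-code queue so that a completing task only inspects jobs with its own code (QueuedSubmissionService is an illustrative name, and this is untested):

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class QueuedSubmissionService {
    private final ExecutorService executorService = Executors.newFixedThreadPool(5);
    private final Map<String, Deque<CodedRunnable>> waiting = new HashMap<>();
    private final Set<String> runningCodes = new HashSet<>();

    public synchronized void submit(CodedRunnable job) {
        if (runningCodes.add(job.getCode())) {
            run(job); // nothing with this code is running, start immediately
        } else {
            waiting.computeIfAbsent(job.getCode(), (k) -> new ArrayDeque<>()).add(job);
        }
    }

    private void run(CodedRunnable job) {
        executorService.submit(() -> {
            try {
                job.run();
            } finally {
                runNext(job.getCode());
            }
        });
    }

    private synchronized void runNext(String code) {
        Deque<CodedRunnable> queue = waiting.get(code);
        if (queue == null || queue.isEmpty()) {
            runningCodes.remove(code); // no successor, free the code
        } else {
            run(queue.poll()); // only jobs with the same code are examined
        }
    }
}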

BarrySW19
  • I like this first approach, but it would be better if the ExecutorService were destroyed when no longer needed. – Marx Mar 14 '17 at 12:23