Java Concurrent Multi-Threaded Algorithm in a Web App - Slower Than Expected

Question

I ran into an issue in my Java web application today and need a 2nd (and 3rd, and 4th) set of eyeballs to see where I've messed up and how best to fix it.

The web app collects bids each round of an auction, and then fills in missing bids for bidders at the end of the round. I run the solving portion in parallel to speed it up. There are approximately 50 bidders bidding on 30 products.

Here's the pseudo-code of the application mixed with the actual Java code...

public void generateResults(final int round)
{
    // pseudo-code to retrieve all the bids, takes about 800ms
    final Bids bids = DB.getBids();  
    final Users bidders = UserService.lookup(UserData.BIDDER);

    ExecutorService exec = 
          Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors() + 1);
    Collection<Callable<BidSet>> tasks = new ArrayList<Callable<BidSet>>();

    for (int i=0; i<bidders.size(); i++)
    {
        tasks.add(new Callable<BidSet>()
        {
           public Bids call()
           {
             User bidder = Users.get(bidders.get(i));
             Bids bidsForUser = bids.findByUser(bidder.id);
             // do some manipulation of the bids with some DB calls
             Bids missingBids = // result of manipulated bids above
             Bids.store(missingBids);
           }
        });
    }

    List<Future<Bids>> results = exec.invokeAll(tasks);
}

The code runs as it should, and it writes to the DB correctly (an indexed InnoDB table). In my local tests on a 6-core machine, this algorithm will take about 3.3 seconds to complete. However, today on the server, which is a 4-core machine, this algorithm was taking 35 seconds to run. Not good.

My questions/concerns center on a few things

Big question - Why did it take so much longer to run on the server than locally? It was crunching the exact same data. In fact, it was running against exact copies of the same DB.
Locally, there are no users besides myself, so Tomcat is not fielding any requests from users looking to load pages and all the CPU can be devoted to crunching. Would having say 20 users load pages cause a huge pile-up on the server and lock everything down?
Is the number of processors I use in my FixedThreadPool wrong? Is it too many? Am I locking out the other necessary resources on the machine (DB threads and Tomcat HTTP threads) from operating? What should I change it to?
Is it too many writes to the DB concurrently? 50 bidders x 30 products would mean 1500 writes to the InnoDB table in a few seconds. flush_log_at_trx_commit=2 though.
Could I change the expected results with different hardware? 8-core machine for example.

Based on my observation today, the CPU spiked to 100% on every core when this algorithm was operating. I don't know the page load times of the users while this algorithm crunched. Thanks.

[Have you tried to run the code in a profiler?](https://www.ej-technologies.com/products/jprofiler/overview.html) — durron597, May 13 '14 at 22:28
"We were completely wrong. Aside from having a really interesting conversation, we were doing no good at all. The lesson is: Even if you know exactly what is going on in your system, measure performance, don't speculate. You'll learn something, and nine times out of ten, it won't be that you were right!" - http://sourcemaking.com/refactoring/refactoring-and-design (Martin Fowler) — durron597, May 13 '14 at 22:33
Profiling is a good suggestion - but the crux of the question is whether the thread count in the ExecutorFactory is too high? That's the recommended number in Java Concurrency in Practice, but it seems wrong for a web application. — bluedevil2k, May 13 '14 at 22:59
Be sure that this server use only your web app. may be part of resources is busy by another application. ANd take a profiler — Grigoriev Nick, May 14 '14 at 09:35

score 1 · Answer 1 · edited Jun 20 '20 at 09:12

1

Wait a minute, you're creating a new ExecutorService on every single web request?

The entire idea behind ExecutorService is that it exists on the main server and then each request can use it to gain access to asynchronous operations. Here's the relevant section of JSR236

2.1 Container-Managed vs. Unmanaged Threads

Java EE application servers require resource management in order to centralize administration and protect application components from consuming unneeded resources. This can be achieved through the pooling of resources and managing a resource’s lifecycle. Using Java SE concurrency utilities such as the java.util.concurrent API, java.lang.Thread and java.util.Timer in a server application component such as a servlet or EJB are problematic since the container and server have no knowledge of these resources.

By extending the java.util.concurrent API, application servers and Java EE containers can become aware of the resources that are used and provide the proper execution context for the asynchronous operations.

This is largely achieved by providing managed versions of the predominant java.util.concurrent.ExecutorService interfaces.

edited Jun 20 '20 at 09:12

Community

1
1

answered May 13 '14 at 23:13

durron597

31,968
17
99
158

No, it only creates the {ExecutorService} once, when the round is over, and a background thread kicks off the algorithm. – bluedevil2k May 13 '14 at 23:16
1

@bluedevil2k Right, it should create the `ExecutorService` once in the background and just send `Runnable`s (or `Callable`s) to it at the appropriate time – durron597 May 13 '14 at 23:18
Would that even matter? The Executor would get gc'ed at the end of the function and creating a new one is virtually free. – bluedevil2k May 13 '14 at 23:33
It might; if the threads don't get gced and suddenly there are 2x or 3x as many threads as processors... But honestly my original thought is best; get a profiler – durron597 May 13 '14 at 23:39

Java Concurrent Multi-Threaded Algorithm in a Web App - Slower Than Expected

1 Answers1

2.1 Container-Managed vs. Unmanaged Threads