My code is running on a 32-bit JVM (JRE v1.6) on Windows Server 2008 (64-bit) with 128 GB of RAM and 64 cores. However, the maximum heap space I can ever specify is 1.5 GB. My code looks like the following.

// one worker thread per available core
int numThreads = Runtime.getRuntime().availableProcessors();
List<Callable<Long>> tasks = new ArrayList<Callable<Long>>();
File dir = new File("/path/to/data");
File[] dataFiles = dir.listFiles();
// one task per input file
for(File dataFile : dataFiles) {
 MyTask task = new MyTask(dataFile);
 tasks.add(task);
}
ExecutorService executor = Executors.newFixedThreadPool(numThreads);
// invokeAll blocks until every task has completed
List<Future<Long>> results = executor.invokeAll(tasks);
long total = 0L;
for(Future<Long> result : results) {
 total += result.get();
}
System.out.println("total = " + total);
executor.shutdown();

This code throws an OutOfMemoryError. What I have done so far is reduce the number of threads.

int numThreads = Runtime.getRuntime().availableProcessors();
if(numThreads < 1 || numThreads > 4) {
 numThreads = 4;
}

This revised code hasn't yet thrown an OutOfMemoryError, but it is disappointing to me because so many resources (RAM and CPU) are not being used. How can I maximize resource usage in my environment?

Most importantly, I'd like some feedback on a workaround for the 1.5 GB maximum heap space limitation. Note that the Callable<Long> tasks are embarrassingly parallel.

I have thought about creating a DOS bat file to iterate over my input files and then simply call

java -cp %CP% -Xms1024m -Xmx1536m net.analysis.MyProg %1

but this seems kind of quirky/kludgy (now I need logic in the bat file to determine how many processes to create, and to wait for those processes to finish before spawning new ones).

Any help is appreciated.

Jane Wayne
  • possible duplicate of [java 1.6 32-bit min and max heap memory issue](http://stackoverflow.com/questions/18307977/java-1-6-32-bit-min-and-max-heap-memory-issue) – Peter Lawrey Aug 20 '13 at 16:57
  • This question has the same answer as your last question. I feel there is something you are missing, but I don't know why. – Peter Lawrey Aug 20 '13 at 16:58
  • @PeterLawrey it is not exactly the same, but it is related. i was going to post on that thread, but i have already marked that thread as answered. this thread deals with "workarounds" whereas the previous one dealt with understanding the memory limitations. – Jane Wayne Aug 20 '13 at 17:02
  • All the workarounds I can think of involve installing the 64-bit version of Java, it is also the solution. If I was a system admin and I bought a 128 GB machine I would insist on using 64-bit applications or it would make the machine a bit pointless. If your IT doesn't get that, you have a very basic problem. – Peter Lawrey Aug 20 '13 at 17:39

3 Answers

A 32-bit JVM maxes out at about 1.5 GB of heap space. To allocate more, you must switch to a 64-bit JVM (running on a 64-bit OS, of course). This is a direct consequence of the fact that a 32-bit JVM uses 32-bit addresses. A 64-bit JVM can, in principle, address roughly 2 to 4 billion times as much memory as a 32-bit JVM, which is far more than any real machine has installed.
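The arithmetic behind that claim can be checked from inside the JVM itself. The sketch below is only illustrative: the class name `JvmInfo` and its helper methods are made up for this example, and the `sun.arch.data.model` property is HotSpot-specific, so other JVMs may not define it (the code falls back to "unknown"). It is written to compile on Java 6.

```java
final class JvmInfo {

    // Size in GiB of a flat address space with the given pointer width:
    // 2^bits bytes divided by 2^30 bytes per GiB.
    static double addressSpaceGiB(int pointerBits) {
        return Math.pow(2, pointerBits - 30);
    }

    // Heap ceiling of the running JVM in MiB, as reported by the runtime
    // (this reflects -Xmx or the JVM's default).
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Assumption: "sun.arch.data.model" is only set by HotSpot-based JVMs.
        System.out.println("data model : "
                + System.getProperty("sun.arch.data.model", "unknown"));
        System.out.println("os.arch    : " + System.getProperty("os.arch"));
        System.out.println("max heap   : " + maxHeapMiB() + " MiB");
        System.out.println("32-bit address space : " + addressSpaceGiB(32) + " GiB");
        System.out.println("64-bit / 32-bit ratio: "
                + addressSpaceGiB(64) / addressSpaceGiB(32));
    }
}
```

A 32-bit process has at most 4 GiB of address space in total, and the Java heap has to share it with the JVM's own code, thread stacks and native libraries, which is why the practical heap ceiling ends up well below 4 GiB.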

Jim Garrison
  • yes, i know that. but please be sensitive to the fact that to get anything changed on this environment is a political uphill battle with bureaucratic land mines. – Jane Wayne Aug 20 '13 at 16:37
  • There are no other options for increasing heap memory available to a JVM. This is an addressing limitation of the 32-bit architecture. It is the reason 64-bit architectures exist. – Jim Garrison Aug 20 '13 at 16:56
  • @JaneWayne Then it's a political problem, and in your case not one that can be circumvented by technology. Accept that it's running crappily due to the limitations; simply point out the problem to the responsible parties and explain the solution. Then it becomes their problem, and *every time* someone comes to complain about the performance of the software, point them to the (documented) issue. If this doesn't get you results... change company, seriously. – Durandal Aug 20 '13 at 16:59
  • @JaneWayne So really your problem is political, not technical, so we cannot help you. You cannot change the way 32-bit programs are emulated, nor would you want to if you could. – Peter Lawrey Aug 20 '13 at 17:00
  • @Durandal so there is no technical workaround? i can think of a few (e.g. like the bat script), but i'm wondering if anyone else has a better idea and/or faced a similar constraint and had a solution. – Jane Wayne Aug 20 '13 at 17:06
  • @JaneWayne Of course you can look for "technical workarounds", but experience teaches that in the long run it's to *your disadvantage* to do that. Imagine you go with the bat: it generates more work for you and adds another point of failure, and more importantly, guess who gets blamed if it doesn't work properly at some point. IMO the right thing to do here is to rattle the chairs of those obstructing the correct solution. But in the end it's your call. – Durandal Aug 21 '13 at 15:24

Options:

  1. Switch to 64-bit JVM.
  2. Run a whole bunch of 32-bit JVMs, each executing a subset of the work that must be accomplished.
Rob
  • Would ProcessBuilder help? I was thinking of a DOS bat file, but I'd like to keep everything in Java as much as possible. When I use ProcessBuilder, this creates a bunch of 32-bit JVMs, each with its own 1.5 GB maximum heap, right? – Jane Wayne Aug 20 '13 at 17:08
  • Sure, ProcessBuilder can create each process that will do the work. This lets you have a stream to/from each process - use these streams to pass work to the process and get the results back. So you will have a "driver" app that will manage all the work and processes and the "worker" app that executes the tasks and returns the results. – Rob Aug 20 '13 at 17:43
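
The driver/worker arrangement described in these comments might look roughly like the sketch below. It is a minimal illustration, not a tested solution for this environment. Assumptions: the class name `Driver` and its helper methods are invented here; the worker is the `net.analysis.MyProg` class from the question, and it is assumed to print its `long` result as the last line of its output; the data directory and the maximum number of concurrent JVMs are passed as command-line arguments. The code sticks to Java 6 APIs (no `inheritIO`, no lambdas).

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class Driver {

    // Builds the command line for one child JVM, reusing the java binary that
    // runs the driver. The heap flag mirrors the question's batch-file idea.
    static List<String> workerCommand(String classPath, String mainClass, File dataFile) {
        String java = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        return Arrays.asList(java, "-Xmx1536m", "-cp", classPath, mainClass,
                dataFile.getPath());
    }

    // Starts one child JVM, drains its output so it cannot block on a full
    // pipe, and parses the last non-blank line as the result (an assumption
    // about how the worker reports its value).
    static long runWorker(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout
        Process process = builder.start();
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(process.getInputStream()));
        String last = null;
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().length() > 0) {
                    last = line.trim();
                }
            }
        } finally {
            reader.close();
        }
        int exitCode = process.waitFor();
        if (exitCode != 0 || last == null) {
            throw new IOException("worker failed with exit code " + exitCode);
        }
        return Long.parseLong(last);
    }

    public static void main(String[] args) throws Exception {
        File[] dataFiles = new File(args[0]).listFiles();
        if (dataFiles == null) {
            throw new IOException("not a readable directory: " + args[0]);
        }
        int maxWorkers = Integer.parseInt(args[1]); // concurrent child JVMs
        final String classPath = System.getProperty("java.class.path");

        List<Callable<Long>> tasks = new ArrayList<Callable<Long>>();
        for (final File dataFile : dataFiles) {
            tasks.add(new Callable<Long>() {
                public Long call() throws Exception {
                    return runWorker(
                            workerCommand(classPath, "net.analysis.MyProg", dataFile));
                }
            });
        }

        // Each pool thread only waits on a child process, so the pool size
        // caps how many worker JVMs exist at the same time.
        ExecutorService executor = Executors.newFixedThreadPool(maxWorkers);
        try {
            long total = 0L;
            for (Future<Long> result : executor.invokeAll(tasks)) {
                total += result.get();
            }
            System.out.println("total = " + total);
        } finally {
            executor.shutdown();
        }
    }
}
```

Because every child JVM is a separate 32-bit process with its own address space, the per-process heap ceiling applies to each worker individually. The pool size then becomes the knob for trading total RAM against cores, and it replaces the process-counting logic that would otherwise live in the bat file.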

Why don't you consider newCachedThreadPool()? I think it should be a good fit for your requirement and constraint. It creates a thread pool that creates new threads as needed, but will reuse previously constructed threads when they are available. These pools will typically improve the performance of programs that execute many short-lived asynchronous tasks. Calls to execute will reuse previously constructed threads if available. If no existing thread is available, a new thread will be created and added to the pool. Threads that have not been used for sixty seconds are terminated and removed from the cache. Thus, a pool that remains idle for long enough will not consume any resources.

Check the API doc for more info.

GKP