I understand that a larger heap means longer GC pauses. I'm okay with that -- my code is analyzing some data, and all I care about is minimizing the total time spent in garbage collection; the length of any single pause makes no difference to me.
Can making the heap too large hurt performance? My understanding is that "young" objects get collected quickly, but "old" objects can take longer, so my worry is that a very large heap will push some short-lived objects into the longer-lived space. I allocate a lot of strings that are thrown away almost immediately (on the order of 60 GB over the course of a single run), so I don't want to increase the GC time spent on those.
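For context, this is roughly how I've been attributing GC time to the young vs. old generations (a minimal sketch: the string-churn loop just stands in for my real workload, the class name GcTimeProbe is made up, and the collector names reported by getName() depend on which GC the JVM is using, e.g. "G1 Young Generation" / "G1 Old Generation" under G1):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcTimeProbe {
    public static void main(String[] args) {
        // Churn through lots of short-lived strings, standing in for the
        // real workload's ~60 GB of throwaway string allocation.
        long checksum = 0;
        for (int i = 0; i < 50_000_000; i++) {
            String s = "record-" + i; // short-lived allocation
            checksum += s.length();
        }
        System.out.println("checksum: " + checksum);

        // Report cumulative collection count and time per collector, so I can
        // see whether time is going to young collections or old/full ones.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```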
I'm testing on a machine with 8 GB of RAM, so I've been running my code with -Xms4g -Xmx4g, and as of my last profiled run, about 20% of my runtime was spent doing garbage collection. I found that increasing the heap to 5 GB helped reduce that. The production server will have 32 GB of RAM and much higher memory requirements.
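In case it matters, I've been measuring that 20% with GC logging along these lines (the -Xlog:gc* syntax is JDK 9+ unified logging; on JDK 8 I believe the rough equivalent is -verbose:gc with -XX:+PrintGCDetails, and MyAnalysis is a stand-in for my actual main class):

```
java -Xms5g -Xmx5g -Xlog:gc*:file=gc.log MyAnalysis
```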
Can I safely run it with -Xms31g -Xmx31g, or might that end up hurting performance?