I have an application (a game) that runs on the JVM.

The game's update logic (which runs 60 times/s) finishes after using about 25% of its time-slice (1/60 s), then sleeps off the remaining 75%. But once the garbage collector has run, usage goes up to 75-200% and stays there for the rest of the execution.
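To make the percentages concrete, here is a minimal sketch of how the share of the time-slice used by one update could be measured. The class `FrameBudget` and the empty `update()` are placeholders made up for illustration, not the game's real code:

```java
final class FrameBudget {
    // Length of one time-slice at 60 updates per second, in nanoseconds.
    static final long SLICE_NANOS = 1_000_000_000L / 60;

    // Fraction of the time-slice consumed by an update that took elapsedNanos.
    static double sliceUsage(long elapsedNanos) {
        return (double) elapsedNanos / SLICE_NANOS;
    }

    // Placeholder for the real game update (hypothetical).
    static void update() { }

    public static void main(String[] args) throws InterruptedException {
        for (int frame = 0; frame < 5; frame++) {
            long start = System.nanoTime();
            update();
            long elapsed = System.nanoTime() - start;
            System.out.printf("frame %d used %.0f%% of its slice%n",
                    frame, sliceUsage(elapsed) * 100);

            // Sleep off whatever is left of the slice, if anything.
            long remaining = SLICE_NANOS - elapsed;
            if (remaining > 0) {
                Thread.sleep(remaining / 1_000_000, (int) (remaining % 1_000_000));
            }
        }
    }
}
```

A usage above 100% means the update overran its slice, so there is nothing left to sleep off.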

The game uses about 70 MB of heap, which grows by about 1-2 MB/s. When the GC runs, it goes back to 70 MB, so there are no true memory leaks. I will try to lower this number in the future, but it shouldn't be a problem in this scope.

I'm using a Java 8 JVM with no runtime arguments or flags, so I'm not sure which GC that gives me.
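One way to find out which collectors the JVM picked is to ask it at runtime. This is a minimal sketch that assumes the JVM exposes the standard `java.lang.management` beans; the class name `GcInfo` is made up for illustration:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcInfo {
    // Builds one line per active collector: its name, how many
    // collections it has run, and the accumulated collection time.
    static String describeCollectors() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": collections=").append(gc.getCollectionCount())
              .append(", total time=").append(gc.getCollectionTime())
              .append(" ms\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describeCollectors());
    }
}
```

On HotSpot 8, names such as `PS Scavenge` and `PS MarkSweep` indicate the parallel collector, while `Copy` and `MarkSweepCompact` indicate the serial one. Calling `describeCollectors()` before and after the slowdown would also show how many collections happened in between.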

I've tried setting the heap to different sizes, but it does not affect this phenomenon.

I have two theories as to why this may be:

  1. The GC unintentionally fragments my heap in a way that causes cache thrashing in the update loop. I have logic that benefits greatly from data proximity as it loops through the data and updates it. Could it be that the GC moves some data to the old generation while keeping the rest in the young generation (the nursery)?

  2. The sudden GC processing makes my OS decide that my main update thread doesn't need as much CPU as it currently has, so it lowers the thread's priority. (However, the phenomenon persists even if I skip the Thread.sleep() call that sleeps off the unused time.)

What do you think? Are my theories plausible, can anything be done about them, or do I need to switch to C or C++? My knowledge of GCs is limited.

P.S. As a side note, update() generally finishes at 75% after a GC. It's only when using VSync that I get numbers like 200%.

Jake
  • maybe compare cache misses and related stats with `perf record`? also, which JVM? which collector? which command line flags? – the8472 Nov 18 '16 at 08:18
  • Thanks, I added some information to the post. Not sure what you mean by "perf record", though. – Jake Nov 18 '16 at 08:31
  • https://perf.wiki.kernel.org/index.php/Tutorial#Sampling_with_perf_record and you should enable GC logging to get information on pause times (consult the JVM documentation for how to do that) – the8472 Nov 18 '16 at 08:35
  • You still did not state which JVM you're using, so I'll assume you don't know that there are others besides Oracle's. Take a look at the [GC tuning guide](https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/), you probably want to try CMS. – the8472 Nov 18 '16 at 08:46
  • Thanks, I'll look into it. – Jake Nov 18 '16 at 09:26

2 Answers

NO LONGER VALID:

I did a test and completely ruined my architecture. I had this, which was the bottleneck of the application:

class Physics {
    Vec2 centre;
    Rec hitbox;
    Vec2 speed;
    Vec2 acc;
    // ...

    public void update() { // critical method
        centre.doThings();
        hitbox.doThings();
        // etc...
    }
}

And changed it to only use primitives:

class Physics {
    double centreX, centreY;
    double x1, x2, y1, y2;
    double speedX, speedY;
    double accX, accY;
    // ...

    public void update() { // critical method
        // implementation of the methods above, inlined...
        // etc...
    }
}

The reasoning was that an object's primitive fields are stored inline, inside the object itself on the heap (although the JVM does not guarantee that they are laid out in declaration order), while references to other objects could point to addresses on the other side of the heap.

This, along with a compacting GC, gave me a lovely boost, which I credit to an increase in cache hits. It destroyed my architecture, but it's a price I'm willing to pay.

The game now runs at a stable 15%, and now I'm going to mark my own post as the answer.

EDIT: That was just a confused man's ramblings. The above only gave me a minor performance boost. The rest was due to a bug in the application, so the architectural change was not justified. The compacting GC helped a bit, though.

Jake

> What do you think? Are my theories plausible,

The first theory is plausible, but the second one is not.

> can anything be done about them

You could possibly improve things by:

  1. Increasing the max heap size.
  2. Switching to a low-pause collector.
  3. Performance optimization based on the results of profiling the application.
  4. Trying to reduce the rate of garbage generation.
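As an illustration of the last point, one common way to cut garbage in an update loop is to mutate long-lived objects in place instead of allocating a new temporary per operation. This is only a sketch under that assumption; `MutableVec2` and `Body` are hypothetical names, not classes from the question:

```java
// Hypothetical mutable 2D vector: operations modify the receiver
// instead of returning a freshly allocated result.
final class MutableVec2 {
    double x, y;

    MutableVec2 set(double x, double y) {
        this.x = x;
        this.y = y;
        return this;
    }

    // this += other * scale, without allocating.
    MutableVec2 addScaled(MutableVec2 other, double scale) {
        x += other.x * scale;
        y += other.y * scale;
        return this;
    }
}

// Hypothetical physics body whose vectors are allocated once and reused.
final class Body {
    final MutableVec2 position = new MutableVec2();
    final MutableVec2 velocity = new MutableVec2();
    final MutableVec2 acceleration = new MutableVec2();

    // Allocation-free update: velocity first, then position.
    void update(double dt) {
        velocity.addScaled(acceleration, dt);
        position.addScaled(velocity, dt);
    }

    public static void main(String[] args) {
        Body body = new Body();
        body.acceleration.set(0.0, -10.0);
        body.update(0.5);
        System.out.println(body.position.x + ", " + body.position.y);
    }
}
```

The trade-off is that mutable vectors are easier to misuse through aliasing, so this style tends to be reserved for the hot path that profiling identifies.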

> or do I need to switch to C or C++?

C and C++ would give you more predictable behavior because there is nothing that will be moving objects around. If you have the appropriate skills and put in the effort, you should be able to get better performance in C and C++, especially when doing graphics / rendering. However, those are big "if"s.

Stephen C
  • Thanks, Stephen. 1. I've tried increasing the heap size, but to no avail. Sure, it delays the GC's first arrival, but its effects remain the same. – Jake Nov 18 '16 at 11:06
  • 3. I'm not sure I get your meaning – Jake Nov 18 '16 at 11:09
  • 4. Indeed, but even if I do, eventually a GC will have to run, and then I'm guessing the phenomenon will reappear. It feels like just delaying the inevitable. I want to solve the issue. And if I'm to completely remove all temporary heap allocation, then I might as well move to C. Those are my thoughts. – Jake Nov 18 '16 at 11:12