Using top to determine memory requirements is more an art than an exact science. There are two primary ways to go about it.
In both cases you want to take a baseline of the system's resource usage before you start the program you're investigating (in your case GlassFish). Then you follow one of two paths:
Aggregate Memory Usage Path
This is the way I personally do it because I feel it gives a better picture of overall resource utilization.
Also, if you mess up, you'll probably wind up with a bigger number rather than a smaller one.
Run top in a terminal somewhere and note the header output. Pay special attention to Active and Wired memory:
last pid: 26611; load averages: 0.50, 0.38, 0.34 up 42+18:51:53 11:44:41
34 processes: 1 running, 33 sleeping
CPU: 0.9% user, 0.0% nice, 0.0% system, 0.0% interrupt, 99.1% idle
Mem: 447M Active, 112M Inact, 233M Wired, 22M Cache, 111M Buf, 170M Free
Swap: 2048M Total, 220K Used, 2048M Free
Start your target application, and note the change in the header.
last pid: 26571; load averages: 0.35, 0.35, 0.33 up 42+18:49:00 11:41:48
34 processes: 1 running, 33 sleeping
CPU: 2.7% user, 0.0% nice, 1.2% system, 0.0% interrupt, 96.1% idle
Mem: 606M Active, 109M Inact, 235M Wired, 22M Cache, 111M Buf, 12M Free
Swap: 2048M Total, 224K Used, 2048M Free
Calculate the change in used (Active + Wired) and free (Free) memory.
In this case, used memory went up by 161MB ((606+235) - (447+233)), and free memory decreased by 158MB (170 - 12).
(All other things being equal these numbers should be equal or very close, the difference being made up by changes in the other fields like Inactive memory or Buffer space.)
The most pessimistic (largest) of the numbers above is a good number to use for a guess at RAM utilization.
Repeat the above for additional instances of the program you're investigating and plot the change on a graph to determine the trend/curve.
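If you're taking several of these samples, it can help to script the bookkeeping. The snippet below is just a sketch of the arithmetic above, with the figures hard-coded from the two headers -- it's not part of top or any standard tool:

# Sketch of the delta calculation above (values in MB, copied from the
# before/after top headers). Purely illustrative.
before = {"active": 447, "wired": 233, "free": 170}
after  = {"active": 606, "wired": 235, "free": 12}

used_delta = (after["active"] + after["wired"]) - (before["active"] + before["wired"])
free_delta = before["free"] - after["free"]

# The larger (more pessimistic) figure is the safer estimate.
print(f"used grew by {used_delta} MB, free shrank by {free_delta} MB")
print(f"pessimistic RAM estimate: {max(used_delta, free_delta)} MB")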
Individual Instance Memory Usage Path
This method examines the individual process(es) associated with the program you're investigating.
Proceed as above, except rather than looking at the top output's header, look at the individual processes that are spawned when you launch the program (I'll use Postgres as my example):
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
4883 pgsql 1 58 0 181M 146M sbwait 0 24.3H 6.59% postgres
5842 pgsql 1 44 0 149M 119M sbwait 1 376:03 0.00% postgres
2051 pgsql 1 44 0 62944K 34836K select 1 21:39 0.00% postgres
2054 pgsql 1 44 0 31672K 4220K select 1 6:31 0.00% postgres
2052 pgsql 1 44 0 62944K 5368K select 0 6:00 0.00% postgres
2053 pgsql 1 44 0 62944K 5484K select 1 1:11 0.00% postgres
4884 pgsql 1 44 0 62944K 9144K sbwait 1 1:00 0.00% postgres
1902 pgsql 1 44 0 62944K 5348K select 1 0:46 0.00% postgres
Total up the resident (RES) size for each process associated with your application and note that as the RAM used. (The difference between resident size and virtual size (VSIZE, or just SIZE) is covered in the caveats below.)
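If there are a lot of processes, adding up the column by hand gets tedious. Here's a rough sketch of one way to automate it; it assumes a ps(1) that accepts -ax -o rss=,comm= and reports RSS in kilobytes (FreeBSD's does -- adjust for your platform), and "postgres" is just my example match string:

# Rough sketch: total the resident set size (RSS) of every process whose
# command name contains a given string. Assumes ps(1) accepts
# "-ax -o rss=,comm=" and reports RSS in kilobytes.
import subprocess

def total_column_kb(column: str, name: str) -> int:
    out = subprocess.check_output(["ps", "-ax", "-o", f"{column}=,comm="], text=True)
    total = 0
    for line in out.splitlines():
        fields = line.split(None, 1)
        if len(fields) == 2 and name in fields[1] and fields[0].isdigit():
            total += int(fields[0])
    return total

print(f"postgres resident total: ~{total_column_kb('rss', 'postgres') / 1024:.0f} MB")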
There are some caveats with this method:
Size Inflation or Deflation
Resident (RES) size doesn't tell the whole story: shared libraries get loaded, and these aren't counted in resident size, so if you total up resident size as I said above your number will be below the actual utilization.
Shared libraries ARE counted in virtual size (VIRT, or just plain SIZE), but they get counted against every program using them, so if you total up virtual size your number will be above the actual utilization -- often significantly above.
Some versions of top also split out Swap separately -- if your program has a memory leak (or a lot of data that goes stale and gets swapped out) this can also skew your figures.
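One practical way to use this: treat the resident total as a floor and the virtual-size total as a (generous) ceiling. A short continuation of the earlier sketch -- it reuses the total_column_kb() helper defined there and assumes your ps(1) supports the vsz keyword, as FreeBSD's does:

# Reuses total_column_kb() from the sketch above; "vsz" is the virtual size in KB.
low  = total_column_kb("rss", "postgres")   # under-counts: shared libraries excluded
high = total_column_kb("vsz", "postgres")   # over-counts: shared libraries counted per process
print(f"actual usage is somewhere between ~{low / 1024:.0f} MB and ~{high / 1024:.0f} MB")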
Missing Processes
If you don't count all of the processes that are spawned as a result of starting your program, your total RAM usage figure will be lower than the actual usage.
Finally, there is one caveat that applies to both of these methods: dynamic applications will mess with your numbers.
By that I mean that a program like Postgres or Apache, which spawns new threads or new child processes to service user requests, will not give an accurate picture under static conditions: you need to put a working load on the program so you can measure the impact of that workload on the system (this is another reason I find the aggregate path easier: you don't have to hunt down all the children, you can just read the header as you ramp up the workload).
Ideally you will put a very heavy load on the server during testing to ensure that normal production loads won't make it start thrashing its swap space. Also, once you've done your sizing, you should always add some cushion RAM (I use 10% over and above the worst-case operating results) and ensure that your system has enough RAM to handle that as a peak load condition.
Remember: RAM is cheap and downtime is expensive. You can always throw more RAM at the problem during server build-out and the marginal cost will likely be lower than having to shut down a system to add RAM later.
Note that all my top output is from FreeBSD -- your specific labeling may differ. Consult the man page for top on your system to figure out the appropriate fields.