You have two problems here: discovering how much memory is in use, and managing a cache. I'm not sure that the two are really closely related, although they may be.
Discovering how much memory an object uses isn't extremely difficult: one excellent reference is "Sizeof for Java" from JavaWorld. It avoids the garbage-collection-based measurement approach entirely, which has a lot of holes in it: it's slow, and it measures the heap rather than the individual object, so other objects can factor into your results whether you want them to or not.
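As a rough illustration of what a GC-free measurement can look like, here's a sketch of a reflection-based shallow-size estimator. The header and reference sizes are assumptions (typical 64-bit HotSpot with compressed references); the class and its numbers are illustrative, not taken from the article.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Rough shallow-size estimator: sums instance-field sizes plus an
// assumed object header, then pads to an 8-byte boundary. The
// constants are typical for 64-bit HotSpot with compressed oops,
// but they are assumptions, not guarantees.
public class ShallowSizeOf {
    private static final int HEADER = 12; // assumed object header size
    private static final int REF = 4;     // assumed reference size

    public static long sizeOf(Class<?> type) {
        long size = HEADER;
        // walk the class and all its superclasses
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers())) continue;
                size += fieldSize(f.getType());
            }
        }
        // JVMs typically align objects to 8 bytes
        return (size + 7) / 8 * 8;
    }

    private static int fieldSize(Class<?> t) {
        if (t == long.class || t == double.class) return 8;
        if (t == int.class || t == float.class) return 4;
        if (t == short.class || t == char.class) return 2;
        if (t == byte.class || t == boolean.class) return 1;
        return REF; // any object reference
    }
}
```

If you can attach a `-javaagent`, `java.lang.instrument.Instrumentation.getObjectSize` will give you a JVM-reported shallow size instead of an estimate.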
Managing the time to initialize the cache is another problem. I work for a company that has a data grid as a product, and thus I'm biased; be aware.
One option is not using a cache at all, but using a data grid. I work for GigaSpaces Technologies, and I feel ours is the best; we can load data from a database on startup and hold your data as a distributed, transactional data store in memory, so your greatest cost is network access. We have a community edition (which is free) as well as full-featured platforms, depending on your need and budget. We support various protocols, including JDBC, JPA, JMS, Memcached, a Map API (similar to JCache), and a native API.
Other similar options include Coherence, which is itself a data grid, and Terracotta DSO, which can distribute an object graph across JVM heaps.
You can also look at the cache projects themselves: two examples are Ehcache and OSCache. (Again, bias: I was one of the people who started OpenSymphony, so I have a soft spot for OSCache.) In your case, what would probably happen is not a preload of the cache (note that I don't know your application, so I'm guessing and might be wrong) but caching on demand: when you acquire data, check the cache first, fetch from the database only if the data isn't there, and populate the cache as you read.
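The cache-on-demand pattern above can be sketched in a few lines with a plain map; the `loader` function standing in for your database query is hypothetical, and a real cache (Ehcache, OSCache, etc.) would add eviction so the map doesn't grow without bound.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Minimal read-through cache sketch: check the cache first, and fall
// back to the backing store only on a miss, caching the loaded value.
public class ReadThroughCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // e.g. a DB lookup by key

    public ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    public V get(K key) {
        // invokes the loader only if the key is not already cached
        return cache.computeIfAbsent(key, loader);
    }
}
```

Note there's no preload step here: the first request for a key pays the database cost, and every later request is served from memory.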
Of course, you can also look at memcached, although I obviously prefer my employer's offering here.