You have two problems here: discovering how much memory is in use, and managing a cache. I'm not sure that the two are really closely related, although they may be.
Discovering how much memory an object uses isn't extremely difficult: one excellent reference is "Sizeof for Java" from JavaWorld. It avoids the garbage-collection-based measurement approach entirely, which has a lot of holes in it: it's slow, and it measures the heap rather than the individual object, so other objects can factor into your results whether you want them to or not.
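As a rough illustration of what a GC-free measurement can look like, here's a sketch of a reflection-based shallow-size estimator. The header and reference sizes are assumptions (typical 64-bit HotSpot with compressed references); the class and its numbers are illustrative, not taken from the article.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Rough shallow-size estimator: sums instance-field sizes plus an
// assumed object header, then pads to an 8-byte boundary. The
// constants are typical for 64-bit HotSpot with compressed oops,
// but they are assumptions, not guarantees.
public class ShallowSizeOf {
    private static final int HEADER = 12; // assumed object header size
    private static final int REF = 4;     // assumed reference size

    public static long sizeOf(Class<?> type) {
        long size = HEADER;
        // walk the class and all its superclasses
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers())) continue;
                size += fieldSize(f.getType());
            }
        }
        // JVMs typically align objects to 8 bytes
        return (size + 7) / 8 * 8;
    }

    private static int fieldSize(Class<?> t) {
        if (t == long.class || t == double.class) return 8;
        if (t == int.class || t == float.class) return 4;
        if (t == short.class || t == char.class) return 2;
        if (t == byte.class || t == boolean.class) return 1;
        return REF; // any object reference
    }
}
```

If you can attach a `-javaagent`, `java.lang.instrument.Instrumentation.getObjectSize` will give you a JVM-reported shallow size instead of an estimate.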
Managing the time to initialize the cache is another problem. I work for a company that has a data grid as a product, and thus I'm biased; be aware.
One option is not using a cache at all, but using a data grid. I work for GigaSpaces Technologies, and I feel ours is the best; we can load data from a database on startup and hold your data as a distributed, transactional data store in memory, so your greatest cost is network access. We have a community edition (which is free) as well as full-featured platforms, depending on your need and budget. We support various protocols, including JDBC, JPA, JMS, Memcached, a Map API (similar to JCache), and a native API.
Other similar options include Coherence, which is itself a data grid, and Terracotta DSO, which can distribute an object graph across JVM heaps.
You can also look at the cache projects themselves: two examples are Ehcache and OSCache. (Again, bias: I was one of the people who started OpenSymphony, so I have a soft spot for OSCache.) In your case, what would probably happen is not a preload of the cache (note that I don't know your application, so I'm guessing and might be wrong) but caching on demand: when you acquire data, check the cache first, fetch from the database only if the data isn't there, and populate the cache as you read.
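The cache-on-demand pattern above can be sketched in a few lines with a plain map; the `loader` function standing in for your database query is hypothetical, and a real cache (Ehcache, OSCache, etc.) would add eviction so the map doesn't grow without bound.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Minimal read-through cache sketch: check the cache first, and fall
// back to the backing store only on a miss, caching the loaded value.
public class ReadThroughCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // e.g. a DB lookup by key

    public ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    public V get(K key) {
        // invokes the loader only if the key is not already cached
        return cache.computeIfAbsent(key, loader);
    }
}
```

Note there's no preload step here: the first request for a key pays the database cost, and every later request is served from memory.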
Of course, you can also look at memcached, although I obviously prefer my employer's offering here.