4

How can I tell how much space a pre-sized HashMap takes up before any elements are added? For example, how do I determine how much memory the following takes up?

HashMap<String, Object> map = new HashMap<String, Object>(1000000);
Paŭlo Ebermann
richs
  • Why, dont you require while creating a map? – ring bearer Feb 28 '11 at 21:47
  • 1
    @Travis: Whoa there, chill! I didn't prevent him from doing anything, just asked him what he's trying to do so we might be able to better help him. – user541686 Feb 28 '11 at 22:12
  • Sorry, mistake in the code. I'm testing performance when managing large amounts of data with different collections/maps: quick insert and quick retrieval, based on an excellent hash algorithm and no re-hashing. – richs Feb 28 '11 at 23:14
  • hashmap is no current structure, nothing big shall be managed by a single core nowadays; to your question - 1048576*8+ per entry 48. Keep in mind that HashMap 'scrambles' the bits and uses pow2 table (no prime), so the hash function may not be excellent. Allocation on put is a high cost to pay overall, both footprint/performance (on get as well due to extra indirection) – bestsss Mar 01 '11 at 01:04

8 Answers

4

In principle, you can:

  • calculate it by theory:
    • look at the implementation of HashMap to figure out what this method does.
    • look at the implementation of the VM to know how much space the individual created objects take.
  • measure it somehow.

Most of the other answers are about the second way, so I'll look at the first one (in OpenJDK source, 1.6.0_20).

The constructor uses a capacity that is the next power of two >= your initialCapacity parameter, thus 1048576 = 2^20 in our case. It then creates a new Entry[capacity] and assigns it to the table variable. (Additionally, it assigns some primitive variables.)
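The rounding step can be sketched as follows. This is a minimal illustration of the behaviour described above, not a copy of the JDK source; the class and method names are made up for the example, and the clamp at 2^30 is included only so the loop cannot overflow.

```java
// Illustrative sketch: how an initial capacity is rounded up to a power of two.
// TableCapacity and roundUpToPowerOfTwo are hypothetical names, not JDK API.
class TableCapacity {
    static final int MAX_CAPACITY = 1 << 30; // largest power of two that fits in an int

    // Returns the smallest power of two that is >= initialCapacity (capped at 2^30).
    static int roundUpToPowerOfTwo(int initialCapacity) {
        if (initialCapacity > MAX_CAPACITY) {
            initialCapacity = MAX_CAPACITY;
        }
        int capacity = 1;
        while (capacity < initialCapacity) {
            capacity <<= 1; // double until the requested capacity is covered
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(roundUpToPowerOfTwo(1000000)); // prints 1048576
    }
}
```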

So, we now have one quite small HashMap object (it contains only 3 ints, one float and one reference variable), and one quite big Entry[] object. This array needs space for its array elements (which are normal reference variables) and some metadata (size, class).

So, it comes down to how big a reference variable is. This depends on VM implementation - usually in 32-bit VMs it is 32 bit (= 4 bytes), in 64-bit VMs 64 bit (= 8 bytes).

So, basically on 32-bit VMs your array takes 4 MB, on 64-bit VMs it takes 8 MB, plus some tiny administration data.


If you then fill your HashMap with mappings, each mapping corresponds to an Entry object. This Entry object consists of one int and three references, taking about 24 bytes on 32-bit VMs and perhaps twice that on 64-bit VMs. Thus your 1000000-mapping HashMap (assuming a load factor > 1, so that the table is never resized) would take ~28 MB on 32-bit VMs and ~56 MB on 64-bit VMs.

This is in addition to the key and value objects themselves, of course.
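The arithmetic behind these figures can be written out as a small sketch. The reference sizes (4 and 8 bytes) and the per-Entry sizes (24 and 48 bytes) are the rough estimates from this answer, not measured values, and the array header and the small HashMap object itself are ignored. The class and method names are hypothetical.

```java
// Back-of-the-envelope estimate using the figures from the answer above.
// All sizes are assumptions; real values depend on the JVM and its settings.
class HashMapFootprintEstimate {
    static final int TABLE_LENGTH = 1 << 20; // 1048576 slots for initialCapacity 1000000

    // Bytes used by the empty table: one reference per slot (array header ignored).
    static long emptyTableBytes(int referenceBytes) {
        return (long) TABLE_LENGTH * referenceBytes;
    }

    // Bytes used by the table plus one Entry per mapping (keys and values excluded).
    static long filledBytes(int referenceBytes, int entryBytes, long mappings) {
        return emptyTableBytes(referenceBytes) + mappings * entryBytes;
    }

    public static void main(String[] args) {
        System.out.println(emptyTableBytes(4));          // 4194304  (about 4 MB)
        System.out.println(emptyTableBytes(8));          // 8388608  (about 8 MB)
        System.out.println(filledBytes(4, 24, 1000000)); // 28194304 (about 28 MB)
        System.out.println(filledBytes(8, 48, 1000000)); // 56388608 (about 56 MB)
    }
}
```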

Paŭlo Ebermann
2

You could check memory usage before and after creation of the variable. For example:

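// Used heap before the allocation (approximate; the GC may run at any time)
long preMemUsage = Runtime.getRuntime().totalMemory() -
      Runtime.getRuntime().freeMemory();
HashMap<String, Object> map = new HashMap<String, Object>(1000000);
// Used heap after the allocation; the difference estimates the map's footprint
long postMemUsage = Runtime.getRuntime().totalMemory() -
      Runtime.getRuntime().freeMemory();
long approxMapBytes = postMemUsage - preMemUsage;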
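A self-contained version of this before/after measurement might look like the sketch below. The class and method names are hypothetical. System.gc() is only a hint to the JVM, and the result is a rough estimate that can be off (or even negative) if a collection happens between the two readings.

```java
import java.util.HashMap;

// Rough heap-delta measurement; HeapDelta and its methods are illustrative names.
class HeapDelta {
    // Currently used heap, as reported by the Runtime (approximate).
    static long usedMemory() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Estimates the bytes allocated by creating an empty, pre-sized HashMap.
    static long measureEmptyMap(int initialCapacity) {
        System.gc(); // a hint only; the JVM is free to ignore it
        long before = usedMemory();
        HashMap<String, Object> map = new HashMap<String, Object>(initialCapacity);
        long after = usedMemory();
        // Touch the map after the second reading so it stays reachable until then.
        if (!map.isEmpty()) {
            throw new IllegalStateException("map should be empty");
        }
        return after - before;
    }

    public static void main(String[] args) {
        System.out.println("Approximate bytes: " + measureEmptyMap(1000000));
    }
}
```

On a JVM that allocates the internal table lazily (as described in another answer here), the reported difference for an empty map may be close to zero.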
dmcnelis
2

The exact answer will depend on the version of Java you are using, the JVM vendor and the target platform, and is best determined by direct measurement, as described in other answers.

But as a simple estimate, the size is likely to be either ~4 * 2^20 or ~8 * 2^20 bytes, for a 32-bit or 64-bit JVM respectively.

Reasoning:

  • The Sun Java 1.6 implementation of HashMap has a fixed-size top-level object and a table field that points to the array of references to hash chains.

  • In a newly created (empty) HashMap the references are all null and the array size is the smallest power of two that is greater than or equal to the supplied initialCapacity. (Yes ... I checked the source code.)

  • A reference occupies 4 bytes on a typical 32-bit JVM and 8 bytes on a typical 64-bit JVM. Some 64-bit JVMs support compact references ("compressed oops"), but you need to set JVM options to enable this.

  • The top object has 5 fields including the table array reference, but this is a relatively small constant overhead.

  • The top object and the array have object header overheads, but these are constant and relatively small.

Thus the size of the table array dominates, and it is 2^20 (the smallest power of 2 that is not less than 1,000,000) multiplied by the size of a reference.


So, this tells you that setting a large initial capacity really does use a lot of memory. On the other hand, if the initial capacity is a good estimate of the map's capacity when fully populated, you will save significant amounts of time by setting it. (This avoids a number of cycles of reallocating the array and rebuilding of the hash chains.)

Stephen C
1

You could probably use a profiler like VisualVM and track memory use.

Have a look at this too: http://www.velocityreviews.com/forums/t148009-java-hashmap-size.html

Argote
1

I'd have a look at this article: http://www.javaworld.com/javaworld/javatips/jw-javatip130.html

In short, Java does not have a C-style sizeof operator. You could use profiling tools, but IMO the above link gives the simplest solution.

Another piece of info that may be helpful: an empty Java String consumes 40 bytes. One million of them would probably take at least 40 MB...

Travis Webb
  • Also, I believe that HashMaps allocate an internal table greater than the requested size. I believe it does this by finding the smallest power of 2 greater than your requested size. – MeBigFatGuy Mar 01 '11 at 00:00
0

In the latest version of Java 1.7 (I'm looking at 1.7.0_55) HashMap actually lazily instantiates its internal table. It's only instantiated when put() is called - see the private method "inflateTable()". So your HashMap, before you add anything to it at least, will occupy only the handful of bytes of object overhead and instance fields.
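To illustrate the idea of lazy table allocation, here is a deliberately simplified sketch. It is not the JDK implementation: LazyTableSketch and its methods are hypothetical, collisions simply overwrite the slot, and only the "allocate on first put" behaviour is modelled.

```java
// Simplified model of a map that defers allocating its table until the first put.
// This is an illustration only; java.util.HashMap is considerably more involved.
class LazyTableSketch {
    private Object[] table;     // stays null until the first put
    private final int capacity; // power-of-two capacity remembered from the constructor

    LazyTableSketch(int initialCapacity) {
        int requested = Math.min(initialCapacity, 1 << 30); // avoid int overflow in the loop
        int c = 1;
        while (c < requested) {
            c <<= 1;
        }
        this.capacity = c; // only the number is stored; no array is created yet
    }

    int capacity() {
        return capacity;
    }

    boolean isTableAllocated() {
        return table != null;
    }

    void put(Object key) {
        if (table == null) {
            table = new Object[capacity]; // the large allocation happens here
        }
        // Mask works because capacity is a power of two; collisions overwrite (sketch only).
        table[key.hashCode() & (capacity - 1)] = key;
    }

    public static void main(String[] args) {
        LazyTableSketch map = new LazyTableSketch(1000000);
        System.out.println(map.isTableAllocated()); // false
        map.put("first");
        System.out.println(map.isTableAllocated()); // true
    }
}
```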

0

You should be able to use VisualVM (comes with JDK 6 or can be downloaded) to create a memory snapshot and inspect the allocated objects for their size.

Brent Worden
0

I agree that a profiler is really the only way to tell. The other bit of relevant information is whether you're using a 32-bit or 64-bit JVM. The amount of overhead due to memory references (pointers) varies depending on that and whether you have compressed oops turned on. I've found that for smaller data sets the overhead of objects and pointers is significant.

Jeremy