
I have a map of format Map stored in a file. This file has over 100,000 records.

The value of each entry is nearly 10k.

I load 1000 records into a map in memory, process them, then clear the map and load the next 1000 records.

My questions are:

  1. Since the strings are stored in the String pool, which is in the PermGen memory area, will the Strings be garbage collected when I clear the map?

  2. If they are not garbage collected, is there any way to force them to be garbage collected?

  3. Is there any guarantee that, if the program is running out of memory, the JVM will clean the PermGen memory before throwing an OutOfMemoryError?

TheLostMind
sujith
  • It seems that you are able to load and process arbitrary chunks of "records" in memory ... so I am wondering about two things: why is it important that the data is stored in a map; and, if you are concerned about memory usage, why don't you go with smaller chunks? Or in reverse: have you done some profiling and found that chunks of 1000 give optimal performance? – GhostCat Aug 27 '15 at 10:45
  • @Jagermeister : There were around 20,000 entries initially, and they were loaded completely into a map and processed. Later the number of entries grew to more than 100,000, and processing that many gave me an out-of-memory error. So I profiled to see what was taking up so much space. More than 50% of it was occupied by `char[]`, so I thought of reading the entries in batches. 1000 seemed to be a good number for the speed vs. memory tradeoff. – sujith Aug 27 '15 at 10:49
  • @sujith - Which version of java are you using? – TheLostMind Aug 27 '15 at 10:53
  • @TheLostMind : I am using Java 8. – sujith Aug 27 '15 at 10:55
  • @sujith - Try running your code with : `-Xmx1024m -XX:+UseG1GC -XX:+UseStringDeduplication` . Hopefully you will not get GC – TheLostMind Aug 27 '15 at 10:57

1 Answer


Ok.. Let's start....

Since the strings are stored in the String pool, which is in the PermGen memory area, will the Strings be garbage collected when I clear the map?

Not all strings are stored in the string constant pool. Only interned strings and string literals go there. There is also no PermGen in Java 8: Metaspace has (almost gracefully) replaced it, and the string pool itself has lived on the regular heap since Java 7.
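To make the literal vs. runtime-string distinction concrete, here is a minimal sketch. The class name `InternDemo` and the helper `runtimeCopy` are made up for this illustration; `runtimeCopy` only stands in for a string that was built at runtime, such as one parsed from a file.

```java
class InternDemo {
    // Hypothetical helper: builds a brand-new String object at runtime,
    // the way a value parsed from a file would be created.
    static String runtimeCopy(String s) {
        return new String(s.toCharArray());
    }

    public static void main(String[] args) {
        String literal = "hello";              // literal: placed in the string pool
        String runtime = runtimeCopy("hello"); // ordinary heap object, not pooled

        System.out.println(literal == runtime);          // false: different objects
        System.out.println(literal.equals(runtime));     // true: same contents
        System.out.println(literal == runtime.intern()); // true: intern() returns the pooled instance
    }
}
```

Unless something calls `intern()` on them, strings created like `runtime` never enter the pool and are collected like any other unreachable object.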

If your strings are read from a file (and so are not interned), then yes, they will be GCed once nothing references them any more, for example after you clear the map. If you have string literals (and God save you if you do.. :P), then they will be GCed when the classloader that loaded the class defining those literals gets GCed.
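As a rough sketch of the batch approach described in the question: the class `BatchLoader`, the method `processInBatches` and the `key=value` line format are all assumptions made for this example, since the actual file layout is not given. Each batch is handed to a callback and then cleared, so its strings become unreachable and eligible for collection as long as the callback does not keep references to them.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

class BatchLoader {
    // Reads "key=value" lines (assumed format) and passes them to the
    // processor in maps of at most batchSize entries.
    // Returns the number of batches processed.
    static int processInBatches(BufferedReader reader, int batchSize,
                                Consumer<Map<String, String>> processor) {
        Map<String, String> batch = new HashMap<>();
        int batches = 0;
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                int sep = line.indexOf('=');
                if (sep < 0) {
                    continue; // skip lines that do not match the assumed format
                }
                batch.put(line.substring(0, sep), line.substring(sep + 1));
                if (batch.size() == batchSize) {
                    processor.accept(batch);
                    batch.clear(); // drop references so the strings can be collected
                    batches++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (!batch.isEmpty()) { // final, partially filled batch
            processor.accept(batch);
            batch.clear();
            batches++;
        }
        return batches;
    }
}
```

It could be called with something like `BatchLoader.processInBatches(Files.newBufferedReader(path), 1000, batch -> handle(batch))`, where `path` and `handle` are placeholders for your own file and processing code.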

If they are not garbage collected, is there any way to force them to be garbage collected?

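Well, you could always call System.gc() explicitly, but it is only a request that the JVM is free to ignore, and it is NOT a good idea in a production environment. If you are using Java 8, use the G1 collector and enable string deduplication (`-XX:+UseG1GC -XX:+UseStringDeduplication`, available from 8u20 onwards).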

Is there any guarantee that, if the program is running out of memory, the JVM will clean the PermGen memory before throwing an OutOfMemoryError?

The GC will try its best to clean up as much as it can, but no, there is no guarantee that this will happen.

TheLostMind
  • I did not understand the part about strings read from a file not being interned. How can the strings inside a file be interned or not interned? As long as the content is not loaded into memory, how can we differentiate the strings in a file as interned or not? Also, even if the String is created using `String s = new String("xyz")`, a String object is created and "xyz" is placed in PermGen if it is not already present there. So in this case, if the GC runs, the String object on the heap gets garbage collected, but the entry made in the string pool would still exist. Is that wrong? – sujith Aug 27 '15 at 10:54
  • @sujith - If you are reading strings from a file, then they will not be interned. Yes, you are right: if the GC runs, it probably won't collect the string in the constant pool – TheLostMind Aug 27 '15 at 10:56
  • Thank you for the information about the Java 8 feature. I will try this solution. I have one more question. Assume that I have three strings: `String one = "hello world"`, `String two = "hello"` and `String three = "hell"`. When these are interned, will there be 3 Strings created in the String pool? Or, since the contents of `two` and `three` are substrings of `one`, will there be only one `char[]` array with different offsets for the three strings in the string pool? – sujith Aug 27 '15 at 11:09
  • @sujith - They will all have different `char[]` – TheLostMind Aug 27 '15 at 11:22