13

My understanding (which may be wrong) is that in C#, when you create a string, it gets interned into the "intern pool". The pool keeps a reference to each string so that multiple identical strings can share memory.

However, I am processing a lot of strings that are very likely unique, and I need to remove each of them from memory completely once I am done with it. I am not sure how the cached reference gets removed so that the garbage collector can reclaim the string data. How can I prevent a string from being interned in this cache, or how can I clear the cache or remove a string from it, so that the string is reliably removed from memory?

Petr
  • What's your motivation for wanting to do something other than the default behavior? – Jonathon Reinhart Apr 26 '13 at 09:45
  • Just let the GC do its work. – ken2k Apr 26 '13 at 09:46
  • Not every string gets interned, [only literal strings](http://stackoverflow.com/questions/8509035/why-only-literal-strings-saved-in-the-intern-pool-by-default). – Tim Schmelter Apr 26 '13 at 09:46
  • Well, but I guess that if the reference is kept in this intern pool, GC will not just remove the string from operating memory? Or how does it know which string can be removed and which can't be? – Petr Apr 26 '13 at 09:47
  • My understanding is that strings are interned by default only if they are compile-time constants or if you use `String.Intern` on them, as http://broadcast.oreilly.com/2010/08/understanding-c-stringintern-m.html explains. – Patashu Apr 26 '13 at 09:47
  • All strings are held in memory, whether they are interned or not. Is string interning a red herring? – Jodrell Apr 26 '13 at 10:09
  • Constant strings are interned (by default). Runtime strings are not (by default). All your runtime strings (Interned or not) will be removed by the GC. For more info, check out my answer. – Martin Mulder Apr 26 '13 at 10:14

4 Answers

7

If you need to remove the strings from memory for security reasons, use SecureString.

Otherwise, if there are no references to the string anywhere, the GC will clean it up anyway (it will no longer be interned) so you don't need to worry about interning.

And of course, only string literals are interned in the first place (or if you call String.Intern() as noted above by Petr and others).
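The difference between literals and runtime strings can be checked with a reference comparison. The following is only a minimal sketch: the class and method names (`LiteralInterningDemo`, `BuildAtRuntime`, and so on) are made up for illustration, and the literal case assumes the default behavior where string literals are interned.

```csharp
using System;
using System.Text;

public static class LiteralInterningDemo
{
    // Builds the text at run time, so the compiler cannot fold it into a literal.
    public static string BuildAtRuntime(string prefix, int n)
    {
        return new StringBuilder().Append(prefix).Append(n).ToString();
    }

    // Two identical literals: with default interning both variables
    // refer to the same string instance.
    public static bool LiteralsShareInstance()
    {
        string a = "hello";
        string b = "hello";
        return object.ReferenceEquals(a, b);
    }

    // Two equal strings built at run time: each call allocates a separate
    // instance, and neither is placed in the intern pool automatically.
    public static bool RuntimeStringsShareInstance()
    {
        string a = BuildAtRuntime("record-", 42);
        string b = BuildAtRuntime("record-", 42);
        return object.ReferenceEquals(a, b);
    }
}
```

Under these assumptions, `LiteralsShareInstance()` is expected to return `true`, while `RuntimeStringsShareInstance()` returns `false` even though the two strings compare equal with `==`.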

Matthew Watson
  • @downvoter: Can you explain why, just to help others who read this answer? – Matthew Watson Apr 26 '13 at 09:52
  • SecureString is NOT a solution. It sounds "secure", but if I read his question correctly, he does not need to remove the strings from memory because he is afraid of hackers or something. He just wants his working memory back. – Martin Mulder Apr 26 '13 at 10:13
  • @MartinMulder It's not completely clear from the question, hence the use of the term "*if*" in the first line of my answer. – Matthew Watson Apr 26 '13 at 10:17
  • Well, although this does not directly answer my question, I accepted it: it not only explains what I needed (I don't need to worry about runtime strings, as they aren't interned, which I didn't know), but it also introduces a universal and simple way to create a string that can never be interned and is kept in memory for as short a time as possible (and even encrypted). That is an interesting thing that is definitely good to know. – Petr Apr 26 '13 at 11:38
5

Apply CompilationRelaxations attribute to the entire assembly (looks like the only possible solution is to forbid interning on an assembly level) as follows:

using System.Runtime.CompilerServices;
[assembly: CompilationRelaxations(CompilationRelaxations.NoStringInterning)]

More information on CompilationRelaxations

UPDATE:

The documentation states that the attribute:

Marks an assembly as not requiring string-literal interning.

In other words, it does not prevent the compiler from interning strings; it merely provides a hint that interning is not required. The documentation is a little sparse in this area, but this also seems to be the conclusion in this MSDN forum post.

From this SO question on that attribute

Community
illegal-immigrant
  • "Marks an assembly as **not requiring** string-literal interning", which is not the same as saying that the feature should be turned off: http://social.msdn.microsoft.com/Forums/en-US/clr/thread/c1c1d969-8d6b-4aaf-b7f4-3febedf3cd18/ – Tim Schmelter Apr 26 '13 at 09:58
  • @TimSchmelter updated answer with link + explanation from related SO question and mentioned MSDN thread. Thanks – illegal-immigrant Apr 26 '13 at 10:03
  • CompilationRelaxations applies only to constant strings already in code, not to runtime strings. Petr is talking about runtime strings. – Martin Mulder Apr 26 '13 at 10:12
2

You are saying two things:

  • You are processing a lot of strings, so you are talking about runtime values.
  • You want to remove the strings from memory after you are done processing them.

By default, runtime values are NOT interned. When you receive a string from a file or create a string yourself, they all have a separate instance. You can Intern them via String.Intern. Interning strings takes more time, but consumes less memory. See: http://msdn.microsoft.com/en-us/library/system.string.intern.aspx

Runtime strings are automatically removed by the GC once there are no references to them. Explicitly interned strings are a different matter: the intern pool holds its own reference, and the `String.Intern` documentation linked above notes that memory allocated for interned strings is not likely to be released until the CLR terminates. If you want a string to be collected, do not intern it.

So, to sum up: by default your runtime strings are not interned, and the GC removes them once your work is done and no references remain.
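As a rough sketch of the default behavior versus explicit interning, the snippet below builds the same text twice at run time and compares the instances with and without `String.Intern`. The class name `ExplicitInterningDemo`, the helper `Build`, and the `"customer-"` prefix are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Text;

public static class ExplicitInterningDemo
{
    // Hypothetical helper: produces a fresh string instance on every call.
    public static string Build(int id)
    {
        return new StringBuilder("customer-").Append(id).ToString();
    }

    // Default behavior: equal content, but two separate instances.
    public static bool SameInstanceWithoutIntern()
    {
        string a = Build(7);
        string b = Build(7);
        return object.ReferenceEquals(a, b);
    }

    // Opt-in behavior: String.Intern returns the pooled instance,
    // so both variables end up pointing at the same object.
    public static bool SameInstanceWithIntern()
    {
        string a = string.Intern(Build(7));
        string b = string.Intern(Build(7));
        return object.ReferenceEquals(a, b);
    }
}
```

Here `SameInstanceWithoutIntern()` returns `false` and `SameInstanceWithIntern()` returns `true`. For strings that should disappear after processing, simply not calling `String.Intern` keeps them out of the pool.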

Theodor Zoulias
Martin Mulder
1

Before trying to prevent interning, I would suggest using String.IsInterned() to find out whether the strings you are concerned with are actually interned at all. If that function returns null, your string is not interned.

As far as I know, strings that are generated dynamically at runtime are not interned at all, since there would be no performance benefit.
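A small check along these lines might look as follows. This is a sketch only: `InternPoolCheck`, `IsInPool`, and `MakeUniqueString` are illustrative names, and the GUID is used purely to obtain a string that is very unlikely to be in the pool already.

```csharp
using System;
using System.Text;

public static class InternPoolCheck
{
    // String.IsInterned returns the pooled instance if one exists, otherwise null.
    public static bool IsInPool(string value)
    {
        return string.IsInterned(value) != null;
    }

    // Hypothetical stand-in for data read from a file or network at run time.
    public static string MakeUniqueString()
    {
        return new StringBuilder("row-").Append(Guid.NewGuid()).ToString();
    }
}
```

Calling `InternPoolCheck.IsInPool(InternPoolCheck.MakeUniqueString())` should yield `false`, whereas passing a string that was explicitly interned with `String.Intern` yields `true`.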

HugoRune