
Can you please explain what happens in memory while the following code executes?

Case 1:

public static void Execute()
{
    foreach (var text in DownloadTexts())
    {
        Console.WriteLine(text);
    }
}


public static IEnumerable<string> DownloadTexts()
{
    foreach (var url in _urls)
    {
        using (var webClient = new WebClient())
        {
            // WebClient has no DownloadText method; DownloadString returns the body as a string.
            yield return webClient.DownloadString(url);
        }
    }
}

Let's assume that after the first iteration I get html1.

When will html1 be cleared from memory?

  1. On the next iteration?
  2. When the foreach ends?
  3. When the function ends?

Thanks

** Edit **

Case 2:

public static void Execute()
{
    var values = DownloadTexts();
    foreach (var text in values)
    {
        Console.WriteLine(text);
    }
}


public static IEnumerable<string> DownloadTexts()
{
    foreach (var url in _urls)
    {
        using (var webClient = new WebClient())
        {
            // WebClient has no DownloadText method; DownloadString returns the body as a string.
            yield return webClient.DownloadString(url);
        }
    }
}

To my understanding, Case 1 is better for memory than Case 2, right?

Case 2 will still keep a reference to the texts we have already downloaded, while in Case 1 every text becomes eligible for garbage collection once it is no longer used. Am I correct?

Amir Yonatan
  • "*Let's assume after the first iteration I get html1.*" Can you clarify what you mean by "*get html1*"? – Scott Chamberlain Jan 02 '14 at 20:08
  • consider html1 the text downloaded and returned into the (var text) inside the foreach statement in the execute function – Amir Yonatan Jan 02 '14 at 20:08
  • In addition to "when", you may also be interested in "why": http://startbigthinksmall.wordpress.com/2008/06/09/behind-the-scenes-of-the-c-yield-keyword/ – Mister Epic Jan 02 '14 at 20:13
  • Why do you need to know this? If you need to handle memory explicitly, C# is not a good choice. – Brian Rasmussen Jan 02 '14 at 20:21
  • I'm not planning on handling the memory myself. Consider this example: you're downloading 10000 results from an API that gives you 1000 results at a time. Every result has a field called sum, and you want to add up this field over all the results. I wanted to make sure that not all 10000 results are loaded into memory at the same time. – Amir Yonatan Jan 02 '14 at 20:25

3 Answers

  • _urls stays alive indefinitely, because it appears to be stored in a field.
  • The iterator returned by DownloadTexts() is kept alive until the end of the loop.
  • The WebClient and the HTML it produces stay alive for one iteration. If you want to know their exact lifetime, you need to decompile the code (for example with Reflector) and trace where the reference travels. You'll find that the IEnumerator used in the loop references the HTML until the next iteration has begun.

Any object that is no longer alive can be garbage collected. That happens whenever the GC decides it is a good idea.

Regarding your edit: the two cases are equivalent. If you don't put the enumerator into a variable, the compiler does that for you. It has to keep a reference until the end of the loop. It does not matter how many references there are; there is always at least one.
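To make the "compiler does that for you" point concrete, here is a rough sketch of what a foreach over an IEnumerable<string> expands to. FakeTexts is a hypothetical stand-in for DownloadTexts() so the example runs without network access, and the expansion shown is an approximation of the generated code, not the exact compiler output.

```csharp
using System;
using System.Collections.Generic;

public static class ForeachExpansion
{
    // Hypothetical stand-in for DownloadTexts(); nothing is downloaded.
    public static IEnumerable<string> FakeTexts()
    {
        yield return "html1";
        yield return "html2";
    }

    // Roughly equivalent to: foreach (var text in FakeTexts()) { Console.WriteLine(text); }
    public static int ConsumeManually()
    {
        int count = 0;
        // The hidden local the compiler introduces. It is the only thing
        // the loop itself needs to keep alive.
        IEnumerator<string> enumerator = FakeTexts().GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                // Current holds the latest item until the next MoveNext call replaces it.
                string text = enumerator.Current;
                Console.WriteLine(text);
                count++;
            }
        }
        finally
        {
            // Runs any pending finally/using blocks inside the iterator.
            enumerator.Dispose();
        }
        return count;
    }

    public static void Main()
    {
        ConsumeManually();
    }
}
```

Nothing in this expansion stores earlier items, which is why both cases in the question behave the same with respect to the downloaded strings.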

Strictly speaking, the loop only requires the enumerator to be kept alive. The additional variable you added could also keep the enumerable alive. On the other hand, you never use that variable again once the loop has started, so the GC does not treat it as a live reference (at least in optimized builds).

You can test this easily:

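// Requires: using System.Linq;
// Allocates roughly 2 TB in total (2^30 strings * 1024 chars * 2 bytes per UTF-16 char),
// but never all at once:
var items =
    Enumerable.Range(0, 1024 * 1024 * 1024)
    .Select(x => new string('x', 1024));
foreach (var _ in items) { } // constant memory usage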
usr
  • Consider this example: you're downloading 10000 results from an API that gives you 1000 results at a time. Every result has a field called sum, and you want to add up this field over all the results. So in every iteration 1000 items are loaded into memory, and once they have been summed they become eligible for collection. Is that correct? – Amir Yonatan Jan 02 '14 at 20:29
  • After each iteration all previous items are released. Objects are only kept alive by references. Where would there be a reference to those previous items? There is no place it can be. Run a foreach over `Enumerable.Range(0, int.MaxValue-1).Select(x => new string('x', 1024))` to see what I mean. This allocates roughly 4 TB of memory in total (strings use 2 bytes per char), yet it runs fine. – usr Jan 02 '14 at 20:32
  • Awesome that really makes a lot of sense now. Thanks a lot! – Amir Yonatan Jan 02 '14 at 20:35
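The paged-sum scenario from the comments can be sketched as follows. This is only an illustration under stated assumptions: Result, FetchPage, and FetchPages are hypothetical names, and FetchPage fabricates a page in memory instead of calling a real API. The point is that only the current page is referenced while the running total is updated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagedSumDemo
{
    // Hypothetical result type with the "sum" field mentioned in the comments.
    public sealed class Result
    {
        public int Sum { get; set; }
    }

    // Hypothetical stand-in for one API call; builds a page of results locally.
    private static List<Result> FetchPage(int pageSize)
    {
        return Enumerable.Range(0, pageSize)
            .Select(i => new Result { Sum = 1 })
            .ToList();
    }

    // Lazily yields one page at a time; a page is only fetched when the
    // consumer asks for the next element.
    public static IEnumerable<List<Result>> FetchPages(int pageCount, int pageSize)
    {
        for (int page = 0; page < pageCount; page++)
        {
            yield return FetchPage(pageSize);
        }
    }

    public static long SumAll(int pageCount, int pageSize)
    {
        long total = 0;
        foreach (var page in FetchPages(pageCount, pageSize))
        {
            // Only the current page is referenced here; earlier pages are
            // unreachable and therefore eligible for collection.
            foreach (var result in page)
            {
                total += result.Sum;
            }
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumAll(10, 1000)); // prints 10000
    }
}
```

Calling ToList() on FetchPages would defeat the purpose, since the resulting list would hold a reference to every page at once.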

It will be cleared from memory when the garbage collector runs and determines that it's no longer in use.

The value stops being referenced at the moment the foreach invokes IEnumerator.MoveNext() to fetch the next item. So, effectively, #1.
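The role of MoveNext() can be observed with a small logging sketch. FakeClient is a hypothetical disposable standing in for WebClient, and FakeDownloadTexts mirrors the shape of DownloadTexts() from the question. The log shows that the using block around the yield return is only exited (and Dispose called) when the consumer requests the next item.

```csharp
using System;
using System.Collections.Generic;

public static class IteratorOrderDemo
{
    // Hypothetical disposable standing in for WebClient; it only writes to a log.
    private sealed class FakeClient : IDisposable
    {
        private readonly int _id;
        private readonly List<string> _log;

        public FakeClient(int id, List<string> log)
        {
            _id = id;
            _log = log;
        }

        public string Download()
        {
            _log.Add($"download {_id}");
            return $"html{_id}";
        }

        public void Dispose()
        {
            _log.Add($"dispose {_id}");
        }
    }

    // Same shape as DownloadTexts(): a yield return inside a using block.
    private static IEnumerable<string> FakeDownloadTexts(List<string> log)
    {
        for (int id = 1; id <= 2; id++)
        {
            using (var client = new FakeClient(id, log))
            {
                yield return client.Download();
            }
        }
    }

    public static List<string> Run()
    {
        var log = new List<string>();
        foreach (var text in FakeDownloadTexts(log))
        {
            log.Add($"consume {text}");
        }
        return log;
    }

    public static void Main()
    {
        foreach (var entry in Run())
        {
            Console.WriteLine(entry);
        }
    }
}
```

The log reads: download 1, consume html1, dispose 1, download 2, consume html2, dispose 2. The iterator is suspended at the yield return while the loop body runs, and it resumes (leaving the using block) on the next MoveNext() call.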

StriplingWarrior

It will be cleared from memory whenever the garbage collector feels like doing so.

The starting point, though, is the moment the code no longer holds any reference to the object instance. So the answer in this case is: sometime after the block in which you created the object ends.

Trust the GC; it is good at doing its job.

PTwr