0

As the title states: are there any downsides to recycling resources (like large arrays) in the finalizer of the containing object? So far it works out fine but since finalizers can be a bit funky and difficult to debug I decided to ask here.

The use case is a polygon class which just represents a list of points. My application makes heavy use of large polygons with a few thousand points - allocating those arrays often isn't exactly cheap. A dispose pattern is unfortunately out of the question because they get passed around and pretty much only the GC knows when no other object is referencing them.

This is how I implemented it (arrays always have the length of 2^sizeLog2 and only arrays with sizeLog2 >= MIN_SIZE_LOG_2 get recycled):

Constructor:

public Polygon(int capacity = 1)
{
  //fast method for getting the ceiling of the logarithm of an integer
  int sizeLog2 = UIM.CeilingLog2((uint)capacity); 
  //supress finalize when the array should not be recycled
  if(sizeLog2 < TWO_POW_MIN_SIZE) GC.SuppressFinalize(this);
  Points = Create(sizeLog2);
}

Create array of size 2^sizeLog2:

private static Pnt[] Create(int sizeLog2)
{
  if (sizeLog2 >= TWO_POW_MIN_SIZE)
  {
    if (/*try to get an item from recycle queue*/) return result;
    Pnt[] points = new Pnt[1 << sizeLog2];
    //keep array alive so that it won't get collected by GC when the polygon is
    GC.KeepAlive(points);
    return points;
  }
  return new Pnt[1 << sizeLog2];
}

In the increase capacity method this is the line for reregistering the polygon for finalize, if the finalize was suppressed up to this point (capacity can only get increased and only by a factor 2):

if (newSizeLog2 == MIN_SIZE_LOG_2) GC.ReRegisterForFinalize(this);

And this is the destructor:

~Polygon()
{
  if (Points != null) Recycle(Points); //enqueues the array in a ConcurrentQueue
  Points = null;
}

And yes, I do know that this isn't exactly a fancy programming style, but this is rather performance critical so I really can't just let the GC do all the work because then I end up with hundreds of MB on the large object heap in a few seconds.

Also, the polygons get handled by different threads concurrently.

  • 2
    `GC.KeepAlive` doesn't do [what you think it does](https://referencesource.microsoft.com/#mscorlib/system/gc.cs,279), for starters. – Damien_The_Unbeliever Oct 05 '20 at 12:39
  • Not really sure what you're trying to achieve here - what does `Recycle` do? Why `null`ing the `Points` afterwards? Looks like you are trying to re-invent `ArrayPool`. – Konrad Kokosa Oct 05 '20 at 12:41
  • @Damien_The_Unbeliever Ok thank you I kind of missed that, apparently the reason this works nevertheless is that the array gets resurrected in the finalizer. – Niklas Hauber Oct 07 '20 at 11:44
  • @KonradKokosa Yes recycle does pretty much what an ArrayPool does, to be exact: `RecycleQueue[UIM.BitPosition(points.Length) - MIN_SIZE_LOG_2].Add(points);` So it's a one liner and I have more control over what happens exactly, that's why I didn't use ArrayPool in the first place. However, that's completely beside the point, the question asked if there are any downsides or risks I didn't think of when recycling an object in the finalizer and using the ArrayPool or not wouldn't make a difference regarding that. – Niklas Hauber Oct 07 '20 at 12:04
  • I'm not sure what you could be doing inside Recycle that would be useful in the context of a finalizer. Finalizers are designed to allow disposing external resources. As a general rule they should not access managed objects. – MikeJ Oct 08 '20 at 12:33
  • I think your question could be a valuable SO question if you added some more detail to it. For example if we had a better understanding about how Polygons are used. What does "freely used" mean in your comment below? And more the code used to manage their memory - Recycle or some subset of it. Right now it's focused too narrowly on an implementation detail of a larger more general problem. – MikeJ Oct 08 '20 at 18:00

1 Answers1

1

The short answer is yes there are. Large arrays are not the type of resources finalizers are intended to recycle. Finalizers should be used for external resources and in vanishingly rare instances application state.

I would suggest this article, and/or this one, to better understand some of the pitfalls of finalization.

The crux of the problem with what you've described is this statement: " I really can't just let the GC do all the work because then I end up with hundreds of MB on the large object heap in a few seconds."

But the finalizer is never going to be called until the GC "has done all the work" so you must not have a full understanding what is causing the memory pressure for your application.

It really sounds like you've got in issue with the overall design of your application where you're finding it hard to reason about who owns Polygons and/or Pnts. The result is there are references to polygons/Pnts - somewhere - when you don't expect to have them. Use a good profiler to find where this is happening and fix this overall design problem instead trying to use finalization.

MikeJ
  • 1,299
  • 7
  • 10
  • I realize that this isn't what finalizers are supposed to be used for and the IDisposable interface would solve this problem, but with the downside that the user code handling the polygons would have to handle disposing the object, which would be perfectly fine if I was programming in C++, but C# has a GC and my application is designed to make use of it. Also the GC has not done all the work, there are huge costs for allocating many objects in the LOH which is exactly what recycling prevents. https://learn.microsoft.com/en-us/dotnet/standard/garbage-collection/large-object-heap – Niklas Hauber Oct 07 '20 at 11:42
  • @NiklasHauber IDisposable would likely not solve this problem either. I would suggest looking at ArrayPool or MemoryPool. It sounds like what you really want is a way to re-use large blocks of Pnts. – MikeJ Oct 07 '20 at 12:19
  • I'm already doing the recycling myself, similar to the ArrayPool, but a very stripped down version, this is what the Recycle method does. I didn't include the implementation but the comment next to the call explains what it does. After a bit of googling I even found a term for what I'm doing exactly, which is object resurrection. So it's not exactly pretty programming, but it seems like it's safe to do like this, and also convenient. IDisposable would only solve the problem of having nondeterministic finalization, but this isn't a requirement in this case. – Niklas Hauber Oct 07 '20 at 13:03
  • @NiklasHauber my point about IDisposable is that while it's certainly useful in adding deterministic cleanup the larger problem of managing the large objects that you're using is a constraint that requires some higher level design work to solve. Whoever owns polygons for example may need to be responsible for managing that memory. You don't need a finalizer or object resurrection if those responsibilities are clearly defined. – MikeJ Oct 07 '20 at 14:05
  • It can't really be defined who owns the polygons, they are supposed to be types which can freely be used by any other projects. So either I do some ugly programming like I did, accept huge performance penalties or add complexity to the code which uses these polygons, by forcing it to manually manage that memory for the sole purpose of performance gains, in a programming language designed in a way that the programmer wouldn't have to do this. I can't even get a fancy equivalent of shared_pointer in C#. – Niklas Hauber Oct 07 '20 at 14:24
  • Object resurrection IS one of the techniques, indeed (treated as a drawback, pretty often). However, if you polygons are long-living, the overhead may be negligible - finalization cost is mostly due to object allocation and premature promotion. Both are not relevant for long-living gen2/LOH objects. So, it is probably ok. Personally, I would fight for redesign the whole codebase to have a single owner of them, but it is easy said not seeing the code. Sadly, we [don't have `PhantomReferences`](https://tooslowexception.com/do-we-need-jvms-phantomreference-in-net/). – Konrad Kokosa Oct 08 '20 at 09:27
  • @NiklasHauber I'm not sure I understand how this involves object resurrection. – MikeJ Oct 08 '20 at 12:43
  • @MikeJ single owner should help to get rid of finalizers and use explicit cleanup. Like in the case of [`IMemoryOwner`](https://learn.microsoft.com/en-us/dotnet/api/system.buffers.imemoryowner-1?view=netcore-3.1). Obviously, this is just assumption, it all depends on OP use case. – Konrad Kokosa Oct 08 '20 at 15:48
  • @KonradKokosa sorry I realize my comment was not clear. I mean I don't realize how a finalizer would help in this scenario not the single owner. The single owner would solve the problem. – MikeJ Oct 08 '20 at 15:50