
I'm currently implementing my own garbage collector (in C++) using reference counting. However, there's a major problem: objects that reference each other in a cycle are never collected, because their reference counts never drop to zero.

I searched around and found things called tracing garbage collectors, the mark-and-sweep algorithm, and so on. Is it possible for me to implement one? And how exactly do they work?

Charles
IcySnow
  • Look into weak references. Also, try to avoid circular references altogether. – Cat Plus Plus Dec 19 '11 at 17:25
  • This isn't a very well formed question. Of course you can implement one, and you should look into a good programming languages book to understand how those garbage collection algorithms work. – Michael Price Dec 19 '11 at 17:29
  • Agreed with CPP: If you think about it, there can never be a truly symmetric circular reference. Someone always has to come first. So the final edge in the "circle" should be a "weak reference", which solves the problem. – Kerrek SB Dec 19 '11 at 17:36
  • @Michael Price: Most resources I've found so far just go on and on explaining about various terms, and none of them actually bothered to give some examples. I really don't care how the OS, or the compiler, or the Java language collect the garbage, I care about how I can do it myself. And so far, no luck. Reference counting does not accurately collect all the garbage, as mentioned in the question. – IcySnow Dec 19 '11 at 17:36
  • @Cat: "Also, try to avoid circular references altogether": It's not the business of a garbage collector to impose restrictions on the programmer! – TonyK Dec 19 '11 at 17:37
  • I'm sure it's possible to implement it, but I'm not sure you really want to :) – ScarletAmaranth Dec 19 '11 at 17:42
  • @TonyK: while it may not be desirable, ALL garbage collectors place restrictions on the programmer, so when writing a program using a garbage collector you need to be aware of its foibles and restrictions. – Chris Dodd Dec 19 '11 at 19:21
  • @Chris: Then Cat Plus Plus should have said: "Also, inform your users that they should try to avoid circular references altogether." Which means something completely different. – TonyK Dec 19 '11 at 19:23
  • @TonyK: Eh? That reference counting cannot deal with circular references very well is a well-known limitation. I was talking about the application developer that introduces circular references in their code. – Cat Plus Plus Dec 19 '11 at 19:52
  • @CatPlusPlus: There are many problems that require circular references between objects. Imposing that restriction because of an implementation detail (garbage collection can be implemented otherwise) is ridiculous. – André Caron Dec 19 '11 at 20:12

2 Answers


This is a classic problem in garbage collector design. Take a look at the Garbage Collection article on Wikipedia; it does a really good job of presenting the different trade-offs in garbage collector design. The "more evolved" algorithms like tri-color marking are actually quite simple and easy to implement. I've used those instructions to implement a tracing collector for my own Lisp implementation in C.
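To give an idea of how small the core of tri-color marking can be, here is a minimal sketch. It assumes the object graph is given as plain adjacency lists of indices; `Graph`, `Color`, `tri_color_mark` and `count_garbage` are names made up for this illustration, not part of any library, and a real collector would work on actual heap objects rather than indices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration types; nothing here is a real library API.
// White = not yet reached, Gray = reached but not yet scanned,
// Black = reached and fully scanned.
enum class Color { White, Gray, Black };

// edges[i] lists the indices of the objects that object i references.
using Graph = std::vector<std::vector<std::size_t>>;

// Colors every object reachable from the roots Black.
// Anything still White afterwards is unreachable, i.e. garbage.
std::vector<Color> tri_color_mark(const Graph& edges,
                                  const std::vector<std::size_t>& roots) {
    std::vector<Color> color(edges.size(), Color::White);
    std::vector<std::size_t> gray;  // worklist of Gray objects

    for (std::size_t r : roots) {
        assert(r < edges.size());
        if (color[r] == Color::White) {
            color[r] = Color::Gray;
            gray.push_back(r);
        }
    }

    while (!gray.empty()) {
        std::size_t obj = gray.back();
        gray.pop_back();
        for (std::size_t child : edges[obj]) {
            assert(child < edges.size());
            if (color[child] == Color::White) {
                color[child] = Color::Gray;  // discovered, scan it later
                gray.push_back(child);
            }
        }
        color[obj] = Color::Black;  // all of obj's references have been seen
    }
    return color;
}

// Counts the objects that were never reached from a root.
std::size_t count_garbage(const std::vector<Color>& color) {
    std::size_t n = 0;
    for (Color c : color) {
        if (c == Color::White) ++n;
    }
    return n;
}
```

With four objects where 0 and 1 reference each other, 2 and 3 reference each other, and only 0 is a root, objects 2 and 3 stay White even though they form a cycle. That is exactly the case plain reference counting misses.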

The most complex thing to handle in tracing garbage collectors is walking the object graph (i.e. finding references to "live" objects). If you are writing an interpreter for another language, this is not too hard because you can wire in facilities for this in your root object class (or another common denominator of all objects). However, if you're writing a garbage collector for C++ in C++, then you'll have a hard time doing this because you need to inspect object contents to find pointers to other allocated regions of memory.
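As a rough sketch of the "wire it into the root object class" approach, the following shows a tiny mark-and-sweep heap. Everything in it (`GcObject`, `Heap`, `Node`, `trace`, `collect`) is a hypothetical name invented for this example. It assumes every collected type derives from one base class and reports its outgoing references by hand, which sidesteps the pointer-discovery problem rather than solving it for arbitrary C++ objects.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical common base class: every collected type derives from it.
class GcObject {
public:
    virtual ~GcObject() = default;
    // Each subclass appends the objects it references to `out`.
    virtual void trace(std::vector<GcObject*>& out) const = 0;

private:
    friend class Heap;
    bool marked_ = false;
};

// Hypothetical heap that owns all objects and frees the unreachable ones.
class Heap {
public:
    // Allocates an object owned by the heap; the returned pointer is non-owning.
    template <class T, class... Args>
    T* make(Args&&... args) {
        auto obj = std::make_unique<T>(std::forward<Args>(args)...);
        T* raw = obj.get();
        objects_.push_back(std::move(obj));
        return raw;
    }

    // Roots are the objects the program can reach directly.
    void add_root(GcObject* obj) {
        assert(obj != nullptr);
        roots_.push_back(obj);
    }

    std::size_t live_count() const { return objects_.size(); }

    // Runs one mark-and-sweep cycle and returns how many objects were freed.
    std::size_t collect() {
        // Mark phase: flag everything reachable from the roots.
        std::vector<GcObject*> work(roots_);
        while (!work.empty()) {
            GcObject* obj = work.back();
            work.pop_back();
            if (obj == nullptr || obj->marked_) continue;
            obj->marked_ = true;
            obj->trace(work);
        }

        // Sweep phase: keep marked objects (clearing the flag for next time);
        // unmarked ones are destroyed when the old vector is replaced.
        const std::size_t before = objects_.size();
        std::vector<std::unique_ptr<GcObject>> survivors;
        for (auto& obj : objects_) {
            if (obj->marked_) {
                obj->marked_ = false;
                survivors.push_back(std::move(obj));
            }
        }
        objects_ = std::move(survivors);
        return before - objects_.size();
    }

private:
    std::vector<std::unique_ptr<GcObject>> objects_;
    std::vector<GcObject*> roots_;
};

// Example collected type: a singly linked node that may be part of a cycle.
struct Node : GcObject {
    Node* next = nullptr;
    void trace(std::vector<GcObject*>& out) const override {
        out.push_back(next);  // may be null; collect() skips null entries
    }
};
```

If you create two `Node`s that point at each other and register neither as a root, `collect()` frees both, because reachability rather than a count decides what survives. The price is that every collected class has to implement `trace` correctly; forgetting a field means a live object gets freed.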

If you are writing a garbage collector for educational purposes, I recommend that you look into writing an interpreter for another language (one that does not have direct access to pointers). If you're writing a collector for C++ in C++ with the intent of using it in production software, I strongly recommend that you use an existing production-quality implementation instead.

André Caron

http://www.boost.org/doc/libs/1_48_0/libs/smart_ptr/weak_ptr.htm

"The weak_ptr class template stores a "weak reference" to an object that's already managed by a shared_ptr. To access the object, a weak_ptr can be converted to a shared_ptr using the shared_ptr constructor or the member function lock. When the last shared_ptr to the object goes away and the object is deleted, the attempt to obtain a shared_ptr from the weak_ptr instances that refer to the deleted object will fail: the constructor will throw an exception of type boost::bad_weak_ptr, and weak_ptr::lock will return an empty shared_ptr."

You really shouldn't have circular references, but if you're working with a design where you can't refactor them out (which does happen occasionally), try using weak pointers for one of the directions so the cycle doesn't prevent destruction.
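Here is a small sketch of that idea. Note the assumption: it uses `std::shared_ptr` and `std::weak_ptr` from `<memory>` (the C++11 standard counterparts) instead of the Boost versions quoted above. The types `StrongNode`, `Parent`, `Child` and the two helper functions are made up for the illustration.

```cpp
#include <cassert>
#include <memory>

// Two nodes that own each other: a strong cycle.
struct StrongNode {
    std::shared_ptr<StrongNode> next;
};

struct Child;

struct Parent {
    std::shared_ptr<Child> child;  // owning edge, parent keeps child alive
};

struct Child {
    std::weak_ptr<Parent> parent;  // non-owning back edge, breaks the cycle
};

// Returns true if a strong cycle is still alive after all outside handles are gone.
bool strong_cycle_leaks() {
    std::weak_ptr<StrongNode> observer;
    {
        auto a = std::make_shared<StrongNode>();
        auto b = std::make_shared<StrongNode>();
        a->next = b;
        b->next = a;
        observer = a;
    }  // a and b go out of scope, but each node still holds the other
    const bool leaked = !observer.expired();
    // Break the cycle by hand so this demo itself does not leak memory.
    if (auto a = observer.lock()) {
        a->next.reset();
    }
    return leaked;
}

// Returns true if the parent is destroyed once the last outside handle is gone.
bool weak_back_edge_is_released() {
    std::weak_ptr<Parent> observer;
    {
        auto parent = std::make_shared<Parent>();
        parent->child = std::make_shared<Child>();
        parent->child->parent = parent;
        observer = parent;
        assert(parent.use_count() == 1);  // weak references do not add to the count
        assert(parent->child->parent.lock() == parent);  // still reachable via lock()
    }
    return observer.expired();
}
```

Both functions return `true`: the strong cycle survives after its handles go away, while the parent/child pair with a weak back edge is destroyed as soon as the last `shared_ptr` to the parent is dropped.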

John Humphreys
  • This is good advice if you're managing resources without a garbage collector, but doesn't help the OP to implement a garbage collector. – Mike Seymour Dec 19 '11 at 17:40
  • +1 - This answer is useful as an easy way to solve the problem behind the literal question. – Andy Thomas Dec 19 '11 at 17:46
  • Yes, I'm trying to solve the problem I'm having with my current GC. Because if the user uses it to control something like a circularly linked list, then it fails. – IcySnow Dec 19 '11 at 17:47
  • -1 To offset the +1: this is rarely a workable solution, and cycles are implicit in many data structures (graphs, etc.). – James Kanze Dec 19 '11 at 18:43
  • @user1065635 - Are you (or your team) the user, or do you have external users? – Andy Thomas Dec 19 '11 at 22:44