
Before smart pointers (capable of taking ownership of resources in the dynamic region and freeing them after use) came into being, I wonder how bookkeeping on dynamically created objects was performed when passed as arguments to functions that took resource pointers.

By bookkeeping, I mean that if there is a "new" then at some point later there should be a "delete" following it. Otherwise, the program will suffer from a memory leak.

Here is an example with B being a class and void a_function(B*) being a third party library function:

int main() {

  B* b = new B(); // line1

  a_function(b);  // line2

  ???             // line3
}

What do I do in line 3? Do I assume that the third party function has taken care of de-allocating the memory? If it has not and I assume that it has, then my program suffers from a memory leak. But, if it de-allocates the memory occupied by b and I too do it in main() so as to be on the safe side, then b actually ends up being freed twice! My program will crash due to a double-free error!

softwarelover
  • The only answer to your question is to be very very careful, and experience has taught that this is too much to ask even from good programmers. Smart pointers have existed at least as long as C++. The design of C++ (in the overload of operator->) is clearly designed to support smart pointers. – john Sep 08 '12 at 18:21
  • @Martinho: By that do you mean auto_ptr objects? But it failed to be what it should have been even in the pre-C++11 era. It is now deprecated. – softwarelover Sep 08 '12 at 18:23
  • Smart pointers have been around since before C++. They have been in C++ from the beginning (at least C++03 if not before) as `std::auto_ptr`. There was nothing wrong with auto_ptr if you knew how it worked, and it did exactly what it was supposed to do (pass ownership), which is what you are doing above. – Martin York Sep 08 '12 at 18:24
  • Smart pointers have always been a possible feature of C++ since day zero, at least to the extent of simple scoped ownership (like `boost::scoped_ptr`) or shared ownership (like `std::shared_ptr`). The sad truth is that many real-world programs simply don't really bother with or care about allocation cleanup correctness. – Kerrek SB Sep 08 '12 at 18:24
  • There seems to be some confusion between smart pointers as a concept, and smart pointers as implemented in the standard library. Even before smart pointers existed in boost or the STL, people coded their own smart pointers. No doubt people even tried to do the same thing in C. – john Sep 08 '12 at 18:24

10 Answers


The two core language features that enable "smart pointers", and more generally the idiom of scope-bound resource management (SBRM, sometimes also onomatopoeically referred to as RAII, for "resource acquisition is initialization"), are:

  • destructors (automatic gotos)

  • unconstrained variables (every object can occur as a variable)

Both of these are fundamental core features of C++ and have always been part of the language. Therefore, smart pointers have always been implementable in C++.
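
To make that concrete, here is a minimal sketch of a hand-rolled scoped owner that relies on nothing but a destructor and an ordinary local variable. The names B, ScopedB and live_inside_scope are hypothetical and exist only for this illustration; it is not meant as a production smart pointer.

```cpp
#include <cassert>

// Hypothetical resource type that counts live instances for the demo.
struct B {
    static int live;
    B()  { ++live; }
    ~B() { --live; }
};
int B::live = 0;

// Minimal scoped owner: deletes the object when the owner goes out of scope.
class ScopedB {
public:
    explicit ScopedB(B* p) : ptr_(p) {}
    ~ScopedB() { delete ptr_; }
    B* get() const { return ptr_; }
    B& operator*() const { return *ptr_; }
    B* operator->() const { return ptr_; }
private:
    ScopedB(const ScopedB&);             // copying disabled, pre-C++11 style
    ScopedB& operator=(const ScopedB&);  // (declared but never defined)
    B* ptr_;
};

// Returns the number of live B objects observed while the owner exists.
int live_inside_scope() {
    ScopedB owner(new B());
    assert(owner.get() != 0);
    return B::live;  // evaluated before owner's destructor runs
}   // owner's destructor runs here and deletes the B
```

Every exit path out of live_inside_scope, including an exception, runs the destructor, which is the "automatic goto" mentioned above.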

[Incidentally, those two features mean that goto is necessary in C to handle resource allocation and multiple exits in a systematic, general fashion, while they are essentially forbidden in C++. C++ absorbs goto into the core language.]

As with any language, it takes a long time before people learn, understand and adopt the "correct" idioms. Especially given the historic connections of C++ with C, lots of programmers who were and are working on C++ projects have come from a C background and have presumably found it more comfortable to stick with familiar patterns, which are still supported by C++ even though they are not advisable ("just replace malloc with new everywhere and we'll be ready to ship").

Kerrek SB
  • I always did hate the whole "move at your own pace" which is double-speak for "if you don't know how to use it, do what you've always done." Hated because it flew in the face of the reasons why new alternatives were ponied up in the first place. You have a language with these features. Adopt ASAP and rip the bandage off. It will hurt for a minute, but you'll feel SO much better later. – WhozCraig Sep 08 '12 at 18:35
  • @CraigNelson: That's true, but we have to accept that *everyone* has to learn the correct idioms. At the beginning of a new language, some of those simply don't exist until humankind as a whole comes to understand them. I've had the pleasure and privilege of learning much of my C++ by interacting on this website, but this personal evolution is not all that different from the community as a whole "figures out the right way to do things". Right now, we're at a similar situation with C++11, and I won't be surprised if we'll learn new "right ways to do things" in the future... – Kerrek SB Sep 08 '12 at 18:40
  • "while they are essentially forbidden in C++" - ironically, `goto` is actually *easier* in C++ from the POV of resource handling than it is in C. Because in C++ you can `goto` out of a scope and automatics are destroyed. Part of a long tradition in C++ of making something easy to use at precisely the same time as making it unnecessary. – Steve Jessop Sep 08 '12 at 19:32
  • @SteveJessop: Yes, that's true, but on an idiomatic level, when you feel the need to say `goto`, you probably want a destructor! – Kerrek SB Sep 08 '12 at 19:33
  • @Kerrek: in practice in C I've systematically used `goto` for two things: cleanup code and doing spaghetti-like state machines without a tiresome "event loop" that adds nothing to the comprehensibility of the code. As you say, the former use is unnecessary in C++. The latter use is rare, since state machines that *I* design are absolute models of elegant structure, clarity and comprehensibility ;-) Very occasionally you have a real situation that genuinely is modelled by pin-balling around a bunch of blocks until one of them halts the state machine by returning. – Steve Jessop Sep 08 '12 at 19:36

Okay, staying off the impending discussion of why this isn't relevant and you should be using smart pointers anyway...

All other things being equal (no custom allocators or anything fancy like that), the rule is that whoever allocates the memory should deallocate the memory. Third-party functions, such as the one in your example, should absolutely never deallocate memory that they didn't allocate, mainly because 1) it's bad practice in general (terrible code smell) and, more importantly, 2) they don't know how the memory was allocated to start with. Imagine the following:

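#include <cstdlib>

// declared in the third-party library's header
void some_awesome_function(void * data);

int main()
{
    void * memory = std::malloc(sizeof(int)); // allocated with malloc
    some_awesome_function(memory);
}

// meanwhile, in a third-party library...

void some_awesome_function(void * data)
{
    // wrong: the library assumes the memory came from new
    delete static_cast<int *>(data);
}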

What happens if malloc/free and new/delete use different allocators? The behavior is undefined, because the allocator behind delete has no idea what to do with memory that came from malloc's allocator. You never free memory that was new'd, and you never delete memory that was malloc'd. Ever.

As for the first point, the fact that you have to ask what would happen if a third-party library deallocated memory and you tried to (or didn't try to) manually free it is exactly why things shouldn't be done that way: because you simply have no way of knowing. So, it's accepted practice that whatever portion of code is responsible for allocation is also responsible for deallocation. If everyone sticks to this rule, everyone can keep track of their memory and nobody is left guessing.

Adam Maras
  • Gospel. There are cases where things might not always appear at first to be "normal" per'se. For example, RPC parameters, specifically [in], [out], [in,out], etc. It may not be clear to the receiver of an [in,out] param that they're responsible for invoking a lib call to free the [in] before populating the [out]. I find such cases to be VERY well documented which sublimes the confusion (but sometimes not much =) – WhozCraig Sep 08 '12 at 21:13

You destroy what you create, the library destroys what it creates.

If you share data with the library (for example, a char* for file data), the library's documentation will specify if it keeps a reference to your data (in which case don't delete your copy until the library is done using it) or makes a copy of your data (in which case it's the library's job to delete the data when done).

sirbrialliance

I see a lot of people pointing out that smart pointers have been around from the beginning of C++. But the fact is that not all code is using them, even today. A common approach is to do reference counting manually:

int main() {
  B* b = createB(); //refcount = 1 
  a_function(b);
  releaseB(b); //--refcount
}

void a_function(B* b) {
  acquireB(b); //refcount++ when we store the reference somewhere
  ...
}
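
A fuller sketch of the same idea, with the pieces that the snippet above leaves out, could look like the following. The bodies of createB, acquireB and releaseB, the g_stored and g_destroyed globals, and library_shutdown are all assumptions made up for this illustration.

```cpp
#include <cassert>

// Hypothetical intrusively reference-counted type.
struct B {
    int refcount;
    B() : refcount(1) {}
};

static int g_destroyed = 0;  // demo-only counter of destroyed objects
static B*  g_stored = 0;     // where the "library" keeps its reference

B* createB() { return new B(); }        // refcount starts at 1
void acquireB(B* b) { ++b->refcount; }
void releaseB(B* b) {
    if (--b->refcount == 0) {           // last reference gone
        delete b;
        ++g_destroyed;
    }
}

// Library side: stores the pointer, so it takes its own reference.
void a_function(B* b) {
    acquireB(b);
    g_stored = b;
}

// Library side: drops the stored reference when it is done with it.
void library_shutdown() {
    if (g_stored != 0) {
        releaseB(g_stored);
        g_stored = 0;
    }
}

// Caller side: returns the refcount left after the caller released its copy.
int caller_demo() {
    B* b = createB();                   // refcount == 1
    a_function(b);                      // refcount == 2
    releaseB(b);                        // refcount == 1, still alive
    assert(g_stored != 0);
    int remaining = g_stored->refcount;
    library_shutdown();                 // refcount == 0, object deleted
    return remaining;
}
```

Neither side needs to know who goes last; whoever releases the final reference triggers the delete.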
Timo

Smart pointers are a way to ease the implementation of a policy. Before them, the same policies (attributing the responsibility to delete to one owner or a set of owners) were used. You just had to document the policy and not forget to act accordingly. Smart pointers are both a way to document the chosen policy and to implement it at the same time. In your case, you looked at the documentation of a_function and saw what it demanded, or took a more or less educated guess if it wasn't documented.

AProgrammer

What do I do in line 3?

You consult the documentation of a_function. The usual rule is that functions do nothing about ownership or lifetime unless they say they do. The need for such documentation is pretty clearly established by reference to C APIs, where smart pointers aren't available.

So, if it doesn't say that it deletes its parameter, then it doesn't. If it doesn't say that it keeps a copy of its parameter beyond the time when it returns, until some other specified time, then it doesn't.

If it says something you act accordingly, and if it says nothing then you delete b (or preferably you write B b; a_function(&b); instead -- observe that by not destroying the object, the function doesn't need to care how you create the object, you're free to decide).
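
Both options can be sketched as follows, assuming a_function is documented (or silently implied) to neither store nor delete its argument. The definitions of B and a_function here are stand-ins invented for the example.

```cpp
#include <cassert>

// Hypothetical class standing in for B from the question.
struct B {
    int value;
    B() : value(0) {}
};

// Hypothetical library function: it only uses the object during the call
// and, by assumption, neither keeps the pointer nor deletes it.
void a_function(B* b) {
    assert(b != 0);
    b->value += 1;
}

// Preferred: automatic storage, so there is nothing to delete.
int automatic_storage() {
    B b;
    a_function(&b);
    return b.value;
}   // b is destroyed here

// The question's version, with line 3 filled in.
int dynamic_storage() {
    B* b = new B();      // line 1
    a_function(b);       // line 2
    int result = b->value;
    delete b;            // line 3: the caller allocated, so the caller deletes
    return result;
}
```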

Hopefully it says whatever it says explicitly, but if you're unlucky it says it via some convention that certain kinds of function in an API take ownership of the objects referred to by their parameters. For example if it's called set_global_B_instance then you might have a sneaking suspicion that it's going to keep that pointer around, and deleting it immediately after setting it would be unwise.

If it doesn't say anything either way, but your code ends up buggy and you eventually discover that a_function called delete on its argument, then you find whoever documented a_function and, resisting the urge to punch them in the nose, you submit a bug on their documentation.

Frequently that person turns out to be yourself, in which case try to learn the lesson -- document object ownership.

As well as helping to avoid coding errors, smart pointers provide some degree of self-documentation for functions that accept or return pointers where there are ownership concerns. In the absence of self-documentation, you have actual documentation. For example, if a function returns auto_ptr instead of a raw pointer, that tells you delete needs to be called on the pointer. You can let the auto_ptr do that for you, or you can assign it to some other smart pointer, or you can release() the pointer and manage it yourself. The function you called doesn't care, and doesn't need to document anything much. If a function returns a raw pointer then it has to tell you something about the lifetime of the object referred to by the pointer, because there's no way for you to guess.
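
Here is a small sketch of that self-documenting effect. It uses std::unique_ptr, the C++11 replacement for auto_ptr (which was deprecated in C++11 and removed in C++17), so it is a swapped-in modern equivalent rather than the auto_ptr code the paragraph describes. make_b and read_value are hypothetical names.

```cpp
#include <cassert>
#include <memory>

struct B {
    int value;
    explicit B(int v) : value(v) {}
};

// The return type alone says: the caller now owns this object.
std::unique_ptr<B> make_b(int v) {
    return std::unique_ptr<B>(new B(v));
}

// Raw pointer parameter: by the convention described above, this function
// only borrows the object and never deletes it.
int read_value(const B* b) {
    assert(b != 0);
    return b->value;
}

// Option 1: let the smart pointer call delete.
int let_smart_pointer_clean_up() {
    std::unique_ptr<B> b = make_b(7);
    return read_value(b.get());
}   // the B is deleted here

// Option 2: release() the pointer and manage it yourself.
int manage_manually() {
    B* raw = make_b(9).release();  // ownership moves to the raw pointer
    int v = read_value(raw);
    delete raw;                    // now this is our job
    return v;
}
```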

Steve Jessop

The answer is in the documentation of third party function a_function(). Possible cases could be:

  • the function just uses data in the object, and will not keep references to it after the function call ended (example: printf). You can safely delete the object after the function call ended.
  • the function (in some internal library object) will keep a reference to the object until a later call (let's say to b_function()). You are responsible for deleting the object, but have to keep it alive until you call b_function (example: strtok).
  • the function takes ownership of the object, and doesn't guarantee the object's existence after it's called (example: free()). In this case, the documentation usually specifies how to create the object (malloc, new, my_library_malloc).

These are only a few examples of the many behaviors that are possible, but as long as the function is documented well enough you should be able to do the right thing.
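
The three cases can be sketched with the standard functions named in the list. The wrapper functions and buffer contents are made up for the illustration; only printf, strtok, malloc and free are real library calls.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Case 1: printf only reads its arguments during the call.
int case_use_only() {
    char* msg = new char[6];
    std::strcpy(msg, "hello");
    int written = std::printf("%s\n", msg);
    delete[] msg;        // safe as soon as the call has returned
    return written;
}

// Case 2: strtok keeps an internal pointer into the buffer between calls,
// so the buffer has to outlive the whole sequence of calls.
int case_keeps_reference() {
    char* text = new char[6];
    std::strcpy(text, "a,b,c");
    int tokens = 0;
    for (char* tok = std::strtok(text, ","); tok != 0;
         tok = std::strtok(0, ",")) {
        ++tokens;
    }
    delete[] text;       // only after the last strtok call on this buffer
    return tokens;
}

// Case 3: free takes ownership. Its documentation says the pointer must
// come from malloc/calloc/realloc, and it must not be used afterwards.
bool case_takes_ownership() {
    void* block = std::malloc(16);
    if (block == 0) return false;
    std::free(block);
    return true;
}
```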

pqnet

Just look to C APIs for hints. It's pretty common for C APIs to provide explicit create and destroy functions. These typically follow some formal naming convention in the libraries.
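
As a rough sketch of that convention, a library might pair every create function with a destroy function so that allocation and deallocation always happen on the same side. The widget type and the widget_* names are hypothetical, not taken from any real library.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical C-style library type.
struct widget {
    int size;
};

// The library allocates...
widget* widget_create(int size) {
    widget* w = static_cast<widget*>(std::malloc(sizeof(widget)));
    if (w != 0) {
        w->size = size;
    }
    return w;
}

// ...functions without "create" or "destroy" in the name only borrow...
int widget_size(const widget* w) {
    assert(w != 0);
    return w->size;
}

// ...and the library deallocates, using the matching deallocator.
void widget_destroy(widget* w) {
    std::free(w);
}

// Caller side: never calls free or delete on the widget directly.
int use_widget() {
    widget* w = widget_create(3);
    if (w == 0) return -1;
    int size = widget_size(w);
    widget_destroy(w);   // w must not be used after this point
    return size;
}
```

The caller never needs to know whether the library used malloc, new, or a custom pool, which also sidesteps the mismatched-allocator problem.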

Using your example, it would be a bad design if a_function deleted or freed the parameter without being explicitly labeled as a destroy function (in which case, you should not use that parameter after calling the function). In most cases, it is a bad design to assume that it is safe to destroy objects you do not own. Of course, with smart pointers, mechanisms of ownership, lifetimes, and cleanup are often handled by the smart pointer where possible.

So yes, people used new and delete, and although I wasn't writing C++ before templates -- it would have been more common to see explicit new and delete in programs. Smart pointers are not a very good means to transfer objects and convey ownership without templates -- which, along with exceptions, were introduced around 1990 (7 years after C++ was available). Naturally, it took some time for compilers to support all these features, and for people to implement containers and improve on those implementations. Note that it was possible before templates, but it wasn't always practical to implement/clone a container for arbitrary types because the language did not support generics well prior to templates. Of course, a concrete class with concrete types could easily accomplish the mechanics of smart pointers where the type was invariant in those days… but that does result in forms of code duplication when generics are not available.

But even today, it's an unusual design for a smart pointer parameter's content object to be replaced or destroyed, unless clearly labeled. The likelihood of this is also decreased because it's unusual to pass the smart pointer as the parameter, rather than the object it holds. So the number of memory-related bugs has decreased since then, but some caution and good ownership conventions should still be observed.

justin

Simple answer: read the docs. This is a common thing in C interfaces, and because resource management is an important part of the interface, a function that claims ownership of the object will document it.

David Rodríguez - dribeas

Without smart pointers, programmers generally adopt the rule that the entity that allocated is responsible for deallocating.

In your example, it would typically be considered Bad Behavior (despite being valid code) for your third party function to delete the pointer passed to it, and you would be expected to delete it in line 3.

This is a social contract between programmers, and a compiler would typically not enforce this.

Drew Dormann