39

I've been trying to find RAII in Python. Resource Acquisition Is Initialization is a pattern in C++ whereby an object is fully initialized as it is created. If initialization fails, the constructor throws an exception. In this way, the programmer knows that the object will never be left in a half-constructed state. Python can do this much.

But RAII also works with the scoping rules of C++ to ensure the prompt destruction of the object. As soon as the variable pops off the stack it is destroyed. This may happen in Python, but only if there are no external or circular references.

More importantly, a name for an object still exists until the function it is in exits (and sometimes longer). Variables at the module level will stick around for the life of the module.

I'd like to get an error if I do something like this:

for x in some_list:
    ...

... 100 lines later ...

for i in x:
    # Oops! Forgot to define x first, but... where's my error?
    ...

I could manually delete the names after I've used them, but that would be quite ugly, and require effort on my part.
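For illustration, here is a minimal sketch of what that manual cleanup looks like. The names `some_list` and `process` are made up for the example. Inside a function, using a name after `del` raises `UnboundLocalError`, which is a subclass of `NameError`.

```python
some_list = [[1, 2], [3, 4]]  # hypothetical sample data


def process():
    for x in some_list:
        pass
    # The loop variable is still bound after the loop, to the last item.
    leaked = x

    del x  # manually unbind the name

    try:
        for i in x:  # the stale use now fails loudly
            pass
    except NameError as exc:  # UnboundLocalError is a subclass of NameError
        return leaked, type(exc).__name__


result = process()
```

It works, but it is exactly the kind of bookkeeping the question is trying to avoid.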

And I'd like it to Do-What-I-Mean in this case:

for x in some_list:
    surface = x.getSurface()
    new_points = []
    for x,y,z in surface.points:
        ...     # Do something with the points
        new_points.append( (x,y,z) )
    surface.points = new_points
    x.setSurface(surface)

Python does some scoping, but not at the indentation level, just at the functional level. It seems silly to require that I make a new function just to scope the variables so I can reuse a name.

Python 2.5 has the "with" statement but that requires that I explicitly put in __enter__ and __exit__ functions and generally seems more oriented towards cleaning up resources like files and mutex locks regardless of the exit vector. It doesn't help with scoping. Or am I missing something?
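As a side note, a sketch of the generator-based form from the standard library's `contextlib`, which avoids writing `__enter__` and `__exit__` by hand. The names `managed`, `events`, and `res` are hypothetical. It also shows the point made above: cleanup runs on the exception path, but the name bound by `as` stays around afterwards.

```python
from contextlib import contextmanager

events = []  # records what happened, in order


@contextmanager
def managed(name):
    events.append("acquire " + name)
    try:
        yield name
    finally:
        # Runs whether the block exits normally or via an exception.
        events.append("release " + name)


try:
    with managed("lock") as res:
        raise RuntimeError("boom")
except RuntimeError:
    pass

# The cleanup has run, yet the name is still bound in this scope.
still_bound = res
```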

I've searched for "Python RAII" and "Python scope" and I wasn't able to find anything that addressed the issue directly and authoritatively. I've looked over all the PEPs. The concept doesn't seem to be addressed within Python.

Am I a bad person because I want to have scoping variables in Python? Is that just too un-Pythonic?

Am I not grokking it?

Perhaps I'm trying to take away the benefits of the dynamic aspects of the language. Is it selfish to sometimes want scope enforced?

Am I lazy for wanting the compiler/interpreter to catch my negligent variable reuse mistakes? Well, yes, of course I'm lazy, but am I lazy in a bad way?

markets
  • 7
    RAII *is* equivalent to Python's `with` statement (I'm actually surprised that acronym didn't make it into PEP 343 - it was certainly thrown around a lot in the associated discussions). The missing element you're asking for is the anonymous scoping rules from C/C++ and no, Python doesn't have those (although you may find the deferred PEP 3150 of interest). That's completely independent of the RAII question, though. – ncoghlan Feb 21 '11 at 22:10
  • @ncoghlin: Thanks for the answer. I thought PEP 3150 was answering my question, but then I realized that it says nothing of removing the namespace. It is purely to move the function call above the code that gets values for parameters so that it is more obvious what the code is accomplishing. What I don't like about `with` is that the name doesn't go away, and that I have to move code into `__exit__` instead of `__del__`. Perhaps there is a clever wrapper object that would `del` the object name and not require that I code an `__exit__` function? Ideally it would assert that the ref count was 0. – markets Feb 21 '11 at 23:13
  • 5
    even if the refcount > 0, you can `del` the object name from your own scope and get rid of the name - this is totally separate from actually destroying the object. For instance, if an object had been inserted into a container, its refcount would be > 0, but you could still `del` the name you used to refer to the object itself. Maybe that's part of your confusion - `del` deletes the name, not the underlying object. This is one distinction in Python that C/C++ people struggle with - you are dealing with names and bindings to names, not variables/references that are alloc'ed and freed. – PaulMcG Feb 22 '11 at 15:55
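The distinction in the last comment, between deleting a name and destroying an object, can be sketched as follows. `Resource` and `container` are hypothetical names used only for this example.

```python
class Resource:
    def __init__(self, label):
        self.label = label


container = []
obj = Resource("a")
container.append(obj)  # a second reference to the same object

del obj  # removes the name only; the object itself is untouched

try:
    obj
except NameError:
    name_gone = True
else:
    name_gone = False

survivor = container[0]  # the object is still reachable through the list
```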

6 Answers

41

tl;dr: RAII is not possible. You are mixing it up with scoping in general, and if you miss those extra scopes you're probably writing bad code.

Perhaps I don't get your question(s), or you don't get some very essential things about Python... First off, deterministic object destruction tied to scope is impossible in a garbage collected language. Variables in Python are merely references. You wouldn't want a malloc'd chunk of memory to be free'd as soon as a pointer pointing to it goes out of scope, would you? Practical exception in some circumstances if you happen to use ref counting - but no language is insane enough to set the exact implementation in stone.

And even if you have reference counting, as in CPython, it's an implementation detail. Generally, including in Python which has various implementations not using ref counting, you should code as if every object hangs around until memory runs out.
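To make the "implementation detail" point concrete, here is a small sketch of a case where even CPython's reference counting does not clean up promptly: a reference cycle. The `Node` class and the `finalized` list are hypothetical. The comments describe CPython 3.4 or later; other implementations may behave differently.

```python
import gc

finalized = []  # names of objects whose __del__ has run


class Node:
    def __init__(self, name):
        self.name = name
        self.other = None

    def __del__(self):
        finalized.append(self.name)


was_enabled = gc.isenabled()
gc.disable()  # keep the cyclic collector from running on its own during the demo
try:
    a, b = Node("a"), Node("b")
    a.other, b.other = b, a  # reference cycle: refcounts never reach zero
    del a, b
    before = list(finalized)  # on CPython: nothing finalized yet
    gc.collect()  # the cyclic collector finds and finalizes the pair
    after = sorted(finalized)
finally:
    if was_enabled:
        gc.enable()
```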

As for names existing for the rest of a function invocation: You can remove a name from the current or global scope via the del statement. However, this has nothing to do with manual memory management. It just removes the reference. That may or may not happen to trigger the referenced object to be GC'd and is not the point of the exercise.

  • If your code is long enough for this to cause name clashes, you should write smaller functions. And use more descriptive, less likely-to-clash names. Same for nested loops overwriting the outer loop's iteration variable: I have yet to run into this issue, so perhaps your names are not descriptive enough or you should factor these loops apart?

You are correct, with has nothing to do with scoping, just with deterministic cleanup (so it overlaps with RAII in the ends, but not in the means).

Perhaps I'm trying to take away the benefits of the dynamic aspects of the language. Is it selfish to sometimes want scope enforced?

No. Decent lexical scoping is a merit independent of dynamic-/staticness. Admittedly, Python (2 - 3 pretty much fixed this) has weaknesses in this regard, although they're more in the realm of closures.

But to explain "why": Python must be conservative about where it starts a new scope because, without a declaration saying otherwise, assignment to a name makes it local to the innermost/current scope. So e.g. if a for loop had its own scope, you couldn't easily modify variables outside of the loop.
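A sketch of that rule in Python 3, with hypothetical function names. The for loop shares the function's scope, so `count += 1` just works. A nested function is a new scope, so the same kind of assignment needs a `nonlocal` declaration, and fails without one.

```python
def count_matches(items, wanted):
    count = 0
    for item in items:
        if item == wanted:
            count += 1  # rebinds the function's own local; the loop adds no scope
    return count


def make_counter():
    total = 0

    def bump():
        nonlocal total  # required: bump() is a new scope
        total += 1
        return total

    return bump


def make_broken_counter():
    total = 0

    def bump():
        total += 1  # assignment makes `total` local to bump(), so this read fails
        return total

    return bump


bump = make_counter()
broken = make_broken_counter()
```

If every for loop were its own scope, `count_matches` would need a declaration like the one in `make_counter` just to update its counter.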

Am I lazy for wanting the compiler/interpreter to catch my negligent variable reuse mistakes? Well, yes, of course I'm lazy, but am I lazy in a bad way?

Again, I imagine that accidental reuse of a name (in a way that introduces errors or pitfalls) is rare and a small problem anyway.

Edit: To state this again as clearly as possible:

  • There can't be stack-based cleanup in a language using GC. It's just not possible, by definition: a variable is one of potentially many references to objects on the heap that neither know nor care about when variables go out of scope, and all memory management lies in the hands of the GC, which runs when it likes to, not when a stack frame is popped. Resource cleanup is solved differently, see below.
  • Deterministic cleanup happens through the with statement. Yes, it doesn't introduce a new scope (see below), because that's not what it's for. It doesn't matter that the name the managed object is bound to isn't removed: the cleanup happened nonetheless, and what remains is a "don't touch me, I'm unusable" object (e.g. a closed file stream).
  • Python has a scope per function, class, and module. Period. If you want/"need" more fine-grained scoping, break the code into more fine-grained functions. You might wish for finer scoping, but it doesn't exist, for the reasons given earlier in this answer (three paragraphs above the "Edit:"). Like it or not, this is how the language works.
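As a sketch of the "break it into functions" advice applied to the second example in the question: the inner loop moves into a helper, so its `x`, `y`, `z` can no longer clobber the outer loop variable. `Surface`, `Shape`, `transformed`, and `update_all` are hypothetical stand-ins, and the `x + 1` is only a placeholder for "do something with the points".

```python
class Surface:  # hypothetical stand-in for the question's surface object
    def __init__(self, points):
        self.points = points


class Shape:  # hypothetical stand-in for the items of some_list
    def __init__(self, surface):
        self._surface = surface

    def getSurface(self):
        return self._surface

    def setSurface(self, surface):
        self._surface = surface


def transformed(points):
    # x, y, z exist only inside this function (and, in Python 3, only
    # inside the comprehension).
    return [(x + 1, y, z) for x, y, z in points]


def update_all(shapes):
    for shape in shapes:
        surface = shape.getSurface()
        surface.points = transformed(surface.points)
        shape.setSurface(surface)  # `shape` was never rebound by the inner loop


shapes = [Shape(Surface([(0, 0, 0), (1, 2, 3)]))]
update_all(shapes)
```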
Jan Rüegg
  • Thanks for the answer. Ideally I'd indicate a variable should have limited scope with a keyword like "auto". If at the end of the block there are any additional references to it, then it should raise an Exception. If not, then the object should be immediately destroyed. That should be okay with GC, shouldn't it? Regarding using smaller functions: I don't have time to spend refactoring other people's code, and it wouldn't solve the problem anyway, since it can happen in (nearly) arbitrarily small functions... it is just easier to happen in large functions. – markets Feb 21 '11 at 21:29
  • 1
    @markets: That would be a pretty large feature though, and since the suggestion comes from a C++ programmer not really comfortable with Python yet... not to mention that the whole point of GC is not having to worry about who has references to whom and when it's safe to delete (or rather, when it can be deleted by some memory manager). –  Feb 21 '11 at 21:33
  • 1
    @markets: ... And as for name clashes: Really, I (and judging from the general lack of comments on this, it seems I'm not alone here) encounter this problem so rarely I hardly notice, and if it occurs then there's an equally good name for one of the two. And if it does become unbearable, well sorry, but then you'll have to find another scope, either by breaking the function up or finding a hidden feature in Python that provides this... –  Feb 21 '11 at 21:36
  • 1
    "If at the end of the block there are any additional reference to it, then it should raise an Exception. If not, then the object should be immediately destroyed." That would mean you'd have to run a GC every time a variable goes out of scope, which is not feasible for all implementations. There's a reason no garbage collected language implements destructors in the same way as C++ and this plays into this problem here as well. RAII is mostly a pattern invented in C++ because of its shortcomings (I wouldn't dare to declare a large c++ project exception safe if it didn't use RAII) – Voo Feb 21 '11 at 22:13
  • @Voo: would you think a Python program is exception safe if it doesn't use RAII? That's my point. I'm not questioning whether RAII is useful. Delnan seems to think that RAII is marginally useful. Perhaps I'm not thinking Pythonic enough, maybe I'm clinging to the safety of the stack. But I think it is a nearly necessary tool for correct logic flow and exception safety. And why would you have to run GC? Just check the ref count. You never get a ref count *too low* that requires you to do a full GC. – markets Feb 21 '11 at 22:37
  • @delnan: I haven't looked into the source code myself, but I think it would be easier than the with-statement, recently added. In certain circumstances you might want to assert that there are no external references. If the resource won't close until the ref goes to zero, I have a program-correctness interest in asserting that. GC is an imperfect model for the behavior I want, but it is the closest thing that Python offers. – markets Feb 21 '11 at 22:46
  • 2
    @markets: Well, what are you asking RAII for? For cleanup? That's what `with` is for. Because resource cleanup should be in `__exit__` or, if you're a poor soul who never heard of context managers, in some externally manageable resource. GC is pretty much by definition non deterministic, and you won't get Python to perform a face heel turn and re-introduce manual memory management or some slightly discharged variant of it. Work with the language as it is or use another one. What would you think about demands for GC in C++? –  Feb 21 '11 at 22:54
  • @delnan: GC is nondeterministic and that's a good thing: that's fine, I'm not advocating that we change GC, I'm asking if I'm not grokking Pythonic thinking by wanting RAII. Everybody seems to be saying yes, but they aren't telling me how I *should* be thinking. What is the replacement? `__exit__` is a poor replacement since I want the name to go away without a `del`, and `__exit__` isn't called except from `with`. It could be much simpler. – markets Feb 21 '11 at 23:06
  • 3
    @markets: As I already said, `with` *is* the way to go for deterministic cleanup. And although the name remains after the `with` block finished, the cleanup is done and thus you get the same effect as applied RAII, albeit through other means. Or do you misuse the term RAII and actually just want more fine-grained scoping? –  Feb 21 '11 at 23:19
  • @delnan: Mostly I'm looking for the right Python meme that takes the place of C++ stack based scopes. It seems to me like it could (and should) be supported in Python, unless I'm thinking about it wrong. I'm interested in both the name scope and the resource cleanup, in a clean representation. I'll be shocked if the bottom line is that C++ has a cleaner representation for this than Python! – markets Feb 21 '11 at 23:31
  • @markets: A quite long answer and a dozen long comments didn't get anywhere. Perhaps you should edit the question to be *very* precise, like, with examples. I don't get how the things you named now relate, nor how we didn't answer them. –  Feb 21 '11 at 23:37
  • @delnan: quite right, we haven't gotten anywhere. My initial research showed me `with` but after initial excitement I realized it wasn't what I was looking for. I'm resigned to Python having no appropriate stack-based scoping alternative (that includes the name). Perhaps I should remove RAII from my question, but I feel it is an integral concept to the issue of the name scope. The question has two examples which both illustrate the issue of the variable name not going away. Thanks for the help, though. At least now I know I'm not missing something obvious. – markets Feb 21 '11 at 23:46
  • 1
    @markets: I feel you still don't get it (where "it" is "you're asking the wrong question" and "why it's the wrong question"). Please read my edit. –  Feb 22 '11 at 14:30
  • 1
    @delnan: Thanks for all your effort on this, but I think I get `with` and it isn't what I want, but I guess that is the "Pythonic way", which is part of my question. BTW, your last point seems to be contradicted by the generator scope, and list comprehensions have their own scope in Python 3000 (see other answers). Things seem to be trending my way. This also invalidates your first point, as I don't demand stack based implementation, just behavior. Let's just agree that I'm stuck in a C++ mindset and I want names to go out of scope and Python doesn't meet me there yet. I am unPythonic. :( – markets Feb 22 '11 at 16:46
  • @markets: Yes, generators and comprehensions don't leak the iteration variable(s) to the outside, but I glossed over them as they are very limited - only one or a few variables that are used through one iteration, and not used for side effects (well, that's the idiom). As for "stack-based behaviour" - *there is a stack* in Python as in about every other language. The only difference you seem to care about is how frequently a new scope/stack frame is started. –  Feb 22 '11 at 17:00
  • 1
    @markets: RAII is *not* an integral concept to the issue of name scope. That's a very C++-specific view. Older languages tend not to have destructors and newer languages tend not to have value semantics. – dan04 Feb 23 '11 at 08:06
  • Can we verify this answer is still correct in 2016? The documentation for \_\_del\_\_ says it's triggered by reference counts and not by GC. I also slapped together this code sample that seems to show the destructor being invoked at end of scope even without GC. repl.it/ClkR So it *seems* like RAII is absolutely possible. – Jeff M Aug 10 '16 at 17:38
  • @JeffM The answer holds. As I mentioned (but not stressed a lot because it's ultimately a red herring), in CPython you do have refcounting and thus in simple cases get immediate cleanup. However, **this is an implementation detail** (e.g. PyPy and Jython differ). Furthermore, as the [`__del__` documentation] mentions, (1) `del` does not at all guarantee that `__del__` is called (if there are other references), and (2) circular references are not cleaned up by refcounting alone. –  Aug 10 '16 at 17:51
  • quote: "deterministic object destruction tied to the scope is impossible in a garbage collected language" - This is nonsense. The object can be finalized immediately when the reference count reaches 0. Memory management has nothing to do with it: whether the memory is freed for reuse during garbage collection or immediately pushed back to the heap is a separate question. – Martin Skalský May 19 '23 at 09:04
18
  1. You are right about with -- it is completely unrelated to variable scoping.

  2. Avoid global variables if you think they are a problem. This includes module level variables.

  3. The main tool for hiding state in Python is the class.

  4. Generator expressions (and in Python 3 also list comprehensions) have their own scope.

  5. If your functions are long enough for you to lose track of the local variables, you should probably refactor your code.
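Point 4 can be checked directly in Python 3. A small sketch (all variable names are just for the demo): the comprehension and the generator expression leave the outer `x` alone, while an ordinary for statement rebinds it.

```python
x = "outer"

squares = [x * x for x in range(4)]  # list comprehension: own scope in Python 3
total = sum(x for x in range(4))  # generator expression: own scope
after_comprehensions = x  # the outer binding is untouched

for x in range(4):  # a for statement shares the enclosing scope
    pass
after_loop = x  # rebound to the last value from the loop
```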

Sven Marnach
  • 9
    Point 5 suggests that Sven has nothing better to do with his time than refactor other people's code whenever he is called upon to touch it. While I sympathize with the sentiment, and I've certainly read it a lot in books, I don't have that luxury at work. – markets Feb 21 '11 at 22:50
12

But RAII also works with the scoping rules of C++ to ensure the prompt destruction of the object.

This is considered unimportant in GC languages, which are based on the idea that memory is fungible. There is no pressing need to reclaim an object's memory as long as there's enough memory elsewhere to allocate new objects. Non-fungible resources like file handles, sockets, and mutexes are considered a special case to be dealt with specially (e.g., with). This contrasts with C++'s model that treats all resources the same.
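A sketch of that special-case handling for a mutex, using `threading.Lock` from the standard library. The `update` function and the `shared` dict are hypothetical. The lock is released when the `with` block exits, including when it exits through an exception.

```python
import threading

lock = threading.Lock()


def update(shared, key, value):
    with lock:  # acquired here, released on any exit from the block
        if value is None:
            raise ValueError("value must not be None")
        shared[key] = value


shared = {}
update(shared, "a", 1)
try:
    update(shared, "b", None)  # raises inside the with block
except ValueError:
    pass

released = not lock.locked()  # the lock was released despite the exception
```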

As soon as the variable pops off the stack it is destroyed.

Python doesn't have stack variables. In C++ terms, everything is a shared_ptr.
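A rough illustration of the shared_ptr analogy (the names are made up for the example): assignment and argument passing copy the reference, never the object, and dropping one name leaves the object alive as long as another reference exists.

```python
def add_item(target):
    target.append(3)  # mutates the caller's object; nothing was copied


a = [1, 2]
b = a  # a second reference to the same list, not a copy
add_item(b)

same_object = a is b
seen_through_a = list(a)  # the change made via b is visible via a

del a  # drops one reference; the list lives on through b
```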

Python does some scoping, but not at the indentation level, just at the functional level. It seems silly to require that I make a new function just to scope the variables so I can reuse a name.

It also does scoping at the generator comprehension level (and in 3.x, in all comprehensions).

If you don't want to clobber your for loop variables, don't use so many for loops. In particular, it's un-Pythonic to use append in a loop. Instead of:

new_points = []
for x,y,z in surface.points:
    ...     # Do something with the points
    new_points.append( (x,y,z) )

write:

new_points = [do_something_with(x, y, z) for (x, y, z) in surface.points]

or

# Can be used in Python 2.4-2.7 to reduce scope of variables.
new_points = list(do_something_with(x, y, z) for (x, y, z) in surface.points)
dan04
3

When switching to Python after years of C++, I have found it tempting to rely on __del__ to mimic RAII-type behavior, e.g. to close files or connections. However, there are situations (e.g. observer pattern as implemented by Rx) where the thing being observed maintains a reference to your object, keeping it alive! So, if you want to close the connection before it is terminated by the source, you won't get anywhere by trying to do that in __del__.

The following situation arises in UI programming:

class MyComponent(UiComponent):

    def add_view(self, model):
        view = TheView(model) # observes model
        self.children.append(view)

    def remove_view(self, index):
        del self.children[index] # model keeps the child alive
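A self-contained sketch of the problem just described, with hypothetical `Model` and `View` classes standing in for the Rx-style observer setup. Removing the view from the list of children does not trigger `__del__`, because the model's observer list still holds a bound method that references the view.

```python
class Model:
    def __init__(self):
        self.observers = []

    def subscribe(self, callback):
        self.observers.append(callback)


class View:
    closed = []  # names of views whose __del__ has run

    def __init__(self, model, name):
        self.name = name
        # The bound method keeps a reference to this view.
        model.subscribe(self.on_change)

    def on_change(self):
        pass

    def __del__(self):
        View.closed.append(self.name)


model = Model()
children = [View(model, "v1")]

del children[0]  # the model still references the view
closed_after_removal = list(View.closed)  # so nothing has been finalized
```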

So, here is a way to get RAII-type behavior: create a container with add and remove hooks:

import collections.abc

class ScopedList(collections.abc.MutableSequence):

    def __init__(self, iterable=(), add_hook=lambda i: None, del_hook=lambda i: None):
        self._items = list()
        self._add_hook = add_hook
        self._del_hook = del_hook
        self += iterable

    def __del__(self):
        del self[:]

    def __getitem__(self, index):
        return self._items[index]

    def __setitem__(self, index, item):
        self._del_hook(self._items[index])
        self._add_hook(item)
        self._items[index] = item

    def __delitem__(self, index):
        if isinstance(index, slice):
            for item in self._items[index]:
                self._del_hook(item)
        else:
            self._del_hook(self._items[index])
        del self._items[index]

    def __len__(self):
        return len(self._items)

    def __repr__(self):
        return "ScopedList({})".format(self._items)

    def insert(self, index, item):
        self._add_hook(item)
        self._items.insert(index, item)

If UiComponent.children is a ScopedList, which calls acquire and dispose methods on the children, you get the same guarantee of deterministic resource acquisition and disposal as you are used to in C++.
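A usage sketch under some assumptions: the `Child` class with `acquire` and `dispose` methods is hypothetical, and `ScopedList` is repeated in condensed form (without `__del__` and `__repr__`) only so the snippet runs on its own.

```python
import collections.abc


class ScopedList(collections.abc.MutableSequence):
    """Condensed copy of the container above, so this snippet is runnable."""

    def __init__(self, iterable=(), add_hook=lambda i: None, del_hook=lambda i: None):
        self._items = []
        self._add_hook = add_hook
        self._del_hook = del_hook
        self += iterable

    def __getitem__(self, index):
        return self._items[index]

    def __setitem__(self, index, item):
        self._del_hook(self._items[index])
        self._add_hook(item)
        self._items[index] = item

    def __delitem__(self, index):
        removed = self._items[index] if isinstance(index, slice) else [self._items[index]]
        for item in removed:
            self._del_hook(item)
        del self._items[index]

    def __len__(self):
        return len(self._items)

    def insert(self, index, item):
        self._add_hook(item)
        self._items.insert(index, item)


class Child:  # hypothetical resource-owning child
    def __init__(self, name):
        self.name = name
        self.state = "new"

    def acquire(self):
        self.state = "acquired"

    def dispose(self):
        self.state = "disposed"


children = ScopedList(add_hook=Child.acquire, del_hook=Child.dispose)
first, second = Child("first"), Child("second")
children.append(first)  # append() goes through insert(), so the add hook fires
children.append(second)
del children[0]  # the del hook fires now, whatever else references `first`
```

The disposal is tied to removal from the container rather than to the object's lifetime, which is why extra references held elsewhere no longer matter.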

Jonathan Zrake
2

Basically you are probably using the wrong language. If you want sane scoping rules and reliable destruction then stick with C++ or try Perl. The GC debate about when memory is released seems to miss the point. It's about releasing other resources like mutexes and file handles. I believe C# makes the distinction between a destructor that is called when the reference count goes to zero and when it decides to recycle the memory. People aren't that concerned about the memory recycling but do want to know as soon as it is no longer referenced. It's a pity, as Python had real potential as a language. But its unconventional scoping and unreliable destructors (or at least implementation-dependent ones) mean that one is denied the power you get with C++ and Perl.

Interesting the comment made about just using new memory if it's available rather than recycling old in GC. Isn't that just a fancy way of saying it leaks memory :-)

tony
  • 1
    Thanks for the answer, Tony. I agree that not supporting it in Python is unfortunate. Oddly, I don't see why it can't be supported. It would seem to just be syntactic sugar to turn a variable assignment into a try block where it deletes the name in the finally section. I have yet to hear an answer that convinces me that it is better as currently implemented in Python. – markets Aug 26 '11 at 17:33
  • 4
    "The GC debate about when memory is released seems to miss the point. It's about releasing other resources like mutexes and file handles. " - in C#, Python, JavaScript and Java, they have no automatic resource management, only **memory** GC. C++, PHP, Perl and even GNU C with attributes has RAII. I don't understand how to write _correct_ code on those GC languages. – Brian Cannard Sep 05 '14 at 10:31
1

As some of the comments have indicated, context managers (using with) are the way to have RAII in Python (although as many of the answers indicate, it's more about non-memory resources since GC takes care of memory).

Custom context managers can be implemented by defining __enter__ and __exit__ methods. For example, from https://dev.to/fronkan/comparing-c-raii-and-python-context-managers-50eg:

from tempfile import TemporaryFile


class PrintingTempFileContext:
    def __enter__(self):
        print("<Opening File>")
        self.file = TemporaryFile(mode="w+t")
        return self.file

    def __exit__(self, exception_type, exception_value, traceback):
        print(f"<Exception info: {exception_type=} - {exception_value=} - {traceback=}>")
        self.file.seek(0)
        print(self.file.read())
        self.file.close()
        print("<File closed>")
with PrintingTempFileContext() as tempfile:
    tempfile.write("Hello DEV!")
Noel Yap