Many functions in the C API for Python are not safe to use if the error indicator might be set. In particular, PyFloat_AsDouble and similar functions are ambiguous in that they have no return value reserved for indicating an error: if they succeed but happen to return the value used for errors (-1.0 in this case), a client that calls PyErr_Occurred will believe them to have failed if the error indicator was simply already set. (Note that this is more or less guaranteed to happen with PyIter_Next, which returns NULL both on error and on normal exhaustion.) More generally, any function that can fail overwrites the error indicator when it does, which may or may not be desirable.
Unfortunately, calling such functions with the error indicator set is not at all unlikely: a common reaction to an error is to Py_DECREF local variables, and (unless the types of all objects that might be (indirectly) freed by it are known) that can execute arbitrary code. (This is a good example of the danger of having cleanup code that can itself fail.) The interpreter catches exceptions raised in such destructors, but it does not prevent exceptions from leaking into them.
At either end, we can use PyErr_Fetch and PyErr_Restore to prevent these issues. Put around a call to an ambiguous function, they allow reliably determining whether it succeeded; put around Py_DECREF, they prevent the error indicator from being set during the execution of the susceptible code in the first place. (They can also be used around directly invoked cleanup code that might fail, so as to allow choosing which exception to propagate. There is no question about where to put them in that case: the cleanup code can’t choose between multiple exceptions anyway.)
Either choice of placement significantly increases code complexity and execution time: there are a lot of calls to ambiguous functions, and there are a lot of Py_DECREFs on error-handling paths. While the principle of defensive programming would suggest using it in both places, much nicer code would result from careful programming with a universal convention (one that covers the arbitrary code being executed).
C itself has such a convention: errno must be saved by the caller of arbitrary code even if (like the suppressed exceptions in Python destructors) that code is not expected to set errno to anything. The main reason is that many library calls may set errno (though never to 0) even when they succeed (because they handle errors internally), further narrowing the set of operations that are safe to perform while errno holds a significant value. (This also avoids the analogue of PyErr_Occurred reporting a preexisting error: C programmers must set errno to 0 before calling an ambiguous function.) Another reason is that “call some arbitrary code with no error reporting” is not a common operation in most C programs, so burdening other code for its sake would be nonsensical.
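For comparison, both halves of the errno discipline in plain C (parse_long and cleanup_preserving_errno are illustrative names; strtol is a standard ambiguous function, since it reports overflow only through errno):

```c
#include <errno.h>
#include <stdlib.h>

/* Half 1: zero errno before an ambiguous call, then check it. */
static int parse_long(const char *s, long *out)
{
    char *end;
    errno = 0;                       /* clear before the ambiguous call */
    long v = strtol(s, &end, 10);
    if (errno == ERANGE || end == s || *end != '\0')
        return 0;                    /* overflow, or not a pure number */
    *out = v;
    return 1;
}

/* Half 2: save and restore errno around code that may clobber it
 * even on success. */
static void cleanup_preserving_errno(void)
{
    int saved_errno = errno;         /* save... */
    free(malloc(16));                /* arbitrary code; may touch errno */
    errno = saved_errno;             /* ...and restore */
}
```

This is the mirror image of the PyErr_Fetch/PyErr_Restore placement question: C resolved it by convention rather than by API.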
Is there such a convention (even if there is buggy code in CPython itself that doesn’t follow it)? Failing that, is there a technical reason to guide the choice of one to establish? Or is this perhaps an engineering problem born of too literal a reading of “arbitrary”: should CPython itself save and restore the error indicator while it’s handling destructor exceptions anyway?