Nikolaus Rath's Website

On the Beauty of Python's ExitStack

I believe Python's ExitStack feature does not get the recognition it deserves. I think part of the reason for this is that its documentation is somewhere deep down in the (already obscure) contextlib module because formally ExitStack is just one of many available context managers for Python's with statement. But ExitStack deserves far more prominent notice than that. This post will hopefully help with that.

So what makes ExitStack so important? In short, it's the best way to handle allocation and release of external resources in Python.

The Problem

The main challenge with external resources is that you have to release them when you don't need them anymore -- and in particular you must not forget to do so in all the alternate execution paths that may be entered in case of error conditions.

Most languages implement error conditions as "exceptions" that can be "caught" and handled (Python, Java, C++), or as special return values that you need to check to determine if an error occurred (C, Rust, Go). Typically, code that needs to acquire and release external resources then looks like this:

res1 = acquire_resource_one()
try:
    # do stuff with res1
    res2 = acquire_resource_two()
    try:
        # do stuff with res1 and res2
    finally:
        release_resource(res2)
finally:
    release_resource(res1)

or, if the language doesn't have exceptions:

res1 = acquire_resource_one();
if(res1 == -1) {
   return -1;  // nothing acquired yet, so nothing to release
}
// do stuff with res1
res2 = acquire_resource_two();
if(res2 == -1) {
   retval = -2;
   goto error_out1;  // only res1 was acquired
}
// do stuff with res1 and res2 (on error: set retval, goto error_out2)
retval = 0; // ok

error_out2:
  release_resource(res2);
error_out1:
  release_resource(res1);
return retval;

This approach has three big problems:

  1. The cleanup code is far away from the allocation code.
  2. When the number of resources increases, indentation levels (or jump labels) accumulate, making things hard to read.
  3. Managing a dynamic number of resources this way is impossible.

In Python, some of these issues can be alleviated by using the with statement:

import contextlib

@contextlib.contextmanager
def my_resource(id_):
    res = acquire_resource(id_)
    try:
        yield res
    finally:
        release_resource(res)

with my_resource(RES_ONE) as res1, \
     my_resource(RES_TWO) as res2:
    # do stuff with res1
    # do stuff with res1 and res2

However, this solution is far from optimal. You need to implement resource-specific context managers (note that in the above example we silently assumed that both resources can be acquired by the same function). You can get rid of the extra indentation only if you allocate all the resources at the same time and live with an ugly continuation line (parentheses around the list of context managers are only accepted since Python 3.10). And you still need to know the number of required resources ahead of time.

Over in the world of exception-less programming languages (no pun intended), Go has developed a different remedy: the defer statement defers execution of a function call until the enclosing function returns. Using defer, the above example can be written as:

res1 := acquire_resource_one()
if res1 == nil {
    return -1
}
defer release_resource(res1)
// do stuff with res1
res2 := acquire_resource_two()
if res2 == nil {
    return -2
}
defer release_resource(res2)
// do stuff with res1 and res2
return 0

This is pretty nice: allocation and cleanup are kept close together, no extra indentation or jump labels are required, and converting this to a loop that dynamically acquires multiple resources would be straightforward. But there are still some drawbacks:

  • To control when exactly a group of resources is getting released you have to factor out into separate functions all parts of code that access the respective resources.
  • You cannot "cancel" a deferred expression, so there is no way to e.g. return a resource to the caller if no error occurred.
  • There is no way to handle errors from the cleanup functions.
  • defer is available in Go, but not in Python.

ExitStack to the rescue

ExitStack fixes all of the above issues and adds some benefits on top. An ExitStack is (as the name suggests) a stack of clean-up functions. Adding a callback to the stack is the equivalent of using Go's defer statement. However, clean-up functions are not executed when the function returns, but when execution leaves the with block, and until then the stack can also be emptied again (with pop_all()).

Finally, clean-up functions themselves may raise exceptions without affecting execution of other clean-up functions. Even if multiple clean-ups raise exceptions, you will get a usable stacktrace.
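To make the behaviour of failing clean-ups concrete, here is a minimal sketch. The cleanup function and the names "first", "second" and "third" are made up for illustration; only ExitStack and its callback method come from contextlib.

```python
from contextlib import ExitStack

log = []

def cleanup(name, fail=False):
    # Hypothetical clean-up function: records that it ran, optionally fails.
    log.append(name)
    if fail:
        raise RuntimeError(f"cleanup of {name} failed")

def run():
    try:
        with ExitStack() as cm:
            cm.callback(cleanup, "first")
            cm.callback(cleanup, "second", fail=True)
            cm.callback(cleanup, "third", fail=True)
    except RuntimeError as exc:
        return exc
    return None

exc = run()

# Clean-ups run in reverse order of registration, and all of them run
# even though two of them raised.
print(log)

# The exception that propagates is the one raised last; the earlier one
# is preserved as its __context__, so the traceback shows both.
print(repr(exc), repr(exc.__context__))
```

This mirrors what nested try/finally blocks would do: a later failure does not swallow an earlier one, it chains onto it.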

Here's how to acquire multiple resources:

from contextlib import ExitStack

with ExitStack() as cm:
    res1 = acquire_resource_one()
    cm.callback(release_resource, res1)
    # do stuff with res1
    res2 = acquire_resource_two()
    cm.callback(release_resource, res2)
    # do stuff with res1 and res2

Note that

  • acquisition and release are close to each other,
  • there's no extra indentation, and
  • the pattern easily scales up to many resources (including a dynamic number that's acquired in a loop).
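The dynamic case could look roughly like the following sketch. Here acquire_resource and release_resource are stand-ins that I made up so the example runs on its own; in real code they would be whatever functions manage your resource.

```python
from contextlib import ExitStack

released = []

def acquire_resource(id_):
    # Hypothetical stand-in for a real acquisition function.
    return f"resource-{id_}"

def release_resource(res):
    # Hypothetical stand-in that records the order of releases.
    released.append(res)

def use_resources(ids):
    with ExitStack() as cm:
        resources = []
        for id_ in ids:
            res = acquire_resource(id_)
            # Register the release right after the acquisition succeeded.
            cm.callback(release_resource, res)
            resources.append(res)
        # do stuff with all resources
        return list(resources)

acquired = use_resources([1, 2, 3])
print(acquired)
print(released)  # released in reverse order of acquisition
```

The number of resources is only known at run time, yet there is a single with block and a single level of indentation. If an acquisition fails halfway through the loop, everything acquired up to that point is still released.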

If there already is a context manager for your resource, there's also a shortcut method, enter_context:

with ExitStack() as cm:
    res1 = cm.enter_context(open('first_file', 'r'))
    # do stuff with res1
    res2 = cm.enter_context(open('second_file', 'r'))
    # do stuff with res1 and res2

To open a bunch of files and return them to the caller (without leaking already opened files if a subsequent open fails):

def open_files(filelist):
    fhs = []
    with ExitStack() as cm:
        for name in filelist:
            fhs.append(cm.enter_context(open(name, 'r')))
        cm.pop_all()
        return fhs
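One possible variation, sketched below, uses the fact that pop_all() returns a new ExitStack holding the transferred clean-ups. Returning that stack alongside the file handles lets the caller close everything with a single call. The temporary files and the name closer are only there to make the sketch runnable.

```python
import os
import tempfile
from contextlib import ExitStack

def open_files(filelist):
    with ExitStack() as cm:
        fhs = [cm.enter_context(open(name, 'r')) for name in filelist]
        # pop_all() moves the registered clean-ups to a new ExitStack,
        # so leaving this with block no longer closes the files.
        return fhs, cm.pop_all()

with tempfile.TemporaryDirectory() as tmpdir:
    names = []
    for i in range(2):
        path = os.path.join(tmpdir, f"file{i}.txt")
        with open(path, "w") as fh:
            fh.write(f"content {i}")
        names.append(path)

    fhs, closer = open_files(names)
    contents = [fh.read() for fh in fhs]
    still_open = [not fh.closed for fh in fhs]
    closer.close()  # closes all files that open_files handed out
    all_closed = all(fh.closed for fh in fhs)

    # If a later open() fails, the files opened before it are closed
    # again by the ExitStack inside open_files, and the error propagates.
    try:
        open_files(names + [os.path.join(tmpdir, "missing.txt")])
        failed = False
    except FileNotFoundError:
        failed = True
```

The returned stack can also be used in a with statement by the caller, which is often more convenient than calling close() by hand.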

Disclaimer: the original idea for ExitStack came from me.
