Thursday, June 21, 2012

SPT #1: Automatically stripping debug code

"Stupid Python Tricks" explores ways to abuse Python features for fun and profit. This is the first in a series. Stupid Python Tricks are generally not best practices and may well be worst practices. Read at your own risk.

The standard Python interpreter ("CPython") has a command-line option, -O, that triggers "optimization." Currently, about the only real optimization it performs is to drop assert statements (along with any code guarded by if __debug__:). And they're not simply skipped at runtime; they are never even compiled into the bytecode executed by the Python virtual machine.
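If you'd like to see the stripping happen without restarting the interpreter, here is a minimal sketch in Python 3 syntax. It assumes the optimize argument of the built-in compile(), where level 1 corresponds to running with -O. The helper name runs_cleanly is made up for this illustration.

```python
# A deliberately failing assert, kept as source text so we can compile it twice.
source = "assert False, 'this assert was compiled in'"

def runs_cleanly(optimize):
    """Compile `source` at the given optimization level and report
    whether executing it completes without an AssertionError."""
    code = compile(source, "<demo>", "exec", optimize=optimize)
    try:
        exec(code, {})
    except AssertionError:
        return False
    return True

print(runs_cleanly(0))  # False: the assert is compiled in and fails
print(runs_cleanly(1))  # True: the assert was never compiled at all
```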

The value of this optimization is that you can sprinkle asserts liberally throughout your code to make sure it fails fast when something goes wrong, making it easier to debug. Yet, because all those statements are stripped out when Python is run in optimized mode, there's no performance penalty when the code is put into production.

Wouldn't it be great if you could strip all your debug code just as easily? Many of us write functions like this to let us easily turn off our debug messages:

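DEBUG = True  # flip to False for production

def debug_print(*args):
    if DEBUG:
        # Python 2 print statement; the trailing comma suppresses the newline
        for arg in args: print arg,
        print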

This can be optimized a bit to minimize overhead of the debug statements by checking the DEBUG flag only once and defining a do-nothing function when the flag isn't set:

def debug_print(*args):
    for arg in args: print arg,
    print

# Check the flag once, at definition time, and swap in a no-op if it is off
if not DEBUG: debug_print = lambda *args: None

But there are still all those function calls to the dummy function being executed when running in non-debug mode. Plus, of course, you still need to define the DEBUG variable. So running in production requires both that you change that variable and put -O on the command line, doubling your chances of getting it wrong.

How can we abuse Python's optimization to actually strip out the calls to our debug_print function? Simple: by writing it as an assertion. To avoid raising an AssertionError, of course, debug_print must always return True.

def debug_print(*args):
    for arg in args: print arg,
    print
    return True

assert debug_print("Checkpoint 1")

Now we just need to run our script with -O and all those debug_print calls will be stripped automatically, as if we had never written them. Their arguments are never evaluated either, since the whole assert statement is gone.
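To convince yourself that the call really disappears, here is a small sketch in Python 3 syntax. It compiles the same assert line at two optimization levels using compile()'s optimize argument (level 1 standing in for -O) and records every call in a list. The calls list is an addition for this demonstration and not part of the trick itself.

```python
calls = []  # records every invocation so we can count them afterwards

def debug_print(*args):
    calls.append(args)
    print(*args)
    return True  # keeps the assert from raising AssertionError

source = 'assert debug_print("Checkpoint 1")'

# Level 0 behaves like plain `python`, level 1 like `python -O`.
for level in (0, 1):
    code = compile(source, "<demo>", "exec", optimize=level)
    exec(code, {"debug_print": debug_print})

print(len(calls))  # 1: only the unoptimized run called debug_print
```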

If you're using Python 3 (or Python 2.6 or 2.7 with from __future__ import print_function), print is already a function. Seems like a waste to define a new debug_print in that case. But we need the result to be true, and print() returns None, which evaluates as false in a Boolean context. Well, you can just write one of the following. Each evaluates as true, so assert never sounds the alarm (the first only for functions that return None or another falsy value, the other two always).

assert not print("Checkpoint 1")
assert print("Checkpoint 1") or True
assert [print("Checkpoint 1")]

Of these three, the last is what elevates this trick to the height of stupidity. Exploiting the fact that Python considers any non-empty container True, we simply make a list containing the return value from the function we called. The resulting code looks more like an unfamiliar bit of Python syntax than a dirty hack, but a dirty hack it is.
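For the skeptical, here is a quick check of what each of the three forms evaluates to, sketched in Python 3. The helper truthiness is invented for this illustration; it takes any zero-argument function and reports the truth value that each assert form would see.

```python
def truthiness(func):
    """Truth value each of the three assert forms would see when calling func."""
    return (
        bool(not func()),       # assert not func()
        bool(func() or True),   # assert func() or True
        bool([func()]),         # assert [func()]
    )

# A function returning None, like print(): all three forms pass.
print(truthiness(lambda: None))  # (True, True, True)

# A function returning something truthy: the `not` form would now fail.
print(truthiness(lambda: 1))     # (False, True, True)
```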

By the way, this Stupid Python Trick obviously also works for calls to loggers or any other function you want to call in debug mode.
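As a rough sketch of the logger case (Python 3, standard logging module): Logger.debug() returns None, so the "assert not" form works. The ListHandler class and the "spt_demo" logger name are made up for this example, and compile()'s optimize argument again stands in for the -O switch.

```python
import logging

class ListHandler(logging.Handler):
    """Hypothetical handler that collects messages in a list for inspection."""
    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())

logger = logging.getLogger("spt_demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep the demo output out of the root logger
handler = ListHandler()
logger.addHandler(handler)

source = 'assert not logger.debug("Checkpoint 1")'

# Level 0 behaves like plain `python`, level 1 like `python -O`.
for level in (0, 1):
    exec(compile(source, "<demo>", "exec", optimize=level), {"logger": logger})

print(handler.messages)  # ['Checkpoint 1']: logged once, by the unoptimized run
```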

Why this is a bad idea: It's very specific to current CPython behavior, which could change in the future, and may not have the desired effects with other Python implementations like IronPython and Jython. (Although it probably wouldn't hurt anything except possibly performance.) Furthermore, it's not really asserting anything about the program (the truth value is guaranteed to be True after all), but rather using the assert statement for its secondary effects, damaging Python's generally excellent readability.
