Friday, November 8, 2013

SPT #2: One global to rule them all

"Stupid Python Tricks" explores ways to abuse Python features for fun and profit. Stupid Python Tricks may well be diametrically opposed to best practices. Don your peril-sensitive sunglasses before proceeding.

You know how when you have a lot of globals and it's a pain to modify them, because you have to declare the globals you want to modify in each function using the global statement? Yeah, me neither. After all, globals are a poor programming practice! And I never engage in those...

However, for educational porpoises, or at least didactic dolphins, here's a short hack that takes advantage of the fact that Python globals can be accessed, but not modified, without declaring them global. Basically, what we do is replace the built-in globals() function with an object that lets you read and set globals via attribute access. Instead of doing this:

def my_func(a, b, c):
    global ans
    ans = a + b + c

You can instead do:

import globals

def my_func(a, b, c):
    globals.ans = a + b + c

This works fine because you aren't re-binding any global names; instead, you are merely mutating an existing object. Or so Python believes... bwahahaha! While we didn't save any lines of code in this short example, imagine if we had dozens of functions that used globals. We'd save literally dozens of global statements! And we could dine on spaghetti code for weeks.
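For the skeptics, here's a minimal sketch of the rule being exploited. It's written in Python 3 syntax, and the names counter, state, and the three functions are made up for illustration; state is just a stand-in object, not the hack itself.

```python
from types import SimpleNamespace

counter = 0                      # an ordinary module-level global
state = SimpleNamespace(ans=0)   # hypothetical holder object, illustration only

def read_only():
    # Reading a global needs no declaration.
    return counter

def rebind_without_global():
    # Assignment makes 'counter' a local name here; the global is untouched.
    counter = 99
    return counter

def mutate_attribute(a, b, c):
    # No name is re-bound: we only set an attribute on an existing object.
    state.ans = a + b + c

local_value = rebind_without_global()
mutate_attribute(1, 2, 3)
print(local_value, read_only(), state.ans)
```

The attribute assignment goes through without a global statement, which is exactly the loophole the trick relies on.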

Calling this object, i.e. globals(), continues to return the module's globals dictionary, just like before, thanks to a __call__() method on the class.

Without further ado, then, here's the globals module:

from inspect import currentframe
from sys import modules

class Globals(object):

    def __getattribute__(self, key, currentframe=currentframe):
        try:
            return currentframe().f_back.f_globals[key]
        except KeyError:
            pass # if we raise NameError here we get 2 err msgs
        raise NameError("global name '%s' is not defined" % key)

    def __setattr__(self, key, value, currentframe=currentframe):
        currentframe().f_back.f_globals[key] = value

    def __call__(self, currentframe=currentframe):
        return currentframe().f_back.f_globals

globals = Globals()
modules[__name__] = globals   # so other modules will use it
import globals                # make sure that worked

# simple test
globals.answer = 42
assert globals.answer is answer

This short bit of code demonstrates more than one dirty, dirty hack. inspect.currentframe() is used to allow us to manipulate the globals of the module containing the function it's called from, rather than the globals of its own module. We assign into sys.modules to replace the globals module object with our Globals class instance so that you only need import globals, not from globals import globals. Since it's a functional replacement for the built-in globals() function, we could stick it into the __builtins__ namespace so other modules would get it even without importing it, but even I have my limits!
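If the frame trickery is the part that makes you squint, here's a stripped-down sketch of just that piece, in Python 3 syntax. The helper set_caller_global and the demo function are hypothetical names invented for this example; only inspect.currentframe(), f_back, and f_globals come from the code above.

```python
from inspect import currentframe

def set_caller_global(name, value):
    # Hypothetical helper, for illustration: f_back is the frame that called
    # us, and f_globals is that frame's module-level namespace.
    # Note: currentframe() may return None on interpreters without frame support.
    currentframe().f_back.f_globals[name] = value

def demo():
    # The helper writes into *our* module's globals, not its own.
    set_caller_global("answer", 42)
    return globals()["answer"]

result = demo()
print(result)
```

Here both functions live in the same module, so the distinction is invisible; it only matters once the helper is imported from somewhere else, which is the situation the Globals class is built for.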

Monday, March 18, 2013

IDTKAP #4: __debug__ and -O

Wayyyyy back in June 2012, I posted the first Stupid Python Trick, showing a way to abuse the assert statement to write debug code that is completely stripped when you run Python using the -O flag to enable optimization.

The -O flag is documented as removing assert instructions as though you never wrote them. But that's not all it does. It also sets a constant, __debug__, which is normally True, to False. The value of __debug__ is known at compile time, so Python can use it to completely discard conditional blocks predicated on __debug__, just as it does with assert statements, when running with the -O flag. And in fact, this is exactly what Python does!

The upshot is that you can write debug code that is stripped by Python's -O flag without abusing assert. This is easily demonstrated using Python's bytecode disassembler, dis.

from dis import dis

def debug_func():
    if __debug__:
       print "debugging"

def noop_func():
    pass

print "debug_func:"
dis(debug_func)
print "noop_func:"
dis(noop_func)

Save this to a file, then execute it with python -O. You'll see that the disassembly of the two functions is identical aside from the line number offsets. It's just as if we never even wrote the if statement and its subordinate print statement! Literally zero performance impact to production code.

  4           0 LOAD_CONST               0 (None)
              3 RETURN_VALUE

  2           0 LOAD_CONST               0 (None)
              3 RETURN_VALUE

What's more, when we run this script without the -O flag, Python optimizes away the test. That is, Python knows __debug__ is true at compile time, and so it just compiles the code inside the if statement as if it weren't inside an if statement! Here's what the disassembly of debug_func looks like when __debug__ is True (i.e., no -O flag is used):

 3           0 LOAD_CONST               1 ('debugging')
             3 PRINT_ITEM
             4 PRINT_NEWLINE

 4           5 LOAD_CONST               0 (None)
             8 RETURN_VALUE

By comparison, here's what it would look like if we were using some other conditional (say, a global variable called DEBUG). You'll see that this is much more complicated, and if you time it, you'll find that executing the test at runtime adds measurable overhead: a global lookup and a conditional jump on every call.

  2           0 LOAD_GLOBAL              0 (DEBUG)
              3 JUMP_IF_FALSE            9 (to 15)
              6 POP_TOP

  3           7 LOAD_CONST               1 ('debugging')
             10 PRINT_ITEM
             11 PRINT_NEWLINE
             12 JUMP_FORWARD             1 (to 16)
        >>   15 POP_TOP
        >>   16 LOAD_CONST               0 (None)
             19 RETURN_VALUE

So basically, Python will not only strip debugging code if it's conditionalized by testing __debug__, it will also slightly improve the performance of your debug code when running in debug mode compared to testing a runtime flag. And best of all, it does this magic using the same command line flag, -O, that strips assert statements! (For completeness, I should mention here that the PYTHONOPTIMIZE environment variable serves the same function as -O.)
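If you'd like to time it yourself, something along these lines should do. It's a rough sketch in Python 3 syntax; DEBUG and the two function names are made up for the comparison, and the numbers will vary with your machine and Python version, so none are quoted here.

```python
import timeit

DEBUG = False  # hypothetical home-made flag, looked up at run time

def with_builtin_flag():
    # The compiler resolves __debug__ ahead of time.
    if __debug__:
        pass

def with_custom_flag():
    # DEBUG has to be fetched and tested on every call.
    if DEBUG:
        pass

t_builtin = timeit.timeit(with_builtin_flag, number=200000)
t_custom = timeit.timeit(with_custom_flag, number=200000)
print("if __debug__: %.4f s" % t_builtin)
print("if DEBUG:     %.4f s" % t_custom)
```

Most of what gets measured is function call overhead, so expect the gap to be small in absolute terms.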

But wait, there's more! If you use an else clause with your if __debug__ statement, Python is smart enough to strip whichever clause doesn't apply and "inline" the clause that does!

def get_run_mode():
    if __debug__:
        return "debug"
    else:
        return "production"

dis(get_run_mode) running without -O:
  3           0 LOAD_CONST               1 ('debug')
              3 RETURN_VALUE

dis(get_run_mode) running with -O:
  5           0 LOAD_CONST               1 ('production')
              3 RETURN_VALUE
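If restarting the interpreter with and without -O is too much effort, Python 3's compile() accepts an optimize argument that should let you see both behaviors in one session. This is a sketch under that assumption (it does not apply to Python 2); SOURCE and run_mode_at are names invented for the example.

```python
SOURCE = '''
def get_run_mode():
    if __debug__:
        return "debug"
    else:
        return "production"
'''

def run_mode_at(optimize_level):
    # compile()'s optimize argument mimics the interpreter's -O level:
    # 0 leaves __debug__ true, 1 behaves like -O.
    namespace = {}
    code = compile(SOURCE, "<sketch>", "exec", optimize=optimize_level)
    exec(code, namespace)
    return namespace["get_run_mode"]()

mode_plain = run_mode_at(0)
mode_optimized = run_mode_at(1)
print(mode_plain, mode_optimized)
```

The same source yields a different function depending on the level it was compiled at, which is the point: the decision is made at compile time, not when the function runs.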

Once again, for comparison, here's how the bytecode looks when the function is written to force runtime evaluation of the condition, by using a global variable DEBUG instead of __debug__:

 2           0 LOAD_GLOBAL              0 (DEBUG)
             3 JUMP_IF_FALSE            5 (to 11)
             6 POP_TOP

 3           7 LOAD_CONST               1 ('debug')
            10 RETURN_VALUE
       >>   11 POP_TOP

 5          12 LOAD_CONST               2 ('production')
            15 RETURN_VALUE
            16 LOAD_CONST               0 (None)
            19 RETURN_VALUE

So, is Python smart enough to optimize if not __debug__ in the same way? Sadly, no:

def not_debug_test():
    if not __debug__:
        print "production"


>>> dis(not_debug_test)
  2           0 LOAD_GLOBAL              0 (__debug__)
              3 JUMP_IF_TRUE             9 (to 15)
              6 POP_TOP

  3           7 LOAD_CONST               1 ('production')
             10 PRINT_ITEM
             11 PRINT_NEWLINE
             12 JUMP_FORWARD             1 (to 16)
        >>   15 POP_TOP
        >>   16 LOAD_CONST               0 (None)
             19 RETURN_VALUE

So if you want to write code that's run only in production, don't use if not __debug__. Write it like this instead:

    if __debug__:
        pass
    else:
        print "production"

This is ugly, but arguably, it should be: you generally shouldn't write code that is only run in production, because it doesn't get tested.

What about conditional expressions, such as x = "yes" if __debug__ else "no"? Sadly, Python does not optimize these. Similarly, neither __debug__ and x nor __debug__ or x is optimized, though they could be.

So what did we learn?
  1. Use if __debug__ to write debug code (along with else if desired).
  2. Don't make up your own flag for this, as it will prevent Python from being clever.
  3. Don't use if not __debug__ because this will also prevent Python from being clever.
  4. Prefer if statements to using __debug__ in logical expressions.
  5. Use assert to assert invariants, not to perform stupid Python tricks like I presented last June.
  6. Use the -O command line flag (or PYTHONOPTIMIZE) to tell Python when it's running in production. If you don't, you may be executing debugging code you don't want, with the potential performance degradation that implies.
Thanks to Reddit user "brucifer", who posted this informative comment, and to user "Rainfly_X", who brought it to my attention.

By the way, in Python 3.x, True and False are also constants whose value is known at compile-time, and Python optimizes if True and if False similarly. In Python 2.x, the values of True and False can be changed at run time (seriously, try it if you don't believe me!), so this optimization isn't possible. None can't be changed in Python 2.x, but is only a true compile-time constant in Python 3.x, with the upshot that code under if None is also subject to being stripped out in Python 3.x but not in Python 2.x.
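Here's a small sketch of that last point, assuming a reasonably recent CPython 3. The function always and the variable names are invented for illustration, and the exact bytecode differs between versions, so the check only looks for the absence of any jump instruction rather than a specific listing.

```python
import dis

def always():
    if False:
        return "never"
    return "always"

# If the dead branch was stripped at compile time, no jump is needed.
opnames = [ins.opname for ins in dis.get_instructions(always)]
has_jump = any("JUMP" in name for name in opnames)

# In Python 3, True is a keyword, so re-binding it is a syntax error.
try:
    compile("True = 0", "<sketch>", "exec")
    can_rebind_true = True
except SyntaxError:
    can_rebind_true = False

print(opnames, has_jump, can_rebind_true)
```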