Friday, June 22, 2012

PF #1: Truth and Consequences

Intended for those new to the language but not to programming, "Python Foundations" surveys the parts of Python that you need to know about to take full advantage of Python's power. First in a series.

Many programming languages have somewhat loose concepts of truth and falsity. Objects can have a truth value even if they're not Booleans. In C, truth is canonically represented as 1 (the value that comparison and logical operators produce), but any non-zero value is considered true in a Boolean context such as an if statement.

Python takes this a step further: in addition to numeric zero, it considers empty containers (including strings) to be false. The constant None is also considered false. Other objects are generally considered true, although this can be overridden (we'll discuss how in a moment). This property is useful for making code like this more readable:

name = raw_input("What is your name? ")
while not name:  # instead of while name == ""
    name = raw_input("Seriously, what's your name? ")

Since truth values are a little flexible, Python programmers have adopted the terms truthy and falsy (or falsey) to refer to an object's implicit truth value when used in a Boolean context. (I hasten to add that I don't believe these terms were coined in the Python community.)

In other words, the list [1, 2, 3] is not literally equal to the constant True, but it is truthy because if you tested it with an if statement, that if statement's body would be executed.

Instances of classes are generally truthy unless they are derived from a class that has some other built-in behavior (for example, a list, which, remember, is truthy when it contains any items). Functions, classes, iterators/generators, and modules are also truthy.
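
As a quick check of the rules so far, the sketch below runs a handful of values through the built-in bool(), which reports the truth value an if statement would see. The Plain class and the shout function are made-up names for illustration, and the code is written so that it should run unchanged on Python 2 and Python 3.

```python
import sys  # any module will do; modules are always truthy


class Plain(object):
    """A class with no special methods that affect truthiness."""


def shout(text):
    return text.upper()


# None, numeric zero, and empty containers (including strings) are falsy.
falsy_values = [None, 0, 0.0, "", [], (), {}, set()]

# Non-empty containers are truthy regardless of what they contain, as are
# classes, plain instances, functions, modules, and iterators.
truthy_values = [1, -1, "0", " ", [0], (None,), {"k": None},
                 Plain, Plain(), shout, sys, iter([])]

for value in falsy_values:
    assert not bool(value)

for value in truthy_values:
    assert bool(value)

# Truthy is not the same thing as equal to True.
assert [1, 2, 3] != True
assert bool([1, 2, 3]) is True
```

Note that "0" and [0] are truthy: only the emptiness of the container matters, not the truth value of its contents.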

You can override the implicit truth value of your own classes by defining either a __len__() or __nonzero__()* special method. If your class has a __len__() method, it is probably a container, and Python will treat its instances like one: false when its length is zero and true when its length is nonzero. The __nonzero__() method is more explicit and can indicate the instance's truth value even for non-container classes. If a class has both of these methods, __nonzero__() takes precedence.
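
Here is a minimal sketch of those rules. The class names Basket, Switch, and StubbornBasket are invented for the example, and each explicit truth-value method is defined as __bool__ with a __nonzero__ alias so that the same code should run on both Python 2 and Python 3 (see the footnote).

```python
class Basket(object):
    """Container-like: the truth value follows __len__()."""

    def __init__(self, items=()):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


class Switch(object):
    """Not a container: the truth value comes from an explicit method."""

    def __init__(self, on):
        self.on = on

    def __bool__(self):          # Python 3 name
        return self.on

    __nonzero__ = __bool__       # Python 2 name


class StubbornBasket(Basket):
    """Has both methods; the explicit one takes precedence over __len__()."""

    def __bool__(self):
        return True

    __nonzero__ = __bool__


assert not Basket()                  # length 0, so falsy
assert Basket(["egg"])               # length 1, so truthy
assert Switch(True)
assert not Switch(False)
assert len(StubbornBasket()) == 0    # empty...
assert StubbornBasket()              # ...but still truthy
```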

Here is a list subclass that is always truthy, even when empty:

class truthylist(list):
    def __nonzero__(self):
        return True  # always

When would you want to use such a list? Well, consider a situation in which you are reading records from a database or elements from an XML file and will return a list of them. If an error occurs, you have two choices: raise an exception or return an error code. In some scenarios, it's even convenient to just return an empty list on error, since then you can use the same code path to iterate over it whether there was an error or not. But then you don't know why you got the empty list: was it because there was an error, or because there was no data of the type you requested? The truthy list gives us a solution.

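def getrecords(key):
    try:
        result = ...  # get the records here
        # success with no data: return an empty list that is still truthy
        return result if result else truthylist()
    except Exception:
        # failure: return an ordinary empty list, which is falsy
        return []
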
Now, when we call this function, we can just check to see if the result is truthy. If it is, we successfully retrieved the records (even if there are none). At the same time, we retain the ability to iterate over the records without regard for what happened, if that's what we want to do.

records = getrecords("DNA")
for record in records: print record
if records == []: print "No records found",
if not records: print "due to error",
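
To see the three outcomes side by side, here is a self-contained sketch of the same idea. The FAKE_DB dictionary, the describe() helper, and the choice to treat a missing key as the error case are all assumptions made up for the demonstration, standing in for a real database or XML source. The code is written so that it should run on both Python 2 and Python 3.

```python
class truthylist(list):
    def __bool__(self):          # Python 3 name
        return True              # always

    __nonzero__ = __bool__       # Python 2 name


# Hypothetical stand-in for a real data source.
FAKE_DB = {"DNA": ["A", "C", "G", "T"], "protein": []}


def getrecords(key):
    try:
        result = FAKE_DB[key]    # raises KeyError for an unknown key
        return result if result else truthylist()
    except Exception:
        return []


def describe(records):
    """Classify a getrecords() result by its truth value and length."""
    if not records:
        return "error"
    return "%d record(s)" % len(records)


for key in ("DNA", "protein", "RNA"):
    print("%s: %s" % (key, describe(getrecords(key))))
```

An empty truthylist still compares equal to [] and iterates zero times, so code that simply loops over the result does not need to know the difference.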

Admittedly, this verges on a Stupid Python Trick. The "empty container is falsy" convention is so ingrained that other Python programmers will find truthylist more than a little odd.

In the next installment of Python Foundations, we'll look at Python's logical operators and how implicit truth values interact with them.

*In Python 3, the __nonzero__() special method was renamed __bool__() to better match the other type-coercion methods such as __str__() and __int__().
