Tuesday, March 3, 2015

SPT #4: There is no try

"Stupid Python Tricks" explores sexy ways to whip Python features until they beg for mercy. Stupid Python Tricks may well be fifty different shades of gray. Reader discretion is advised.

I've recently been wishing that Python's set type had a method like add() but which returned a Boolean indicating whether the item being added was already in the set. We could add this behavior to the add() method without fear of breaking much code, since most uses will ignore the return value, but I'd rather keep add() as fast as possible. So let's call this new method added() and have it return True if the item needed to be added, and False otherwise. You can derive a new class from set, so let's go ahead and do that:

class Set(set):
   def added(self, item):
       result = item not in self   # True if item needs to be added
       self.add(item)
       return result

Note our self.add() call here is not conditional in any way; it doesn't need to be. set.add() is idempotent: adding the same item multiple times doesn't hurt anything, and it's actually faster to do the add() even if it's not necessary (since that stays in the fast C implementation of the set type) than to try to avoid the unnecessary add() with an if statement.
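If you'd like to check that speed claim on your own machine, here is a rough sketch using timeit. The class names AlwaysAdd and AddIfMissing, the run() helper, and the test data are all made up for this illustration, and the numbers will vary with your Python version and with how many duplicates your data contains, so treat the output as a hint rather than a verdict.

```python
import timeit

class AlwaysAdd(set):
    def added(self, item):
        result = item not in self
        self.add(item)            # unconditional, as in the version above
        return result

class AddIfMissing(set):
    def added(self, item):
        result = item not in self
        if result:                # only add when the item is new
            self.add(item)
        return result

def run(cls, items):
    """Deduplicate items with a fresh instance of cls, keeping order."""
    seen = cls()
    return [x for x in items if seen.added(x)]

# Hypothetical workload: 10,000 items drawn from only 100 distinct values.
items = [i % 100 for i in range(10000)]

for cls in (AlwaysAdd, AddIfMissing):
    seconds = timeit.timeit(lambda: run(cls, items), number=20)
    print("{}: {:.3f}s".format(cls.__name__, seconds))
```

Both classes give identical results; only the timing differs, so try a few different mixes of new and repeated items before drawing conclusions.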

Our new method is convenient for deduplicating lists while retaining the order of their items:

pies = ["apple", "banana cream", "apple", "boysenberry",
        "apple", "pumpkin", "banana cream"]
seen = Set()  # keeps track of pies we have already seen
pies[:] = (pie for pie in pies if seen.added(pie))
print(pies)

Result: ['apple', 'banana cream', 'boysenberry', 'pumpkin']

Be right back; I'm hungry for pie now.

OK. Our added() method works fine. There's nothing wrong with it. But doesn't it seem a little... inelegant... to have to store the result of the set membership test in a local variable, add the new item, and finally return the value we previously squirreled away? Why can't we simply return the result, and then do the add? Because the return ends the function's execution? Don't be silly; we won't let that stop us!

class Set(set):
   def added(self, item):
       try:     return item not in self
       finally: self.add(item)
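Why does this work? The expression after return is evaluated first, then the finally clause runs, and only then does the function actually hand back the value it already computed. Here is a small sketch of that ordering, with the class repeated so the snippet runs on its own; the variable names first and second are just for illustration.

```python
class Set(set):
    def added(self, item):
        try:
            return item not in self   # evaluated before the finally clause runs
        finally:
            self.add(item)            # runs on the way out, after the test above

s = Set()
first = s.added("apple")     # "apple" was not in the set when the test ran
second = s.added("apple")    # now it is, thanks to the finally clause
print(first, second, sorted(s))   # True False ['apple']
```

The membership test sees the set as it was before the add, which is exactly the behavior we wanted from added().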

Not only is this an unconventional use of try, it's also a wee bit slower than our earlier version. And that's why we call it "Stupid Python Tricks."