An interesting Python descriptor quirk

Tue 04 November 2014

You know the cached property technique, right? It looks like this:

class cached_property(object):
    def __init__(self, func):
        self.func = func

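    def __get__(self, obj, cls):
        if obj is None:
            # Accessed on the class itself (e.g. Shape.area): return the descriptor.
            return self
        # Store the result in the instance __dict__ under the function's name,
        # so later lookups find it there and never reach this descriptor again.
        value = obj.__dict__[self.func.__name__] = self.func(obj)
        return value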

Objects that implement __get__, __set__ or __delete__ are called descriptors. If you assign a descriptor to a class attribute, then looking that attribute up on an instance calls the descriptor's __get__, and assigning to it calls the descriptor's __set__.
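To make that concrete, here is a minimal sketch. The Logged and Thing names are made up for illustration; the descriptor just records which of its hooks were triggered:

```python
class Logged(object):
    """Hypothetical descriptor that records every access it intercepts."""

    def __init__(self):
        self.calls = []

    def __get__(self, obj, cls):
        self.calls.append('get')
        return 42

    def __set__(self, obj, value):
        self.calls.append('set')


class Thing(object):
    attr = Logged()  # the descriptor lives on the class, not on the instance


thing = Thing()
thing.attr       # attribute lookup     -> Logged.__get__
thing.attr = 1   # attribute assignment -> Logged.__set__

# Going through Thing.__dict__ fetches the descriptor itself, without calling __get__.
calls = Thing.__dict__['attr'].calls
print(calls)  # ['get', 'set']
```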

The cached_property above is very popular because, after the first access, the value is served straight from the instance's __dict__ with no function call involved - making it quite fast on CPython.

You'd use it like this:

class Shape(object):

    @cached_property
    def area(self):
        # compute value
        return value
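Here is a runnable version of the same idea, as a sketch. Square and its computed counter are invented for the demo, and cached_property is repeated (with a small guard for class-level access) so the snippet runs on its own:

```python
class cached_property(object):
    def __init__(self, func):
        self.func = func

    def __get__(self, obj, cls):
        if obj is None:  # accessed on the class itself
            return self
        value = obj.__dict__[self.func.__name__] = self.func(obj)
        return value


class Square(object):  # hypothetical example class
    def __init__(self, side):
        self.side = side
        self.computed = 0  # counts how often area is actually computed

    @cached_property
    def area(self):
        self.computed += 1
        return self.side ** 2


square = Square(3)
first = square.area    # runs the function, stores the result in square.__dict__
second = square.area   # served from square.__dict__, no function call
print(first, second, square.computed)  # 9 9 1
```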

So armed with knowledge of that technique, I was doing something like this in my code:

class Shape(object):

    @property
    def area(self):
        # compute value
        self.__dict__['area'] = value
        return value

To my bewilderment that didn't work - every lookup called the function. It looks like it's equivalent to cached_property - what makes it different?
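The symptom is easy to reproduce. In this sketch (Square and the computed counter are again made up for the demo) the value does land in __dict__, yet the function still runs on every lookup:

```python
class Square(object):  # hypothetical example class
    def __init__(self, side):
        self.side = side
        self.computed = 0  # counts how often area is actually computed

    @property
    def area(self):
        self.computed += 1
        value = self.side ** 2
        self.__dict__['area'] = value  # stored, but never consulted on lookup
        return value


square = Square(3)
square.area
square.area
print(square.computed)            # 2 - the function ran on both lookups
print('area' in square.__dict__)  # True - the stored value is there all the same
```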

After some digging in CPython sources it turns out that objects use PyObject_GenericGetAttr as the default __getattribute__ (the tp_getattro slot). That in turn calls _PyObject_GenericGetAttrWithDict, which uses the PyDescr_IsData macro to decide whether a descriptor found on the class takes priority over the instance's __dict__.

The macro:

#define PyDescr_IsData(d) (Py_TYPE(d)->tp_descr_set != NULL)

tp_descr_set is the slot for __set__ (it is shared with __delete__). This means that if the descriptor doesn't have a __set__ method then attributes are first looked up in the instance's __dict__.
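The resulting lookup order can be approximated in pure Python. This is only a rough sketch, not CPython's actual code: generic_getattr, class_lookup and Demo are hypothetical names, and it ignores __slots__, the __getattr__ fallback and metaclasses.

```python
_MISSING = object()


def class_lookup(cls, name):
    """Find the raw class attribute along the MRO, without invoking descriptors."""
    for base in cls.__mro__:
        if name in vars(base):
            return vars(base)[name]
    return _MISSING


def generic_getattr(obj, name):
    cls = type(obj)
    attr = class_lookup(cls, name)
    attr_type = type(attr)
    has_get = attr is not _MISSING and hasattr(attr_type, '__get__')
    # Python-level stand-in for PyDescr_IsData: is the tp_descr_set slot filled?
    is_data = hasattr(attr_type, '__set__') or hasattr(attr_type, '__delete__')

    # 1. Data descriptors on the class win over the instance __dict__.
    if has_get and is_data:
        return attr_type.__get__(attr, obj, cls)
    # 2. Otherwise the instance __dict__ is consulted first.
    instance_dict = getattr(obj, '__dict__', {})
    if name in instance_dict:
        return instance_dict[name]
    # 3. Then non-data descriptors and plain class attributes.
    if has_get:
        return attr_type.__get__(attr, obj, cls)
    if attr is not _MISSING:
        return attr
    raise AttributeError(name)


class Demo(object):
    @property
    def data(self):      # property fills tp_descr_set: a data descriptor
        return 'from descriptor'

    def non_data(self):  # plain functions only define __get__
        return 'from descriptor'


demo = Demo()
demo.__dict__['data'] = 'from __dict__'
demo.__dict__['non_data'] = 'from __dict__'
print(generic_getattr(demo, 'data'))      # from descriptor
print(generic_getattr(demo, 'non_data'))  # from __dict__
```

Both results match what the real demo.data and demo.non_data lookups return.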

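Interestingly enough, it turns out this was documented all along, albeit not very prominently. There are data descriptors (__set__ is implemented) and non-data descriptors (no __set__). A property is a data descriptor even when it has no setter, so it wins over whatever is stored in the instance's __dict__, while cached_property only defines __get__ and is therefore a non-data descriptor.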

This allows you to do the cached_property trick above. It also allows you to override methods on the instance (think monkey-patching) as functions are actually non-data descriptors.
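A small sketch of the monkey-patching case (Greeter is a hypothetical class). Note that a function assigned on the instance is not bound, so the replacement takes no self:

```python
class Greeter(object):  # hypothetical example class
    def greet(self):
        return 'hello'


patched = Greeter()
untouched = Greeter()

# The assignment lands in patched.__dict__ and shadows the method on the class,
# because plain functions define __get__ but not __set__.
patched.greet = lambda: 'patched'

print(patched.greet())    # patched
print(untouched.greet())  # hello
print(hasattr(type(Greeter.__dict__['greet']), '__set__'))  # False
```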

Further rationale can be found in PEP 252 (search for "data descriptors").
