A lazy object proxy is an object that wraps a callable but defers the call until the object is actually required, and caches the result of said call.
These kinds of objects are useful in resolving various dependency issues; a few examples:
- Objects that need to hold circular references to each other, but at different stages. To instantiate object Foo you need an instance of Bar, while an instance of Bar needs an instance of Foo in some of its methods (but not at construction). Circular imports sound familiar? (See the sketch a bit further down.)
- Performance-sensitive code. You don't know ahead of time what you're going to use, but you don't want to pay for allocating all the resources upfront, as you usually need just a few of them.
There are other examples, I've just made up a couple for context.
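To make the first example concrete, here's a contrived sketch (it uses the lazy_object_proxy library discussed further down, but any lazy proxy would do): Bar gets a proxy for a Foo that doesn't exist yet, and the proxy only resolves when a method actually needs it.

import lazy_object_proxy

class Bar:
    def __init__(self, foo):
        self.foo = foo                       # just stored, not used at construction

    def describe(self):
        return "Bar of %s" % self.foo.name   # Foo is only needed here

class Foo:
    def __init__(self, bar):
        self.name = "foo"
        self.bar = bar

bar = Bar(lazy_object_proxy.Proxy(lambda: foo))  # foo isn't defined yet
foo = Foo(bar)
print(bar.describe())                            # the proxy resolves foo only now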
If you've used Django you may be familiar with SimpleLazyObject. For simple use-cases it's fine, and if you're already using Django the choice is obvious. Unfortunately it's missing many magic methods; the most glaring omissions are __iter__, __getslice__, __call__ etc. It's not too bad, you can just subclass it and add them yourself.
But what if you need to have __getattr__? The horrors of the infinite recursive call beckon.
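Here's a contrived sketch of that trap (not Django's code): a naive lazy object whose __getattr__ reads self._wrapped with plain attribute access recurses into itself, because _wrapped isn't set yet and its lookup goes right back through __getattr__.

class NaiveLazy(object):
    def __init__(self, factory):
        self._factory = factory      # found by normal lookup, so no recursion here

    def __getattr__(self, name):
        # _wrapped was never assigned, so this lookup fails and calls
        # __getattr__('_wrapped'), which does the same thing again, forever
        if self._wrapped is None:
            self._wrapped = self._factory()
        return getattr(self._wrapped, name)

# NaiveLazy(lambda: 'foobar').upper()  # blows the stack with infinite recursion

The usual escape hatches are pre-setting the attribute in __init__, peeking into self.__dict__ directly, or going through object.__getattribute__.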
Meanwhile I've noticed that wrapt has a quite complete object proxy. Unfortunately it's not really amenable to adding lazy behavior in a subclass, due to the C extension (I wouldn't make bets on subclassing the pure-Python proxy implementation either without some unwanted overhead :-).
Thus I forked the code and changed everything to have the lazy behavior. You can see the results here: https://github.com/ionelmc/python-lazy-object-proxy
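A minimal usage sketch, to show what "lazy" means here: the factory runs on first use and its result is cached.

import lazy_object_proxy

def expensive_factory():
    print("computing ...")
    return 'foobar'

obj = lazy_object_proxy.Proxy(expensive_factory)
print("proxy created, nothing computed yet")
print(obj.upper())   # "computing ..." is printed here, then "FOOBAR"
print(obj.upper())   # cached: the factory isn't called a second time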
Part of that is a C extension packaging exercise but that's for another blog-post [2].
I've also done some benchmarks (with pytest-benchmark) [1]:
-- benchmark: min 5 rounds (of min 25.00us), 30.00s max time, timer: time.perf_counter --
Name (time in ns)            Min          Max       Mean     StdDev   Rounds  Iterations
-----------------------------------------------------------------------------------------
test_perf[slots]        606.8182   26084.0909   627.7139    89.5553  1111112          44
test_perf[cext]          84.7701    2830.4598    86.2741     9.6827  1006712         348
test_perf[simple]       328.9474   11456.5790   334.8236    41.8470  1195220          76
test_perf[django]       409.5238   17969.8413   417.4172    49.9735  1158302          63
test_perf[objproxies]   880.0000   31256.6666   923.1323   106.3637  1111112          30
-----------------------------------------------------------------------------------------
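The real benchmark suite lives in the repository; a rough sketch of the shape of such a test (the names here are illustrative, and the cext variant is left out) looks like this:

import pytest
import lazy_object_proxy.simple
import lazy_object_proxy.slots
from django.utils.functional import SimpleLazyObject

IMPLEMENTATIONS = {
    'slots': lazy_object_proxy.slots.Proxy,
    'simple': lazy_object_proxy.simple.Proxy,
    'django': SimpleLazyObject,
}

@pytest.mark.parametrize('impl', sorted(IMPLEMENTATIONS))
def test_perf(benchmark, impl):
    obj = IMPLEMENTATIONS[impl](lambda: 'foobar')
    str(obj)             # resolve once, so only the proxying overhead is measured
    benchmark(str, obj)  # pytest-benchmark calls this many times and aggregates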
The slots and cext implementations are based on wrapt's code. I've named the pure-Python implementation slots because __slots__ is its distinguishing implementation technique, and those two were all I had in the beginning. I wondered why Django's SimpleLazyObject is faster, by a significant margin even.
To find out what exactly is different I've made a primitive tracer:
import sys
import os
import linecache

from lazy_object_proxy.slots import Proxy
from django.utils.functional import SimpleLazyObject

def dumbtrace(frame, event, args):
    sys.stdout.write("%015s:%-3s %06s %s" % (
        os.path.basename(frame.f_code.co_filename),
        frame.f_lineno,
        event,
        linecache.getline(frame.f_code.co_filename, frame.f_lineno)
    ))
    return dumbtrace  # "step in"

for Implementation in Proxy, SimpleLazyObject:
    print("Testing %s ..." % Implementation.__name__)
    obj = Implementation(lambda: 'foobar')
    str(obj)  # resolve the wrapped object up front, so only the proxying overhead gets traced
    sys.settrace(dumbtrace)
    str(obj)
    sys.settrace(None)  # we don't want to trace other stuff
And from that I've got:
Testing Proxy ...
       slots.py:122    call def __str__(self):
       slots.py:123    line     return str(self.__wrapped__)
       slots.py:74     call @property
       slots.py:76     line     try:
       slots.py:77     line         return __getattr__(self, '__target__')
       slots.py:77   return         return __getattr__(self, '__target__')
       slots.py:123  return     return str(self.__wrapped__)
Testing SimpleLazyObject ...
  functional.py:222    call def inner(self, *args):
  functional.py:223    line     if self._wrapped is empty:
  functional.py:225    line     return func(self._wrapped, *args)
  functional.py:225  return     return func(self._wrapped, *args)
Essentially, the biggest difference is an extra function call (the __wrapped__ property).
Now I thought to myself: I can do that too; using the cached-property technique I could remove the second function call. But that trick needs a __dict__ - it can't work with __slots__. So I proceeded to make an implementation without __slots__ (the "simple" one from the previous benchmark table). It was indeed faster, but then the tests started to fail and I finally understood why Graham Dumpleton used __slots__.
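Roughly, the cached-property technique looks like this (a sketch, not the exact "simple" implementation): a non-data descriptor computes the wrapped object once and stashes it in the instance __dict__, so later lookups of __wrapped__ bypass the descriptor (and its extra function call) entirely. That stash is exactly what __slots__ takes away.

class cached_property(object):
    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        value = self.func(instance)
        instance.__dict__[self.func.__name__] = value  # shadows the descriptor
        return value

class SimpleLazyProxy(object):          # illustrative name, not the library's
    def __init__(self, factory):
        self.__factory__ = factory

    @cached_property
    def __wrapped__(self):
        return self.__factory__()

    def __str__(self):
        return str(self.__wrapped__)    # later calls find __wrapped__ in __dict__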
Turns out he had replaced the normal __dict__ with a property [3], because proxying vars(obj) relies on having __dict__ as a proxy property. In other words, you can't use vars on an object without a __dict__ (like most builtin types).
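A sketch of the idea (mirroring wrapt's approach, not its exact code): with __slots__ there is no instance __dict__, so the proxy exposes the wrapped object's __dict__ through a property, and vars() - which simply returns obj.__dict__ - picks that up.

class DictForwardingProxy(object):      # illustrative name
    __slots__ = ('_wrapped',)

    def __init__(self, wrapped):
        self._wrapped = wrapped

    @property
    def __dict__(self):                 # forwards to the wrapped object's dict
        return self._wrapped.__dict__

class Thing(object):
    def __init__(self):
        self.x = 1

print(vars(DictForwardingProxy(Thing())))  # {'x': 1}, same as vars() on the wrapped object
# vars(1)  # TypeError: vars() argument must have __dict__ attribute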
Interestingly enough, the implementation with __slots__ is much faster on PyPy [4]:
-- benchmark: 4 tests, min 5 rounds (of min 25.00us), 30.00s max time, timer: monotonic --
Name (time in ns)            Min          Max      Mean    StdDev   Rounds  Iterations
------------------------------------------------------------------------------------------
test_perf[slots]          2.1267     139.0987    2.3513    0.4176  1003345       13824
test_perf[simple]        24.0000    9981.7000   29.9561   37.2147  1250001        1000
test_perf[django]        25.1000   10186.4000   29.5746   26.3704  1195220        1000
test_perf[objproxies]    25.6000    9509.6000   30.2238   20.0922  1176471        1000
------------------------------------------------------------------------------------------
Now I'm a bit torn about this: which implementation should be the default? Should the simple one be the default on PyPy?
[1] HTML output generated with ansi2html --inline --scheme=xterm. You can capture output with all the ANSI escape codes by running script -c "command" output.txt.
[2] You can take a look at cookiecutter-pylibrary for now.
[3] See: wrapt/wrappers.py
[4] In case you're wondering about the different timer: the tests were run on PyPy (not PyPy3), which means no high precision timer, so I had to implement my own using clock_gettime(CLOCK_MONOTONIC) from __pypy__.time.