Debugging with AOP

17 March 2014 (updated 17 February 2016)

Suppose you were doing some sockets programming and had a strange bug - the code seemed to be stuck. Imagine something like this:

>>> s = socket.socket()
>>> s.connect(('google.com', 80))
>>> fh = s.makefile()
>>> fh.write('GET / HTTP/1.0\r\n\r\n')
>>> fh.readline()  # never returns

The problem is that the fileobject does buffering by default. The write call doesn't actually send anything unless we add fh.flush() around or change the buffering to be line-based (makefile(bufsize=1)) or completely disable it (makefile(bufsize=0)). Because HTTP is line-based protocol line buffering is best, example:

>>> s = socket.socket()
>>> s.connect(('google.com', 80))
>>> fh = s.makefile(bufsize=1)
>>> fh.write('GET / HTTP/1.0\r\n\r\n')
>>> fh.readline()
'HTTP/1.0 302 Found\r\n'

I don't like this non-linear learning curve and there are other situation where you don't have documentation around to figure out why stuff doesn't work. Sometimes, there's too much code to look at and you're left with a stuck server (or client) application.

How to figure out causes like this? *

One might think, just use strace to figure out what syscalls are being made. I think that's quite inflexible. Most of the time you need more data to identify the cause of the bug.

A lot of times I've found myself littering code with logging calls and patching the socket class just to figure out what's going on. I've set out to make this easier to do - and it turns out this is a perfect application of aspect-oriented programming: adding additional behavior to existing code.

Let's try to "debug" the code above:

>>> from aspectlib import weave, debug
>>> weave(['socket._realsocket', 'socket._fileobject'],
...       debug.log(print_to=sys.stdout, stacktrace=False),
...       lazy=True)
<aspectlib.Rollback object at 0x16949b8>
>>> s = socket.socket()
>>> s.connect(('google.com', 80))
{socket._realsocket}.connect(('google.com', 80))
{socket._realsocket}.connect => None
>>> fh = s.makefile()
>>> fh.write('GET / HTTP/1.0\r\n\r\n')
{socket._fileobject}.write('GET / HTTP/1.0\r\n\r\n')
{socket._fileobject}.write => None
>>> fh.readline()
{socket._fileobject}.readline()
{socket._realsocket}.recv(8192)

We can immediately see that fh.write didn't do anything - it's clear that the fileobject has some buffering on. It's just a matter of identifying the correct makefile invocation. Now this is much better than doing guesswork don't you think?

Give aspectlib a shot!

This entry was tagged as aop debugging logging python