No code is perfect ================== .footer: Ionel Cristian Mărieș .. role:: small .. role:: smaller ---- No code is perfect ================== .footer: `PyCon Balkan `_ — 5 Oct 2019 .fx: vcenter "We write code so good we don't need debugging" - ever heard this? | | I have, right before "we can't find enough developers". Presenter notes --------------- As if you: 1. have the money or the time to hire really good developers - that also never ever have a beer at lunch 2. build everything from scratch and not use a pile of buggy frameworks ---- No code is perfect ================== .fx: vcenter What's this about? ------------------ | * A quick debugging overview * Going deep on tracing Presenter notes --------------- Quick survey: * how many do debugging? how many way past bedtime? * how many are here because of the provocative title? ---- What can you use for debugging? =============================== * (classical) debuggers - :small:`funny name for something that gets reinvented every other year` :smaller:`pudb, rpdb, pdb++, pydb, xpdb, pdb-clone, pydbgr, ipdb, ptpdb, web-pdb, remote-dpb, pybug, winpdb, your IDE's debugger, gdb` * inspectors - :small:`shells basically` :smaller:`anypdb.set_trace(), pydbattach, pyringe, pyrasite, pystuck, manhole, pptop, wsgi-shell, werkzeug's debug wsgi middleware etc` * tracers/tracing - :small:`a tool or technique that allow recoding what the program is doing` * logging, print, fancier printers (icecream, peepshow, pysnooper/snoop etc) * profilers * execution listers (classical tracers) - `os level `_: strace, sysdig, systemtap, bpftrace etc .notes: who does that syscall? hard to tell - language level: stdlib trace, smiley, pytrace, birdseye, **hunter** ---- Why hunter? =========== * prints and even the fancy printers are still cumbersome: - you end up editing dozens of files - you either miss context or have to edit too much .notes: stdlib files are hard to edit, usually messy - forgetting to undo, side-effects from changes you forgot * inspectors are good but not always explain how the bug was produced * os level tracers are good (use them) but: - hard to connect the dots if you're not familiar with the code ---- Example ======= .fx: small Running this: .. code-block:: bash PYTHONHUNTER='module="posixpath", action=CallPrinter' python3 Gets you this: .. raw:: html
.. sourcecode:: pycon >>> os.path.join('a', 'b') [...]path.py:75 call => join(a='a') [...]path.py:80 line a = os.fspath(a) [...]path.py:81 line sep = _get_sep(a) [...]path.py:41 call => _get_sep(path='a') [...]path.py:42 line if isinstance(path, bytes): [...]path.py:45 line return '/' [...]path.py:45 return <= _get_sep: '/' [...]path.py:82 line path = a [...]path.py:83 line try: [...]path.py:84 line if not p: [...]path.py:86 line for b in map(os.fspath, p): [...]path.py:87 line if b.startswith(sep): [...]path.py:89 line elif not path or path.endswith(sep): [...]path.py:92 line path += sep + b [...]path.py:86 line for b in map(os.fspath, p): [...]path.py:96 line return path [...]path.py:96 return <= join: 'a/b' 'a/b' .. raw:: html
---- What else can it do? ==================== Builtin filtering (implemented with Cython, shouldn't be too slow): - fields: ``function``, ``module``, ``lineno``, ``stdlib``, ``arg``, ``locals``, ``kind``, ``filename``, ``source``, ``fullsource``, ``threadname``, ``threadid``, ``depth``, ``calls`` - expressions: ``Q(module='foo') | Q(module='bar')``, ``~Q(module='foobar')`` (``~`` means ``Not``) - operators: ``module_startswith='foo'`` all of them: ``startswith`` (``sw``), ``endswith`` (``ew``), ``in``, ``contains`` (``has``), ``regex`` (``rx``), ``lt``, ``lte``, ``gt``, ``gte`` - predicates: ``From``, ``When``, ``And``, ``Or``, ``Not`` (the ``&``, ``|``, ``~`` operators alias to those) ---- What else can it do? ==================== User filtering - just pass in your function/lambda: .. code-block:: bash PYTHONHUNTER='lambda ev: ev.locals.get("myvar") == 123' Few extra fields available: ``code``, ``frame``, ``globals``, ``locals`` | | | ---- What else can it do? ==================== .. code-block:: python hunter.trace(module='foobar') ... hunter.stop() Context manager: .. code-block:: python with hunter.trace(module='foobar'): ... | | | ---- What else can it do? ==================== Decorator: .. code-block:: python @hunter.wrap(module='foobar') def my_function(): ... Has a local mode: .. code-block:: python @hunter.wrap(local=True) def my_function(): ... | | | ---- Other features ============== - remote attach (gdb or manhole) - variable printing/diffing (``VarsPrinter``/``VarsSnooper``) - binary wheels for every os - jupyter notebook/ipython support - cython support (if you enable Cython's ``linetrace``) ---- As an exploratory tool ====================== Get a tree with just the calls/returns/exceptions: .. code-block:: python trace( ~Q(kind="line"), ~Q(module_in=["six","pkg_resources"]), ~Q(module_contains=["importlib"]), ~Q(filename=""), stdlib=False ) Breakpoints: .. code-block:: python trace( Q(lambda ev: ev.locals.get("host") == "localhost", action=Debugger), module="redis.client" ) ---- A practical use-case ==================== Ever seen code like this? .. code-block:: python try: error() finally: return "bwhahaha, have fun debugging this" | | | | ---- Silenced exceptions =================== .. sourcecode:: python def error(): raise RuntimeError() def silenced1(): try: error() except Exception: pass def silenced2(): try: error() except Exception as exc: pass return 'x' def silenced3(): try: error() finally: return "bwhahaha" ---- Silenced exceptions: solutions? =============================== Doesn't help:: grep -A 5 -R "except:" . .notes: Statical analysis too cumbersome as well - especially if you have to fix lots of things or too many false positives What if you could detect this with a tracer, instead of reading all the code? The plan: * show all exceptions * show what happens until function returns * but with a limit * no more than 10 events * don't drill into calls ---- S.E.: runtime analysis ====================== .. sourcecode:: python class PlainDumpExceptions(hunter.CodePrinter): should_trace = False depth = 0 count = 0 def __call__(self, event): self.count += 1 if event.kind == 'exception': # something interesting happened ;) self.should_trace = True self.depth = event.depth self.count = 0 self.output("{} tracing on exception: {}\n", ">" * 46, self._try_repr(event.arg[1])) super(DumpExceptions, self).__call__(event) elif self.should_trace: super(DumpExceptions, self).__call__(event) if event.kind == 'return': # stop if function returned self.should_trace = False self.output("{} function exit\n", "-" * 46) elif event.depth > self.depth + 1: # too many details return elif self.count > 10: # bail out on too many lines self.should_trace = False self.output("{:>46} too many lines\n", "-" * 46) ---- S.E.: better runtime analysis ============================= What's missing? Colors: .. code-block:: python self.output("{BRIGHT}{fore(BLUE)} ...", ... Skip cases where exception is reraised? * Buffer up all the events up to a return * If return is bad (not a reraise) dump all the events. ---- S.E.: better runtime analysis ============================= .fx: smaller .. code-block:: python class SmartDumpExceptions(hunter.CodePrinter): events = None depth = 0 count = 0 def __call__(self, event): self.count += 1 if event.kind == 'exception': # something interesting happened ;) self.events = [event] # not great, not terrible self.exc = self.try_repr(event.arg[1]) self.depth = event.depth self.count = 0 elif self.events: if event.kind == 'return': # stop if function returned if opcode.opname[event.code.co_code[event.frame.f_lasti]] == 'RETURN_VALUE': self.output("{BRIGHT}{fore(BLUE)}{} tracing {} on {}{RESET}\n", ">" * 46, event.function, self.exc) for event in self.events: super(SmartDumpExceptions, self).__call__(event) if self.count > 10: self.output("{BRIGHT}{fore(BLACK)}{} too many lines{RESET}\n", "-" * 46) else: self.output("{BRIGHT}{fore(BLACK)}{} function exit{RESET}\n", "-" * 46) self.events = [] elif event.depth > self.depth + 1: # too many details return elif self.count < 10: self.events.append(event) ---- Storing events ============== .. sourcecode:: python self.events.append(event) Has some problems: * Will hold reference to frame, local and global variables. Can create side-effects, eg: * Finalizers not called (files not closing, ``__del__`` not called etc) * Weakref callbacks not called Solution: strip event of "live" data like ``event.arg``, ``event.globals``, ``event.locals`` - who needs those? .. code-block:: python self.events.append(event.detach()) ---- Storing events: better ====================== Stripping event of "live" data: .. code-block:: python self.events.append(event.detach()) Has some problems: * call arguments are in ``event.locals`` * return value is in ``event.arg`` Solution: sanitize values by converting to string representations (we don't need anything else anyway): .. code-block:: python self.events.append(event.detach(self.try_repr)) ---- Lightweight tracing =================== The usecase: * big app, complete tracing would slow it too much * want to look inside few functions Solution: * monkeypatch the interesting functions to have a tracer around them ---- Lightweight tracing =================== .. code-block:: python def probe(qualname, *actions, **filters): def tracing_decorator(func): @functools.wraps(func) def tracing_wrapper(*args, **kwargs): # create the Tracer manually to avoid spending time # in likely useless things like: # - loading PYTHONHUNTERCONFIG # - setting up the clear_env_var or thread_support options # - atexit cleanup registration with hunter.Tracer().trace( hunter.When(hunter.Query(**filters), *actions) ): return func(*args, **kwargs) return tracing_wrapper aspectlib.weave(qualname, tracing_decorator) # module.func = tracing_decorator(module.func) probe('module.func', hunter.VarsPrinter('var1', 'var2'), kind="return", depth=0)