Things to try with a tracer =========================== .footer: Ionel Cristian Mărieș ---- Things to try with a tracer =========================== .footer: `RoPython `_ Cluj — 20 Jun 2019 .fx: vcenter This talk's objective: * explain a bit the ideas behind the library * show the basics * show some fancy examples ---- Why Hunter exists ================= .fx: vcenter Early debugging * prints, patching code - tedious - inconvenient and messy when changes need to be applied to stdlib/libraries ---- Why Hunter exists ================= Second stage debugging: started seriously using celery ⇒ deadlocks, memory leaks * strace - not enough context in strace (who does that syscall? hard to tell) * gdb - practically only useful for seeing the current stacktrace - fussy (needs debug symbols, debug builds of python) * prints (still), manual monkeypatching - still tedious ---- Why Hunter exists ================= Second stage debugging: started seriously using celery ⇒ deadlocks, memory leaks * manual tracer | .. code-block:: python def mytracer(frame, event, args): print(frame.f_code.co_filename, ... import sys sys.settrace(mytracer) | - slow as hell - still tedious (messy code, inconvenient to install) ---- Why Hunter exists ================= Third stage debugging: tool smithing (circa 2014) * ``aspectlib`` (usecase: deadlocks) - fancy-pants monkeypatching to replace the previous tedious patch code | .. code-block:: python aspectlib.weave('absolutely.anything.anywhere', aspectlib.debug.log, subclasses=True, methods=NORMAL_METHODS, lazy=False, aliases=True) | - was too brittle (what to patch? which patching options? ⇒ non-portable configuration) * ``manhole`` (usecase: memory leaks) - robust debugging shell (inspired by ``twisted.manole`` but no big dependencies, no insecure stuff) ---- Why Hunter exists ================= Fourth stage debugging: circling back to tracers (circa 2015) * ``hunter`` (replacement for the fussy ``aspectlib``) - tracer toolkit made to: * avoid dealing with the boring parts (getting the ``sys.settrace`` api right, figuring out all the ``f_``/``co_`` fields, printing stuff etc) so you can focus on that nasty bug * not slow down your app a lot * have easy setup (env var activation, remote attach) * try really hard to not have side-effects ---- Hunter - some quirks ==================== The defaults: * Variables are displayed with a custom ``repr`` to avoid side-effects * ``threading_support=False`` It's verbose ("function" over "func", "CallPrinter" over "Print") * clarity (not to confuse with the ``print`` builtin) * extensibility (initially there was ``CodePrinter``, then ``CallPrinter`` was added) * helpers/brief api layer can be added later ---- Hunter - example ================ .fx: small Running this: .. code-block:: bash PYTHONHUNTER='module="posixpath", action=CallPrinter(filename_alignment=12)' python3 Gets you this: .. sourcecode:: pycon >>> os.path.join('a', 'b') [...]path.py:75 call => join(a='a') [...]path.py:80 line a = os.fspath(a) [...]path.py:81 line sep = _get_sep(a) [...]path.py:41 call => _get_sep(path='a') [...]path.py:42 line if isinstance(path, bytes): [...]path.py:45 line return '/' [...]path.py:45 return <= _get_sep: '/' [...]path.py:82 line path = a [...]path.py:83 line try: [...]path.py:84 line if not p: [...]path.py:86 line for b in map(os.fspath, p): [...]path.py:87 line if b.startswith(sep): [...]path.py:89 line elif not path or path.endswith(sep): [...]path.py:92 line path += sep + b [...]path.py:86 line for b in map(os.fspath, p): [...]path.py:96 line return path [...]path.py:96 return <= join: 'a/b' 'a/b' ---- As an exploratory tool ====================== Get a tree with just the calls/returns/exceptions: .. code-block:: python trace( ~Q(kind="line"), ~Q(module_in=["six","pkg_resources"]), ~Q(module_contains=["importlib"]), ~Q(filename=""), stdlib=False ) Breakpoints: .. code-block:: python trace( Q(lambda ev: ev.locals.get("host") == "localhost", action=Debugger), module="redis.client" ) ---- Silenced exceptions =================== .. sourcecode:: python def error(): raise RuntimeError() def silenced1(): try: error() except Exception: pass def silenced2(): try: error() except Exception as exc: pass return 'x' def silenced3(): try: error() finally: return "mwhahaha" ---- Silenced exceptions: solutions? =============================== Doesn't help:: grep -A 5 -R "except:" . What if you could detect this with a tracer, instead of reading all the code? The plan: * show all exceptions * show what happens until function returns * but with a limit * no more than 10 events * don't drill into calls ---- S.E.: runtime analysis ====================== .. sourcecode:: python class PlainDumpExceptions(hunter.CodePrinter): should_trace = False depth = 0 count = 0 def __call__(self, event): self.count += 1 if event.kind == 'exception': # something interesting happened ;) self.should_trace = True self.depth = event.depth self.count = 0 self.output("{} tracing on exception: {}\n", ">" * 46, self._try_repr(event.arg[1])) super(DumpExceptions, self).__call__(event) elif self.should_trace: super(DumpExceptions, self).__call__(event) if event.kind == 'return': # stop if function returned self.should_trace = False self.output("{} function exit\n", "-" * 46) elif event.depth > self.depth + 1: # too many details return elif self.count > 10: # bail out on too many lines self.should_trace = False self.output("{:>46} too many lines\n", "-" * 46) ---- S.E.: better runtime analysis ============================= What's missing? Colors: .. code-block:: python self.output("{BRIGHT}{fore(BLUE)} ...", ... Skip cases where exception is reraised? * Buffer up all the events up to a return * If return is bad (not a reraise) dump all the events. ---- S.E.: better runtime analysis ============================= .fx: smaller .. code-block:: python class SmartDumpExceptions(hunter.CodePrinter): events = None depth = 0 count = 0 def __call__(self, event): self.count += 1 if event.kind == 'exception': # something interesting happened ;) self.events = [event] # not great, not terrible self.exc = self.try_repr(event.arg[1]) self.depth = event.depth self.count = 0 elif self.events: if event.kind == 'return': # stop if function returned if opcode.opname[event.code.co_code[event.frame.f_lasti]] == 'RETURN_VALUE': self.output("{BRIGHT}{fore(BLUE)}{} tracing {} on {}{RESET}\n", ">" * 46, event.function, self.exc) for event in self.events: super(SmartDumpExceptions, self).__call__(event) if self.count > 10: self.output("{BRIGHT}{fore(BLACK)}{} too many lines{RESET}\n", "-" * 46) else: self.output("{BRIGHT}{fore(BLACK)}{} function exit{RESET}\n", "-" * 46) self.events = [] elif event.depth > self.depth + 1: # too many details return elif self.count < 10: self.events.append(event) ---- Storing events ============== .. sourcecode:: python self.events.append(event) Has some problems: * Will hold reference to frame, local and global variables. Can create side-effects, eg: * Finalizers not called (files not closing, ``__del__`` not called etc) * Weakref callbacks not called Solution: strip event of "live" data like ``event.arg``, ``event.globals``, ``event.locals`` - who needs those? .. code-block:: python self.events.append(event.detach()) ---- Storing events: better ====================== Stripping event of "live" data: .. code-block:: python self.events.append(event.detach()) Has some problems: * call arguments are in ``event.locals`` * return value is in ``event.arg`` Solution: sanitize values by converting to string representations (we don't need anything else anyway): .. code-block:: python self.events.append(event.detach(self.try_repr)) ---- Lightweight tracing =================== The usecase: * big app, complete tracing would slow it too much * want to look inside few functions Solution: * monkeypatch the interesting functions to have a tracer around them ---- Lightweight tracing =================== .. code-block:: python def probe(qualname, *actions, **filters): def tracing_decorator(func): @functools.wraps(func) def tracing_wrapper(*args, **kwargs): # create the Tracer manually to avoid spending time # in likely useless things like: # - loading PYTHONHUNTERCONFIG # - setting up the clear_env_var or thread_support options # - atexit cleanup registration with hunter.Tracer().trace( hunter.When(hunter.Query(**filters), *actions) ): return func(*args, **kwargs) return tracing_wrapper aspectlib.weave(qualname, tracing_decorator) # module.func = tracing_decorator(module.func) probe('module.func', hunter.VarsPrinter('var1', 'var2'), kind="return", depth=0)