Ramping up on debugpy

Adam Yoblick edited this page Nov 2, 2022 · 1 revision

I don’t think there are any docs or tutorials on debuggers in general that would be particularly helpful here. In my experience, between working on ptvsd/debugpy, the mixed-mode debugger in VS, and the R debugger in RTVS, it all tends to be very language- and runtime-specific, because they all have different hooks, and different tricks are necessary to implement the same features. So the only thing they really have in common is the debug adapter protocol used to talk to the client: https://microsoft.github.io/debug-adapter-protocol/specification
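To give a rough feel for what that protocol looks like on the wire: per the specification, each message is a JSON body preceded by a Content-Length header and a blank line. The sketch below frames and parses one such message. The helper names and the adapterID value are made up for illustration, and this is not how debugpy itself implements messaging.

```python
import json


def encode_dap_message(payload: dict) -> bytes:
    """Frame a JSON payload using the DAP base protocol (header + body)."""
    body = json.dumps(payload).encode("utf-8")
    # Content-Length counts the bytes of the body, not characters.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def decode_dap_message(data: bytes):
    """Parse one framed message; return (payload, leftover bytes)."""
    header, sep, rest = data.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("header terminator not found")
    length = None
    for line in header.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("Content-Length header missing")
    if len(rest) < length:
        raise ValueError("message body is incomplete")
    return json.loads(rest[:length].decode("utf-8")), rest[length:]


# A hypothetical "initialize" request, roughly as a client might send it.
request = {
    "seq": 1,
    "type": "request",
    "command": "initialize",
    "arguments": {"adapterID": "example"},  # illustrative value only
}
wire_bytes = encode_dap_message(request)
```

Feeding wire_bytes back through decode_dap_message should return the original dictionary, plus whatever bytes followed the message in the buffer.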

For Python specifically, I think it’s best to look at the relevant Python APIs first to get a general idea of how things work. That would be sys.settrace (https://docs.python.org/3/library/sys.html#sys.settrace) and sys._getframe (https://docs.python.org/3/library/sys.html#sys._getframe). The built-in “pdb” debugger in the Python standard library can be treated as a sample of how to use all this stuff; it’s a single-file module, and while it’s not small, it’s way smaller than pydevd.
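As a minimal sketch of those two APIs (not how pdb or pydevd are actually structured), the snippet below installs a trace function that records the events the interpreter reports for one function, and uses sys._getframe to look at the caller’s frame. All function names here are made up for illustration.

```python
import sys

events = []  # (event name, line offset from the "def" line)


def tracer(frame, event, arg):
    # The interpreter calls this with "call" for each new frame; returning
    # the function makes it the local tracer for "line"/"return" events too.
    if frame.f_code.co_name == "add":
        events.append((event, frame.f_lineno - frame.f_code.co_firstlineno))
    return tracer


def add(a, b):
    total = a + b
    return total


def run_traced(func, *args):
    """Run func with the tracer installed, restoring the old one afterwards."""
    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        return func(*args)
    finally:
        sys.settrace(previous)


def who_called_me():
    # Frame 0 is this function; frame 1 is whoever called it.
    return sys._getframe(1).f_code.co_name


result = run_traced(add, 2, 3)
```

After this runs, events starts with a "call" entry, ends with a "return" entry, and has "line" entries in between. A real debugger does its breakpoint and stepping checks inside a callback like this one.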

Beyond that I find that I use the Python extensibility / C API docs a lot, mostly as a reference: https://docs.python.org/3/c-api/index.html

It’s also important to understand how CPython works under the hood: stuff like frame and code objects. This isn’t covered in much detail in the official docs, but there are some good blogs on the subject. Unfortunately, I can’t find the series that I used to learn myself, but here are some newer ones that give a decent overview: https://blog.sourcerer.io/python-internals-an-introduction-d14f9f70e583 and https://tenthousandmeters.com/tag/python-behind-the-scenes/

Beyond the blogs, the main reference for under-the-hood stuff is the CPython source code itself: https:/python/cpython. It’s C, not even C++, so it tends to be very verbose with respect to error and resource management, but overall it’s easier to follow than most C codebases. The parts that are relevant most often are the aforementioned frame objects (PyFrameObject) and code objects (PyCodeObject), and the bytecode interpreter loop in ceval.c.
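Frame and code objects are also visible from Python itself, which is a handy way to poke at them before reading the C structs. The sketch below is only an illustration with made-up helper names: it reads a few attributes of a function’s code object, lists the bytecode instruction names via the dis module, and grabs the caller’s locals from its frame.

```python
import dis
import sys


def describe_code(func):
    """Summarize the code object attached to a Python function."""
    code = func.__code__
    return {
        "name": code.co_name,
        "argcount": code.co_argcount,
        "varnames": code.co_varnames,
        # Opcode names vary between CPython versions.
        "opnames": [ins.opname for ins in dis.get_instructions(code)],
    }


def snapshot_caller():
    """Return the caller's function name and a copy of its local variables."""
    frame = sys._getframe(1)
    return frame.f_code.co_name, dict(frame.f_locals)


def area(width, height):
    result = width * height
    return snapshot_caller()


info = describe_code(area)
caller_name, caller_locals = area(3, 4)
```

This is roughly the information a debugger needs to show a stack frame and its variables. The exact bytecode listed in info["opnames"] depends on the interpreter version, so don’t rely on specific opcodes.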

Now, with respect to debugpy specifically, the first thing to keep in mind is that it’s effectively two distinct parts. The first is pydevd, which runs in-process relative to the debuggee, and provides core single-process debugging functionality (breakpoints, stepping, stack traces, variables/watch). The second is debugpy proper, which runs mostly out-of-process, and handles stuff like launching, output redirection, and process lifetime (debugpy.server is the in-proc part; it wraps pydevd APIs into our own, so that we have control over our API surface). Thus, all the info above about Python debugging APIs and Python internals pertains to pydevd.

For debugpy proper, we don’t have much developer documentation aside from comments in the code, although I tried to keep them extensive. There’s also a short doc and diagram that describes process management: https:/microsoft/debugpy/blob/main/doc/Subprocess%20debugging.md, and a lengthy doc that explains the framework used to write debugpy tests: https:/microsoft/debugpy/blob/main/tests/timeline.md