This is a bit of a weird article. It spends most of the time talking about logging, which is somewhat useful for debugging but not really. pdb gets a few lines of description, and that is about it.
Personally, I can't live without Pycharm when working with Python, purely because of how fantastic the debugging experience is. The integration with the interactive IPython shell is simply fantastic, and the live variable viewer is really handy. Between prototyping code in the interactive IPython REPL and then graduating it seamlessly to scripts/functions/classes, and having an amazing debugging UX to fall back on for those bugs that slip through, I find myself orders of magnitude more efficient than when coding in something like vscode (which I absolutely love, and want to transition to for Python coding, but it is extremely limited in comparison to Pycharm, especially when it comes to the REPL experience and debugging).
I second PyCharm, not only from a debugging perspective but as a whole. I don't understand people struggling with generic text editors, development-oriented text tools (Sublime etc.), or even vscode just because they are free, when there is a much better tool money can buy (and PyCharm has a free community version as well).
Anecdotal, but I have never seen any of my colleagues with PyCharm work more quickly than me in Sublime Text. I think if you "struggle" in an IDE or editor, it's not really the IDE or editor that's the problem. A "jump to definition" key and plaintext search tools (I use grep or ag in the command line) tend to be enough for me 99% of the time.
I also think there's something to be said for being forced to think about your code rather than just relying on your IDE. In the past I saw random functions getting extracted to random modules and imported everywhere because PyCharm makes it so easy to do so, with no real thought as to where they belong, which led to some weird circular import error happening pretty much every month.
I work with data scientists and ML people more than hardcore developers, but I've noticed that a lot of people who use sublime text tend to resort to print() debugging.
IMO python's lack of explicit typing makes it difficult to reason about by inspection alone ("does foo() return a dataframe or a numpy array?!"). For me at least, I need to get into the guts of a system and watch it execute to really understand it. The print() debug crowd tend to be much better than me at reasoning about code by inspection alone, but when you work on something complex that you didn't write yourself, that only gets you so far.
Part of it is that the print() debug crowd tends to prefer to externalize more of the reasoning into the code itself (rather than into their tooling, or into their brains—though sometimes it is the latter, and that can be a disaster coming back a few months later).
That doesn't work great on other people's code, obviously.
Can you recommend anything to learn debugging the proper way (or maybe the article here is a good resource)?
I might be one of these print() people - usually when I hit bugs I read through the stack trace and can figure it out, but if it's more of an "unexpected result" I resort to print() so I know exactly what it's doing. Would love to learn a more efficient way.
For me the next step after print debugging was adding an `import pdb; pdb.set_trace()` shortcut to my editor (I use "pdb<tab>" as the shortcut, but whatever works.) Do that whenever you would add a print statement and re-run to resolve a confusing bug. It drops you into a REPL at that point in the code, and you can then print values and try running things. It basically just collapses a series of "add print statement and run" into a much tighter loop, which can save a ton of time depending how long it takes your program to throw the error and how many print statements it was going to take you to figure out what was wrong.
Key commands I use while in pdb: ? for help; n to run the next line; s to step into the next function call; w to see where I am in the stack; u and d to go up and down the stack; unt to run until a new line is reached (if you're stuck in a loop); interact to go to a normal python shell (if the pdb shell is confused by what you typed).
This won't replace print statements 100% of the time, because it might be simpler to print a bunch of logs to see where the problem is than to step through the program in the debugger; use it when you know where the problem probably is but not what it is.
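A minimal sketch of the workflow described above (the function and values are made up for illustration; the set_trace line is commented out so the snippet runs straight through — uncomment it to drop into the pdb REPL at that exact point):

```python
def average(numbers):
    total = 0
    for n in numbers:
        # Where you'd previously add a print(), drop into pdb instead:
        # import pdb; pdb.set_trace()
        total += n
    return total / len(numbers)

print(average([1, 2, 3]))  # → 2.0
```

Since Python 3.7 the built-in `breakpoint()` does the same thing as the `import pdb; pdb.set_trace()` pair, and is even less typing.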
It depends a lot on which platform you're working on. I do a lot of frontend stuff, which means I have an excellent debugger available in the browser. Having one in my editor doesn't buy me much.
I do tend to reach for a simple log first though. I find it's often quicker to place a bunch of log statements than to step through every statement. The debugger comes out for more complex issues.
My main reason for wishing to be able to switch to Vscode instead of Pycharm is because vscode's remote development UX is really really nice and works a lot better for me compared to the Pycharm way of SFTP-ing files between your local and the remote.
My use-case where Pycharm feels a bit deficient is that I have an Ubuntu deeplearning box at work that I can VNC into and develop on whenever needed. I have all my repos cloned locally and work on various projects locally (and Pycharm is my IDE of choice on the Ubuntu box). But when I work from home, VNC feels a bit too sluggish and remote development would be much better for me. The vscode experience is pretty fantastic. I can setup SSH access to my box and have an open project for the same repos that I have cloned locally on my Ubuntu box and it's as if I am working on it natively. Everything is super snappy and even my bash sessions have the correct remote venvs activated correctly.
In contrast, every time I tried mimicking this in Pycharm (the paid version), there seemed to be no way to link my already-cloned repos on the Ubuntu box to my local versions on my laptop. I want work I do locally to always be "live" on my Ubuntu box, so that if I were to VNC into it and open the same project in Pycharm there, it would exactly mirror the state of the project on my laptop. Is what I want possible? Everything I tried seemed to result in my laptop's Pycharm setting up a new project somewhere on my Ubuntu box to SFTP my project files to, which left a duplicate of my repo in some other location - not what I want. Hopefully this made sense?
> vscode's remote development UX is really really nice
I'm very interested in this, so looked it up.
For your use case, I think it's just a matter of setting the correct Deployment Mappings. There's a setting when you're creating the remote interpreter to specify your own mapped directory (not the default /tmp/pycharm-foo) and a checkbox to not have it automatically upload what you have on your local machine (since in your case the source of truth is the remote). Then you can manually sync the diffs yourself with pycharm or git.
Your workflow's interesting in that the remote machine is the main and your laptop's the satellite, and pycharm's model has it the other way around.
Also, if you've already got the interpreter set up, you can just change the deployment mapping settings.
My one current gripe about PyCharm is its seeming lack of support for git submodules. As in, I don't see any way to do a pull or push, or to work with branches, on a submodule. I resort to doing all of my git work on the command line (which I don't mind - it's how I learned to work with git from the beginning).
> We're very aware of how important this feature is, and we have several initiatives in place to provide a great experience for fully remote development. Unfortunately, at this time we're unable to share any specifics or timeframes, as getting the best possible experience requires massive-scale architectural changes to our codebase, but we've been working on this for quite some time already and we believe that we will have a solution that will go above and beyond what VS Code delivers today.
Glad to know I'm not the only one. I still have the issue, but I do still use pycharm (I do docker locally and then just go from there). My preference was always basically "remote" development - I used to use that so whether I was on my laptop or desktop, I was always on the same (linux) machine, even if coming in from mac or windows.
Thanks for that link. Glad to see that it is prioritized on their roadmap, and also to know that there is no point in my trying to work around their current implementation because it seems it just won't work the way vscode can work seamlessly.
I have the same use case as you and went through exactly this. I ended up feeling that Pycharm’s implementation wasn’t really a suitable solution. Some others had mentioned the alternative of mounting the remote filesystem locally using SSHFS but I found that this created issues of its own. In the end I settled on using Syncthing to sync all of my folders and then just run a local instance of Pycharm (and VS Code) at either end (Ubuntu 20 desktop and Macbook Pro). I’m very happy with this setup so far and would happily recommend it.
Not the same. sshfs is acceptable if you only want to do basic file editing, but is unusable if you want to say use language services or search in your directory.
What you want is for basic text editing to run locally (to minimize latency) and heavy duty services (like language services or search) to run on the remote server where accessing a file only requires a file system call instead of a network call. Then, after gathering the required information it can send that over some kind of pipe to your text editor.
This is the workflow that VSCode has set up and it's very good. In many ways, even if you're used to vim, VSCode's setup is preferable to vim over ssh shell due to lower latency.
I would be happier if the Python Language Server were actually better than the toolchain supplied by PyCharm; however, it hasn't reached that level at all yet.
Yeah, it’s an efficient setup that doesn’t actually work because it’s lacking in features.
I used the Jetbrains suite for a year or two but I switched back to sublime last year. I have a CLI-centric workflow and I switch between 5-6 different languages, sublime is fast and I can edit whatever languages I want in the same editor. If I want a particular IDE feature there's a good chance I can Ctrl-Shift-P -> Install it with Package Control. I use pudb to debug Python, it works just fine, it lets you drop into an IPython shell too.
But I admit, I work on a weird product stitched from different languages, there's just no IDE that handles that particularly well as far as I know. If I was back to working on Python only I might switch back to PyCharm.
I've tried it, and watched many a colleague "struggle" with it. "First class" is definitely an overstatement from my perspective, even if it might be great compared to what you'd get with plain-text editors. Its type inference behaves oddly half the time, even with type hints present.
But it is getting better, that's for sure. Looking forward to seeing upcoming iterations.
My goto for Python debugging is always Visual Studio since I'm coming from a Windows C# and C++ background. I find the Python debugger super easy to use if you're already familiar with that environment.
Early on I used Eclipse plugin for similar reasons - unified debugging across multiple platforms.
I used PyCharm for a while, and got frustrated at all the tons of things that it tries to do "smartly". There is a shitton of buttons everywhere, and I ended up using breakpoint() anyway because I couldn't figure out how the debugger works properly -- and it doesn't always work (e.g. it didn't break when I was using GPU, for some reason). I'm sure it's excellent, but I just can't be bothered learning all of that. Same for git/github GUI, I prefer atom's, it's so much easier to use.
I'm a vscode user and I do like it. I only really use it for the file navigation and color coding though. Autocomplete drives me nuts, so I turned that off completely. I am a pdb fan so I drop into that when needed. I never really considered PyCharm to be honest; it feels like overkill for me.
However, your allegiance sounds interesting. Can you give some reasons why you like it so much?
Not the person you replied to, but I love vscode from when I did a bunch of React + JS stuff a few years back. I also do a ton of software development and machine learning work in Python now and have never managed to move away from Pycharm even though I'd love to be able to have a good enough experience in Vscode so that I could make the switch from Pycharm (the remote development experience in vscode is really neat which is what attracts me).
In no particular order, here are all the things that I love about Pycharm and that Vscode doesn't really compete on at the moment:
1. Integration of IPython shell (if you install IPython in your project venv) with Pycharm is absolutely amazing. Copy-pasting multi-line code just works. It handles all the auto-indent stuff. You have keyboard shortcuts to directly send highlighted sections of code in the editor to the shell. You can even add a plugin (Python code cell) to make the editor work like Spyder/Matlab's cell mode
2. Excellent history browser of all commands entered in the IPython shell. This is extraordinarily handy, as I end up doing all my code prototyping directly in the IPython REPL, fix all the edge cases and issues interactively, and then just grab the things that work and graduate them to a script/function/class. This reduces the overall time to code things up, and I end up with far fewer bugs.
3. The code completion/intelligence is unbelievably good. Vscode's intellisense is really really bad in comparison when it comes to Python. With Vscode, the tab completion almost always feels nonsensical with the ordering of things making no sense, and a lot of times just plain wrong things being suggested in tab completion. By contrast, Pycharm's code completion is really smart, and extremely helpful, especially when filling in method signatures, or trying to tab-complete instance attributes/methods. It is smart enough to even parse type-hinting from docstrings and provide code completion accordingly, which vscode fails at.
4. Refactoring is a lot easier. Code inspection tools are pretty powerful
5. Related to the fantastic REPL experience in Pycharm, the debugging experience is hands-down the best Python debugging experience I have found. Also, the integrated variable viewer/explorer is really really nice and makes debugging that much easier/faster
6. PEP8 linting + auto style refactoring is really good in Pycharm. I've found vscode lacking considerably in that department. Pycharm ends up catching a lot of silly mistakes that vscode doesn't (example: forgetting to have your first param be `self` for an instance method).
7. Managing your project venvs and dependencies is also much easier and more robust in Pycharm. Pycharm will alert you the moment any file utilizes imports not installed in your venv and will also highlight missing dependencies in your requirements.txt / setup.py
I'm sure I'll think of more, but these are some of the top things that come to mind. It is a really fantastic product and what's crazy is that all of the above you get with the free community edition! I actually want to purchase the pro version but haven't found any need for it. I was interested in the remote-code development capabilities but unfortunately have been pretty underwhelmed by that aspect of Pycharm. Vscode seems to have a much better UX on that front imho.
I have used both but stick with VSCode for Python, simply because it is very powerful to have one great editor for all projects, as opposed to one specialized IDE for each.
PyCharm is (AFAIK) monolithic in what it brings to the table, whereas VSCode is more modular. As such, some of your points are true, but not an inherent fault of VSCode - more likely of some settings/underlying tools used. Maybe these can get you a better experience:
3. Agreed with VSCode being lackluster, but switching to MS Language Server [0] has helped this a great deal (while having other downsides, like being slower).
5. VSCode also has a variable viewer/browser and an interactive shell (which cannot be IPython as of today, however [1])
6. Disagree: I forgot what VSCode uses by default, but I switched to `black` for formatting, which works very well (`"python.formatting.provider": "black"` in settings.json). For linting, I switched to `mypy` (which also handles type hints, which is helpful); it can be set via `Python: Select Linter` (CTRL+SHIFT+P). `mypy` seems to be powerful. It definitely catches a method's first argument not being `self`.
5. They have a variable viewer, but as far as I can tell, it only works when in debugging mode? Not if you are prototyping code in a REPL, which is my workflow when developing code 99% of the time. It is also a bit limited compared to Pycharm. Vscode also does not handle multi-line interactive code nearly as well when working in the REPL in the debugger.
6. Yup, it supports linting, but my point was that Pycharm has better overall support for linting/code-intelligence, in that it is better at auto-fixing things or having shortcuts to do so, without having to resort to something as extreme as `black`. It handles multi-line strings, or just general formatting of multi-line stuff, perfectly. If I hit enter after `long_var_name = `, it will automatically put in a back-slash so you can continue writing a multi-line expression. I edit in both vscode and pycharm and I usually will still find useful things that the Pycharm linter catches or code-formatting things that Vscode does not (and I have tried a few different linters, though perhaps I should give mypy a shot).
I raise all these as a big proponent of vscode. Nothing would make me happier than to switch to it for all my Python development work as I enjoy the UX a lot more over Pycharm in the code writing/editing side of things. But Pycharm is just superior in so many ways that improves my productivity that I have not been able to make the swap despite several attempts.
re: 3, for me another killer feature is syntactically aware selection (alt-up/alt-down). Whatever symbol you're on, alt-up will select increasingly large scopes: symbol -> expression(s) -> statement(s) -> method/function(s) -> class -> file. I feel pretty crippled without it.
Pycharm doesn't work well for remote editing (i.e. when files are only stored on a distant server, not on your machine, and synchronized regularly).
It is very slow over sshfs, for instance, whereas other editors work very well (sublime, vscode...). Also, vs code has a very good remote development plugin.
I'd love to use pycharm, but my current project involves about 85% work on a raspi 4 and while it is quite impressive in many respects, it's just a little less than what pycharm requires in terms of resources.
I write Python with an enhanced text editor every day and don't struggle in the slightest. It's lightning quick as well. I pull out a proper debugger about twice a year.
PyCharm is an open-core editor. The FOSS edition is a second-class citizen. I make a point not to run any proprietary software on my machine besides occasional whitelisted JS, firmware, and patent-encumbered media codecs (which still have free implementations).
Neovim's master branch has a built-in Language Server Protocol client and Treesitter implementation, both of which have pre-made configs for Python development. Jumping between symbols, viewing floating documentation/signatures/git-blame, asynchronous completion, diagnostics, etc. are all readily available with a fraction of the battery/CPU/memory usage on a fully-FOSS platform that works on even more platforms (BSD, over SSH, etc). Jumping between symbols and listing usages of an object in a project with hundreds of files has never been faster. All this starts up in milliseconds. Being able to use an editor advanced enough for a large project but lightweight enough to launch with a keybind just to write an HN comment is amazing.
Neovim also has FZF integration to fuzzy-search thousands of symbols, files, references, etc. Since it's the same FZF that I use everywhere else underneath, the interface is quite familiar.
TLDR: Neovim is a lightweight text editor that integrates with external tools to transform itself into an IDE.
There is a divide: Some people love debuggers, some people think placing print statements is the best way to debug code. You are reading an article from the second camp.
Log files have the benefit of being able to adjust the signal-to-noise afterwards, I guess that is why it is highlighted.
I was impressed that almost everyone in the book Programmers at Work used print statements to debug. It changed my mind about using it for debugging code.
I'm in the second camp of that divide as well. Most times, all you really need is the logging functionality (whether you get it from a logger or a debugger) to observe the outcome of an action. It's not really necessary to watch the step-by-step playbook of how a variable came to acquire the wrong value unfold before your eyes. It's enough to see that it got the wrong value; from there you can infer what the problem is and the corrections to apply. That's most times.
However, some intricate situations are better suited for the debugger (flags set in inner loops, circuitous algorithms, values unknowingly inherited, values mistakenly overridden, etc). In those it usually is harder to reason about the flow of the logic. In my experience that's where a debugger shines, as it shows you what the code is doing. But even then and with enough experience and instinct, you can get by with just logging a value and reading the code.
On my own code I don't need debuggers; that is the scenario Linus is talking about -- is Robert C. Martin a renowned programmer? :-)
For other people's Python code, especially if they mistakenly think that Python is Haskell, I do need a debugger. I've seen "scientific" Python code that is iterators all the way down and can only be understood by running the program.
Of course no actual science is being done with such programs, but that doesn't seem to matter.
I really don't agree that it is an either-or proposition. As another person replied below, there are things a debugger enables that you simply are not afforded via print statements/logs.
From an efficiency standpoint, it would be hard to argue that print-based debugging is faster. You can do everything you want from print debugging using a good debugger like Pycharm's, or even ipdb, and without having to edit any lines of code (at least with the former). Meanwhile, the converse is not true... you are limited to a lot less with print-based debugging.
Furthermore, having tools like a variable viewer, or your editor window automatically showing variable values while debugging (as is the case with Pycharm) makes it a lot easier to reason about code as you debug and I am always much faster at working through issues when I go that route than if I go with print statements or even pdb without all the extra convenience I get from something like the Pycharm debugger.
And finally, it drives me up the wall to see sloppy code from some colleagues with tons of commented out lines like:
`# import pdb; pdb.set_trace()`
littered all over the place. It's both sloppy, and also really annoying when there are random changes in your git-history that are only because these lines keep popping in and out as they try to debug things.
There really is no logical argument. People have their preferences for a bike or a car, but the car is faster than the bike, just like the debugger is generally better than printing. There is very little logical argument for the alternative.
Taking a snapshot of program state using logs is isomorphic to freezing the program at that same state. The difference is that with the debugger you have the option to step into the next state, or increase your resolution and look at state within a lower-level module/function. With the debugger you have access to ALL state at the point where you froze it, not just the statements you happened to log.
With logs, the way you debug is: you log some state, reset the program with more logs, and iterate until you find the issue. With a debugger, you freeze the program, examine all the state, advance the program however far you want... then the issue is found (unless you advanced the state too far... but with logging the state ALWAYS advances too far, until the end of execution, so with logging you always need to reset).
Logically, logging is just debugging with fewer features, so people who prefer logging in general just don't want to use the extra features. Additionally, logging has the added downside of constantly polluting your pristine code with log annotations, which you have to remove later (and that can introduce more bugs). While debugging with console PDB has the same effect, a GUI largely alleviates this problem.
The only argument for logging/printing is that you can use logging to debug an application in production.
I find that print debugging forces me to think through the execution and understand what my program is doing in my head. By having the restriction of not being able to change my print statements without restarting the program, I'm forced to understand the program better. Some restrictions foster creativity.
I will add that I usually add various levels of debug log annotations even before I run into a bug so that I know what my program is doing.
Also, when I'm on an embedded platform that can't run a debugger or writing an exploit for a security vulnerability, being good at logging is nice.
I completely agree and was just about to make a post comparing logger/debugger to bike/car. I guess we think alike!
There are also quite a few bugs where you need some kind of debugger in order to track down the bug. I am thinking crash dumps (using WinDbg for example) as well as memory corruption in C/C++. Being able to set a breakpoint when a certain address in memory changes has basically reduced most memory corruption bugs from a week-long nightmare to more of an inconvenience for me (mind you, it's still my least favourite type of bug to investigate).
Yeah, I agree entirely on the Pycharm debugger being awesome. But it's no substitute for logging in production. Some things just need to be logged, either for proof or for those situations where you just can't reasonably attach a Pycharm debugger.
Yes, logging is crucial for knowing what happened, but if you don't understand how it could have happened, step by step debugging is immensely useful. Assuming, of course, you can reproduce in your dev environment.
Oh yes, absolutely. I didn't mean to say logging isn't useful. It is vital to have good logging for production because like you said, you aren't just going to be able to attach a debugger. But for an article titled "Guide to Python Debugging", logging should be a short paragraph, not the core of the article imo. Debugging in production on the other hand... then logging makes a lot of sense to focus on.
I tried pycharm once, and every time I started it up it had to index files for hours. Anybody got around that problem? I really like IntelliJ, so I would love to give pycharm a try.
You want to make sure PyCharm is only indexing files within your project. Depending on your setup, it can try to index all files within virtual environments etc.
> Between being able to prototype code in the interactive IPython REPL and then graduate it seamlessly to scripts/functions/classes, and then having an amazing debugging UX to fallback on for those bugs that slip through,
Do you have a good tutorial on how this all ties together? It sounds like heaven but it's not how I use PyCharm because I don't know any better.
No link to any tutorial, it's just how I've always coded and found myself to be the most productive and produce the least bugs. A brief description is:
1. Open up the Python interpreter (I highly recommend installing IPython in your project venv so you can use it as the default interpreter in Pycharm)
2. Setup toy data to prototype the thing you are trying to do (or just load in some data or a pickled file, etc.)
3. Start prototyping the functionality of your code one step at a time. If things need to be in a loop in your main script, e.g. `for thing in things:`, I just set `thing = things[0]` in the REPL and then continue prototyping the logic of what goes in the inner loop.
4. Use the variable viewer, printing variables in the REPL, etc to make sure things are doing what you want
5. As sequences of statements are verified as doing what you want, you just "graduate" them to your script so you in effect know that they are somewhat tested and do what you wanted
6. Get to a point where you get the right result you were looking for
7. Now set `thing = things[4]` or some other edge-case and just run it a few times to make sure you still get the right result.
8. You're good to go. Wrap up the code in whatever function or class and you are ready to give it a shot for real
This sounds like more work, but it is super easy to execute code in the interpreter that you have in your editor (select code in script window, ALT + Shift + E to execute in interpreter)
Now the thing people might say is: well, it is hard to always set up stuff with toy data, and that's true. In that case, you just set a breakpoint at the point where you would be writing your new function/method and, when you pause there, launch the interactive Python interpreter within the debugger. You can then start prototyping code from scratch in exactly the same way as above, but in the debugger, with the actual data you care about.
I find that you eliminate all the dumb/idiotic bugs in code this way and the only ones that I need to actually debug at times are certain edge cases I did not account for at the time of prototyping.
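A toy illustration of step 3's `thing = things[0]` trick (names and data here are mine, purely illustrative):

```python
things = ["widget", "gadget", "gizmo"]

# In the REPL: bind a single element and prototype the loop body on it
thing = things[0]
label = thing.upper() + "!"   # tinker until this does what you want
print(label)  # → WIDGET!

# Once verified, "graduate" the body into the real loop in your script
labels = [t.upper() + "!" for t in things]
print(labels)  # → ['WIDGET!', 'GADGET!', 'GIZMO!']
```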
I agree. After criticizing print statements, more than half of the article discusses logging, which is a little more organized print. And I'm not disagreeing - it's invaluable in production systems.
I wish there was more on how to actually debug in Jupyter or Spyder. I'm relatively new to both and was hoping for some tips on that.
I'm curious - how do you use pdb when debugging, say, controllers on web functions? I guess you can try to debug underlying functions using pdb, but I'm not sure how that happens (short of logging) once your means of interaction is an HTTP call, or a response (e.g., to a trigger/message).
Run the application locally in a terminal and just use pdb normally. To do it to an application running in a container is slightly more involved but still possible.
Nice article. In particular the -i flag seems really useful.
I'd like to add two packages I learned about recently that I like A LOT: pyrasite and PySnooper.
Use pyrasite to "attach" to an existing running Python process (i.e. you forgot to instrument the app, but it has a bug in it, how can you debug?)
(1) Install pyrasite
(2) Get the PID of the running process
(3) Run `pyrasite <pid> dumpstackz0.py`, where `dumpstackz0.py` contains `import traceback; traceback.print_stack()`
(4) The stacktrace will be printed in the stdout/stderr of the running application
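The payload in step (3) is just ordinary Python executed inside the target process. A variant you can sanity-check locally (the helper name is my own) captures the stack as a string instead of printing straight to stderr:

```python
import io
import traceback

def current_stack():
    # Same idea as the injected payload, but returning the text
    # so the caller can log it wherever it likes
    buf = io.StringIO()
    traceback.print_stack(file=buf)
    return buf.getvalue()

print(current_stack())
```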
I've been a Python developer for less than 10 hours. Approximately 98% of the time I debugged something, it was in Visual Studio. Quite a nice experience. I was lucky to have the project already set up for me.
This has been my go-to debug method for years now. Find a critical section of code, drop the embed call there (with maybe a print statement or two leading up to it for context).
For stepping through code, I agree with the other comments that it is often too much for debugging. When it's really necessary, I might sprinkle a handful of embed calls.
Great article, especially nice points about __repr__ helping logging.
PDB as mentioned is great, but IPDB is at a whole 'nother level. If I'm able to get an ipdb breakpoint set in a tricky part of python code, I'm set. Tab-completion, some saved history, normal debugger operations, it's a dream-come true. I often set breakpoints and code iteratively w/ a repl-like interface or explore unfamiliar functions and objects.
To extend on inspecting stack traces: I found that I often wanted to know the content of the (relevant) (local) variables in a stack trace. Whenever I got a stacktrace, I would often rerun it in a debugger, or add print statements on variables, until I extended the stacktrace to just include this information. The tricky part is that printing all local variables is too much, and global variables or attributes (e.g. self.x or so) might also be involved. So now I'm parsing all variables and attributes from the source code line in the traceback (in a very simple way) and printing only those. This covers about 95% of all cases - i.e. in 95% of cases, the stack trace contains all the information I need to understand and fix the problem.
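A rough stdlib-only sketch of that idea (the regex is the "very simple" parsing mentioned above; the function names are mine, not the commenter's):

```python
import linecache
import re
import sys
import traceback

def locals_on_failing_lines(tb):
    """For each traceback frame, keep only the local variables whose
    names actually appear on that frame's source line."""
    report = []
    for frame, lineno in traceback.walk_tb(tb):
        line = linecache.getline(frame.f_code.co_filename, lineno).strip()
        names = set(re.findall(r"[A-Za-z_]\w*", line))
        relevant = {k: v for k, v in frame.f_locals.items() if k in names}
        report.append((line or "<source unavailable>", relevant))
    return report

def scale_ratio(a, b):
    scale = 10
    return scale * a / b

try:
    scale_ratio(1, 0)
except ZeroDivisionError:
    for line, local_vars in locals_on_failing_lines(sys.exc_info()[2]):
        print(line, "->", local_vars)
```

Attributes like `self.x` need a little more parsing (splitting on dots and walking `getattr`), but this covers the plain-variable case.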
I also often have a SIGUSR1 handler which will print the stacktrace of all threads. This is useful on long running processes, involving multi threading, where you might run into some strange hangs or deadlocks.
In addition to that, if you might get crashes (segfault or so), something like faulthandler is useful.
Don't forget breakpoints! Learn how to move around in the debugger, it will be a LIFESAVER.
Above a troublesome line of code, drop this line:
import pdb; pdb.set_trace()
And run it as normally. Now you can explore. Drop into the function, step forward, go up. You just need to learn to move around. With pdb you can really explore what's going on.
Or even just the breakpoint() built-in in Python 3.7+
In addition you can set PYTHONBREAKPOINT=ipdb.set_trace (or whatever you prefer) in your shell rc file
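And `PYTHONBREAKPOINT=0` turns every `breakpoint()` into a no-op, which is handy if one slips into CI. A quick sketch of the behavior:

```python
import os
import subprocess
import sys

# A stray breakpoint() would normally stop and wait for debugger input;
# with PYTHONBREAKPOINT=0 it does nothing and the script runs through.
result = subprocess.run(
    [sys.executable, "-c", "breakpoint(); print('made it')"],
    env={**os.environ, "PYTHONBREAKPOINT": "0"},
    capture_output=True,
    text=True,
)
print(result.stdout.strip())  # → made it
```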
I've recently started using PyCharm's debugger more instead of Python's native tools, and it really has been a life changer!
My favorite tiny unique feature is that it injects variable values as temp comments in the code as the program runs in debug mode. So you're seeing `foo = get_foo() # "bar"` as your code. Really helps to debug complex algorithms when you can see the values right there in the code.
My only problem is that debugger shell feels very clunky with ideaVim plugin, so if someone at JetBrains reading this that would be a very welcome improvement!
PyCharm is a huge help when dealing with APIs that return huge or complicated data structures or where documentation is vague. Examples: interacting with AWS with boto3; interacting with Kubernetes.
You can read the docs alone, but much better is to set up a live interaction with the API. I set a breakpoint and explore the returned data structure in the debugger. All types and content are clearly visible and are a guide to the next iteration of the code's form.
I use ipython a lot, and it has a great post-error debugger with `%debug` which drops you in a ipdb shell at the stackframe of the error.
Concerning reloading, I find `%autoreload` much more reliable than `importlib.reload`. The latter can't handle `from lib import foo` or if a dependency inside an imported package changed. `%autoreload` just does this all automagically.
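If you want autoreload on by default, it can go in the IPython profile config (a config fragment, not a runnable script; `c` is provided by IPython when it loads this file):

```python
# ~/.ipython/profile_default/ipython_config.py
c.InteractiveShellApp.exec_lines = [
    "%load_ext autoreload",
    "%autoreload 2",  # re-import modified modules before each input line
]
```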
I recommend repr in all log messages too. For a work Django project we have common BaseModel repr(obj) expression that can be easily converted (using regexes in loki and promtail) into a clickable admin link.
So within loki (or any similar log tools) all logger.info(repr(object)) is a link straight to the admin. Worth the set up time.
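A rough stdlib-only sketch of that pattern (no Django here; the names and format are mine - the point is a repr with a fixed, regex-friendly shape):

```python
class AdminLinkRepr:
    """Mixin giving objects a repr with a stable shape, e.g.
    <shop.models.Order pk=42>, so a log pipeline (Loki/promtail regexes
    in the comment above) can rewrite it into a clickable admin URL."""

    def __repr__(self):
        cls = type(self)
        return f"<{cls.__module__}.{cls.__qualname__} pk={getattr(self, 'pk', None)}>"

class Order(AdminLinkRepr):
    def __init__(self, pk):
        self.pk = pk

print(repr(Order(42)))  # e.g. <__main__.Order pk=42>
```

In a Django project the mixin would sit on the shared BaseModel so every `logger.info(repr(obj))` gets the linkable form for free.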
Since PyCharm is abundant in this discussion, here's an alternative:
I've been using the Wing IDE for several years. It sometimes chokes when I try to inspect larger numpy arrays as a whole, using the application-specific array visualization (but there are other ways). Otherwise I'm quite happy with it.
Useful for common commands that you would otherwise have to type out. For example, if you do TDD on a JSON API and often need to debug the output consider this alias:
alias ppr pp response.json()
3. ~/.pdbrc.py
This file loads before pdb starts, so it lets you customize it a bit and add things like history. Here is a simple version to get you started:
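A common minimal version, assuming pdb++ (which reads ~/.pdbrc.py; plain pdb only reads ~/.pdbrc), wires up persistent readline history across sessions:

```python
# ~/.pdbrc.py -- read by pdb++ / pdbpp; persistent command history
import atexit
import os
import readline

histfile = os.path.expanduser("~/.pdb_history")
try:
    readline.read_history_file(histfile)
except FileNotFoundError:
    pass
atexit.register(readline.write_history_file, histfile)
```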
When debugging compiled code, I often attach a debugger to the process after it’s started to introspect it. I’ve often wanted to do something similar in Python; does anyone know of a tool that might do something like this?
But there are good tools out there too. I've been using Wing IDE for years; it has excellent debugging: multi-threaded debug, remote debug over SSH (which works almost as fast as local debugging), deep introspection into anything on the stack and call chain, conditional breakpoints, a powerful debug console that can hook into any step in the call chain of any thread, and easy debugging of a server thread, which is useful for debugging services and scenarios that can't be fully replicated on a local machine.
I'm somewhat embarrassed to say that I very rarely use debuggers. Print statements have always worked for me even in large codebases. What am I missing here?
Maybe you're not missing anything - if you're productive with print debugging and have used a debugger but found that it doesn't help that much I wouldn't worry about it.
I'm the opposite - immediately run in the debugger to probe around how to use this or that API or verify that the code is really doing what I think. I also find that stepping through my code early on often gives me insight in how to implement something more efficiently.
I'm just one data point, but I also rarely use debuggers and often work with a colleague who rarely uses print statements.
There are qualitative differences in the code we're usually modifying though; in particular, he's often editing code where the control flow is hard to reason about (inheritance towers, pages of conditionals, etc) and is fairly slow to build and run.
In that environment I might even reach for a debugger myself -- e.g., even if you already isolated the bug to a particular portion of the code and just wanted to know which method was being dispatched you would need a ton of print statements, or at least a lot of back-and-forths editing the code and figuring out where you ended up (because of the high branch complexity which could send you far from where you started), and those restarts would be fairly slow in and of themselves. Contrast that with a debugger where you can quickly step forward and find out exactly where you end up.
IMO, print statements are easier/faster to use than debuggers, so I reach for those first unless there's some feature a debugger has that I think will make my life easier overall.
Can anyone recommend a video/series on Python debugging? Debugging has been my greatest struggle with getting into development. I get frustrated and spend over a day on simple scripts.
Lots of great comments and resources here! I was surprised to find out how uncommon using a debugger for Python development was when I joined my team at work. I had to muddle through lots of documentation to figure out how to use it in certain situations and also to figure out alternative things when the first approach didn't work. I looked into PyCharm but opted to use VSCode as my main editor since the debugging support seemed easier to configure both for local and especially for remote.
We have a flask app and one hurdle I had to overcome was our docs all assumed you'd run the app via gunicorn, but VSCode had trouble triggering breakpoints so I had to figure out how to run the app via the flask module directly, which was a bit of work since our app isn't following most of the getting started with flask tutorials conventions.
For another project we use celery, VSCode can be used to debug celery tasks but it's also work to set up and in the end I found using the rdb.set_trace() debugging provided by celery was easier. An important lesson I learned while working on that project is that you need to be sure you're setting your breakpoints in the right version of your app: if you are using a setup.py install step the code you want to debug is probably somewhere in site-packages not wherever you installed it from.
For another project, we're using Python 2.7 on CentOS 6: VSCode debugging & remote tools don't easily support that setup, so knowing pdb or doing log debugging are the best bets.

Something important on log-based debugging this article doesn't mention: if you're developing a long-running app like a daemon or service of some kind, you should probably make your log config loading dynamic. The one provided in the article is nice if your script is one-off, but if you are running a service you may want to be able to dynamically adjust the logging to be more verbose when an issue occurs, then reset it when done debugging, without having to start/stop the whole service. I inherited code that runs as a service, and in order to change the log level I have to stop the service, change the config, and restart. The start/stop is destructive: if you stop the service, the action it was performing has to be redone from the beginning. These are tasks that can take 8-12 hours, so restarts are painful.

Debug logs can be huge, so you can't just leave the service in debug log mode all the time unless you want to fill up the whole disk. Which brings up another point: rotating log files. If you're doing heavy logging, you need to be sure you have set up rotation so that you don't eat through disk space indiscriminately.
In my ideal world every project I work on could have breakpoints trigger my editor's tools so I can inspect and alter code on the fly, but knowing there's more than one way to go about it and how to approach it when you don't have control is important.
I feel like I got in a time machine ... this is like debugging from 1995. In fact, I'm pretty sure I would have rejected this in 1995. Unless you are constrained to a server or other restricted environment where reproducing locally is not possible, just get a good IDE and set some breakpoints in there.
The environment is very "VSCode"-focused at the moment, hence the roundabout and inefficient ways to do actual debugging. Debugging in a proper Python IDE is a breeze and on par with, if not better than, traditional IDEs for statically typed languages.
- "you might want to just copy-paste it and use it" ... that's maybe what people end up doing, but it's a terrible idea.
- The following `__repr__` is out of alignment with the output printed below it:
class Circle:
    def __init__(self, x, y, radius):
        self.x = x
        self.y = y
        self.radius = radius

    def __repr__(self):
        return f"Rectangle({self.x}, {self.y}, {self.radius})"
The class name is hardcoded; maybe use:
def __repr__(self):
    name = self.__class__.__name__
    return f"{name}({self.x}, {self.y}, {self.radius})"
- There's a bare `except` in the `traceback` demo. It is not a problem in this toy example and works, but bare `except` statements have no place in educational code. Some poor soul will copy-paste it and run into awful errors down the line (https://realpython.com/the-most-diabolical-python-antipatter...)