I always struggle with these discussions. I have been using Python a long time, and while there have definitely been a handful of gotchas over the years, none of them ever held me up long enough to think twice about. I am sure there is a better way, but what puzzles me is how often this impacts other people but not myself. Perhaps my usage is not advanced enough, but it always leaves me wondering.
Edit: Just as an additional thought, is one of the main issues distributing what are essentially Python executables to client machines that could be running different OSes? My mind jumps to probably not using Python at that point, unless there was a specific dependency/library that required it.
I've encountered frustration with Python packaging in two main areas:
1) Installing applications within Docker containers. While wheels have improved this situation, I was surprised, coming from other languages, that there was no straightforward way to build a package you can simply copy into a container and run, without having to install build tools and compile extensions in the final container image.
2) Distributing Python utilities to end users across various platforms in an easily installable manner without requiring them to follow lengthy instructions to set up all the dependencies has been another challenge.
We use Poetry and have largely "solved" #1 with a somewhat complicated Docker build. It works well now, so no one has to think about it much. That made deploying Python server-side fairly easy. However, #2 has been much more of a challenge, and I wonder if that is where other folks in this thread are feeling the most pain.
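For what it's worth, the core of builds like ours is usually just: pre-build wheels in a stage (or on a machine) that has the compilers, then install from those wheels alone in the slim final image. A rough sketch of that idea, assuming a plain requirements.txt (the paths and the two-step split are illustrative):

```
# Build stage (or build machine): compilers and headers are available here.
pip wheel --wheel-dir /wheels -r requirements.txt

# Final slim image: install only from the pre-built wheels, never hit PyPI
# and never need a compiler.
pip install --no-index --find-links /wheels -r requirements.txt
```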
- they require different instructions for each platform, one per distribution: it's not possible to give a one-liner that will work for all supported OS/architecture combinations,
- distributions are not all updated at the speed of upstream. Sometimes users do not want to wait several years to use a recently released feature.
Part of the frustration is the curse of choice: having to research which tool to use.
Before the new crowd of tools, like poetry, the problem was also having to figure out how to chain all the scattered tools together. I likely still have some bias from this experience.
I stopped doing Python before adopting any of these tools, but when I researched Poetry for a company to use, the fatal flaw was that it copied too much from Rust/Cargo. Cargo was born with its community and they helped shape each other; the Rust community has a strong adherence to semver. In Python, you have a mess of versioning schemes (e.g. CalVer) and low-quality version requirements. You need a way to override version requirements, but they refuse to add one.
Doing cross-platform development? The tools that do locking today via `requirements.txt` generate platform-specific lockfiles.
Yes, this - I've been using Python for decades and have been mostly fine (dropping down to ~zero packaging & deployment issues since I started putting all my services into Docker images). But AI stuff? Somehow that manages to be a massive pain in the ass every single time.
Here are the two main packaging issues I run into, specifically when using Poetry:
1) Lack of support for building extension modules (as mentioned by the article). There is a workaround using an undocumented feature [0], which I've tried, but ultimately decided it was not the right approach. I still use Poetry, but build the extension as a separate step in CI, rather than kludging it into Poetry.
2) Lack of support for offline installs [1], e.g. being able to download the dependencies, copy them to another machine, and perform the install from the downloaded dependencies (similar to using "pip --no-index --find-links=."). Again, you can work around this (by using "poetry export --with-credentials" and "pip download" for fetching the dependencies, then firing up pypiserver [2] to run a local PyPI server on the offline machine), but ideally this would all be a first class feature of Poetry, similar to how it is in pip.
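To make the workaround in 2) concrete, the sequence is roughly the following (directory names are just examples, and the pypiserver step is optional if pip can point straight at the downloaded files):

```
# On a machine with internet access: pin and download everything.
poetry export --with-credentials -o requirements.txt
pip download -r requirements.txt -d ./pkgs

# Copy ./pkgs to the offline machine, then install without touching PyPI
# (or serve ./pkgs with pypiserver and point pip's index-url at it):
pip install --no-index --find-links=./pkgs -r requirements.txt
```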
I don't have the capacity to create Pull Requests for addressing these issues with Poetry, and I'm very grateful for the maintainers and those who do contribute. Instead, on the linked issues I share my notes on the matter, in the hope that it may at least help others and potentially get us closer to a solution.
Regardless, I'm sticking with Poetry for now. Though to be fair, the only other Python packaging tools I've used extensively are Pipenv and pip/setuptools. It's time consuming to thoroughly try out these other packaging tools, and is generally lower priority than developing features/fixing bugs, so it's helpful to read about the author's experience with these other tools, such as PDM and Hatch.
>I always see people complaining about Python packaging, but rarely run into issues myself. Maybe it's an OS issue? I use Linux.
No, it's a "what you do with it" issue. It's not necessarily about the mere number of dependencies used - e.g. someone who just makes some conda env might be perfectly fine with it.
Things like relocating, provisioning, reproducibility, version updating, cross platform, etc, all have their issues, and it gets worse when you need to build your own packages.
The only time I've ever experienced a memorable amount of pain was when I was working on a project that used one of the newer options like Poetry. Using a virtualenv, a requirements.txt file, and pip I have not had serious issues in 15 years of using Python.
That is why I was wondering. I have produced a lot of software using Python, deployed it on different platforms, using containers, not using containers, bare metal, cloud, with complex dependencies, etc. I have not run into a lot of issues, and most of the ones I did hit were dependency-version related, where subdependencies required conflicting versions, but even then it was not super complicated. I realize more complex scenarios exist, but it surprises me that it's such a perennially hot topic with Python.
I'm less pessimistic about packaging than most. No language that I know of has ever attempted a standardization effort like this. I think it will pay off in the long term. But it's taken years to get here and it will take more years to wrap up the project.
> An attempt at a specification was rejected due to “lukewarm reception”, even though there exist at least four implementations which are achieving roughly the same goals, and other ecosystems also went through this before.
Python has been weird lately.
We got structural pattern matching, which is entirely new syntax that is essentially just syntactic sugar and only kind-of solves a problem that relatively few users ever had.
But then we reject __pypackages__ and null-coalescing/safe-navigation operators, which solve problems that everyone has and are unambiguous improvements and modernizations of the language. Even if the PEPs had problems and needed to be rewritten from scratch, there is now approximately zero chance that will ever happen.
I love Python but sometimes I feel like the decisions are arbitrary and do not reflect what is best for the language.
It drives me nuts because I'm a captive user. There's too much momentum for data science and machine learning to switch and be taken seriously in a non-solo workplace. Also I just like Python.
Which problem or use case would null-coalescing/safe-navigation operators solve that isn't already solved with a conditional expression or the walrus operator?
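For concreteness, this is the kind of nested-optional access PEP 505 was aimed at, next to what you can already write today with conditional expressions; the little dataclasses below are made up purely for illustration:

```
from dataclasses import dataclass
from typing import Optional


@dataclass
class Address:
    city: str


@dataclass
class Customer:
    address: Optional[Address]


@dataclass
class Order:
    customer: Optional[Customer]


# What PEP 505 proposed (NOT valid Python, shown only for comparison):
#     city = order?.customer?.address?.city ?? "unknown"

def city_of(order: Optional[Order]) -> str:
    # What you write today: chained conditional expressions / intermediate checks.
    customer = order.customer if order is not None else None
    address = customer.address if customer is not None else None
    return address.city if address is not None else "unknown"


print(city_of(None))                                # -> unknown
print(city_of(Order(Customer(Address("Berlin")))))  # -> Berlin
```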
All of those use a single package management tool, which is considered the only correct and acceptable tool to use. That is what everybody wished Python had done 20 years ago, but they didn't, and now we are engaged in this unique experiment, in order to avoid forcing everyone to migrate to one particular tool.
Instead, Python is developing a set of interoperable standards and APIs that any number of build and packaging tools can use. So projects can choose whatever build tool makes sense for them, and users can build any project via a uniform interface.
As an example, let's say that I am writing a web app which serves a machine learning model. I have three dependencies: a web framework, a database driver, and a machine learning framework. The web framework might be packaged using Flit, the database driver might be packaged using Setuptools, and the ML framework might be packaged using CMake or Meson. And for my own project I might choose to use Hatch or Poetry. When I use Pip to install my dependencies, the details of what tool they were packaged with are completely abstracted away. Even if I need to compile something because there is no binary package published for my system, Pip will automatically install the required build dependencies in an isolated environment, and it will magically know how to build the package, using nothing but the package's own declaration of its "build backend". And then when I publish my own work, users will have the same experience when they need to install it and its dependencies.
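Concretely, that "declaration of its build backend" is just a small table in each project's pyproject.toml; for the Flit-packaged web framework in my example it would look something like this (the version bound is illustrative):

```
[build-system]
requires = ["flit_core>=3.2"]          # build-time deps Pip installs in an isolated env
build-backend = "flit_core.buildapi"   # the PEP 517 hook Pip calls to build the wheel
```

The Setuptools, Hatch, or Poetry projects declare `setuptools.build_meta`, `hatchling.build`, or `poetry.core.masonry.api` in exactly the same way, and Pip doesn't care which is which.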
The system doesn't work 100% perfectly in all cases (mostly due to things that are out of scope, like managing shared libraries at the system level), but it actually works most of the time for most people in most situations. And there were no doubt issues transitioning from older systems. But as far as I know, no system like this has ever even been attempted, let alone rolled out to millions of users and working so seamlessly that most people didn't even realize it was happening.
So it's not perfect. And maybe the whole idea is backwards and would've been avoided if there had been a single coherent package management story in the first place. But given the scale of the challenge, I think we should all be a little less eager to wave the "Python steering council is stupid & bad" banner.
Well, I'll start from the original sin for JS/Python/Ruby/PHP/Perl:
0. Java was designed to be fast enough to be comparable with C/C++. Think 2x slower or less, not 10-100x slower. This leads to:
1. Most Java libraries are in Java, they're not native. And they're portable as-is across platforms. If they do have platform specific code, it's their job to make sure they bundle and load everything needed for the specific problem.
2. In 2004 Maven 1 was launched, then in 2005 Maven 2 came with a repository format update. Packages are zip files called jars (Java ARchives); there are also wars (Web ARchives), which are zips-of-zips meant to be unpacked by the application server before the first launch, and ears (Enterprise ARchives), which are also zips-of-zips, but I forget the exact details of using these.
3. The Maven repo format is the same, locally and remotely. When you work with Maven (or Gradle, or any Java build tool, since they're ALL compatible with Maven, otherwise nobody would use them), you get a local cache/mirror/proxy of the remote repos.
4. Because Java has CLASSPATH (sort of like PYTHONPATH but probably better, and it probably inspired PYTHONPATH because I'm sure the Java CLASSPATH predates it), packages are not copied over to the local folder when developing. Maven & co just assemble the correct CLASSPATH and everything is referred to directly from the local repo. You don't have venvs or node_modules because those are just silly hacks that aren't needed here.
5. If you need to package Java stuff, the standard approaches are:
- cross platform jars for libraries (these are usually published to the Maven repo and can be used natively by any Java package manager)
- zips for desktop apps; shell scripts or executable launchers inside the zips to launch the apps
- wars for web applications
- ears for huge, enterprise applications
It's obviously very deep and complex when you want to look at everything, but that's it.
Because Java is really close to "Write Once, Run Anywhere" in practice, for most platforms yeah, you just copy over the jar/zip.
The hardest part people complain about is installing the JRE, which is SUPER silly, since for a technical person it should be trivial to do.
For non-technical users, for at least the past 10 years, you've been able to just bundle the JRE with your app. These days (at least the past 5+ years), I'm fairly sure you can even AOT-compile the app.
Python, by comparison, is a horror show.
And it all comes from 0. -> Python is slow, it was meant to be used with C libraries, so it carries all the baggage from that ancient and creaky ecosystem. Packaging a Python (or Ruby, or...) app to deploy on Windows, Linux (multiple distros), MacOS is such a horror show that they invented an entire layer with Docker, to just put the whole thing into an almost literal shipping container and not bother with the craziness inside.
> 4. Because Java has CLASSPATH (sort of like PYTHONPATH but probably better, and it probably inspired PYTHONPATH because I'm sure the Java CLASSPATH predates it)
Just pointing out that Python predates Java by a few years. Not sure when the PATH concept was introduced, though.
On paper it predates it, but Java was "industrial development ready" from day 1. Python 1 was barely used and was more of an academic toy of sorts; it took Python 2 arriving on the scene for Python to even be mentioned in the same sentence as Java.
I don't know the exact chronology of CLASSPATH versus PYTHONPATH, but I can tell you that CLASSPATH usage is pervasive and has been from the start for Java (I was using Maven 2 in 2007, and even then CLASSPATH was established, being also used by Ant, Eclipse and other, older tools), while PYTHONPATH is definitely not used to the same effect in Python - it's an afterthought.
You can certainly just delete a venv: `rm -rf venv`
I'm not sure what kind of custom path you need, but you can put a venv anywhere you want that's practical.
Yes, when you create it, you need to leave it there, but they're trivial to create: `python3 -m venv [path to venv]`. Installing packages is a LOT faster than npm.
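A sketch of the whole lifecycle with an arbitrary path (the path and script name are just examples):

```
python3 -m venv /opt/myapp/venv            # create it wherever is practical
/opt/myapp/venv/bin/pip install requests   # install into it, no activation needed
/opt/myapp/venv/bin/python your_script.py  # run against that environment directly
rm -rf /opt/myapp/venv                     # and deleting it really is just this
```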
There are definitely issues with Python's setup as a global interpreter, but these aren't them.
I hate this question. It's basically "why are you stupid"? I'm a professional with 20 years of experience with ecosystems ranging from Java, .NET, Ruby, Python, etc., so if I write something, I probably have a valid use case.
The opposite question should be asked: why can't I do that?
And the answer is, frankly, either lazy/bad design on the Python ecosystem's part, or backwards compatibility with an existing but bad system. In a well-designed system, venvs would be copyable to another equivalent system by default, since there's little to lose and a lot to gain: for example, access to the simplest and most reliable deployment system ever invented, the <<copy-paste deployment system>>.
The fact that we have entire generations of developers that can't fathom why someone would want that basically says all it needs to say about the degree of over engineering that's now standard.
From a user's point of view, Python packaging has probably never been better, but it is still a huge mess for anyone having to maintain these projects, which might explain some of the dichotomy in the comments.
In my experience most of these packaging tools work fine for pure Python packages, it is when you try and bundle extensions in a cross-platform way that things get really messy. For better or worse I think third party package managers somewhat outside the Python ecosystem, i.e. conda + conda-forge, are the only tools that get this right.
Will they get this mess fixed up for 3.13 this Fall, or 3.14 (pi-thon?) in 2025?
I admire your optimism but don't share it. Crap packaging will be present in the Python experience for the lifetimes of multiple dogs. 2075 at the earliest.
I'm more worried about commercial entities gaining power - simply by employing so many people to work on something that the alternatives cannot compete.
People who want "one" anything - I can sympathise with the desire for simplicity (as long as it's a kind that suits you) but I hope it never happens.
This is a niche concern. There are no commercial package managers that I know of, in any mainstream ecosystem. Even Gradle, which probably comes closest, is perfectly usable without the commercial add-ons.
I'm not thinking of commercial anything - just changes financed by companies that want them.
Example: PyPy getting almost no funding, but Microsoft getting its JIT into CPython.
Big money will be deciding what happens. We're already all dependent on GitHub - and they've managed to use our code to train their AI. Like a vampire offering poor travellers a bed in its castle and then sucking their blood :-D
Arguably, Conda is a commercial product that exists to serve the needs of Anaconda first and others second. Fortunately Anaconda has been a very benevolent overlord.
I have dozens of packages created since 2009 and never had issues with distutils; setuptools also worked fine for me. Even C extensions weren't a problem.
Recently I've tried the setup.py-less approach. Oh boy, what a mess "modern" Python packaging is. Multiple backends. PEP 517. pip<23 could not install your package. The `build` tool includes explicitly excluded files.
At least you can include markdown as a README for the package.
Setuptools is supposed to work with PEP 517 just fine with no changes, other than adding the build-backend declaration in pyproject.toml. Old versions of Pip will find your setup.py and use it directly, and new versions will find it via pyproject.toml and invoke it via `build`. Maybe you hit a bug? Or were doing something unusual and complicated. I never had a problem with the transition in a Setuptools project.
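If it helps anyone, the minimal migration I'm describing is something like this, added alongside the existing setup.py (the version bound is illustrative):

```
# pyproject.toml, next to the existing setup.py; nothing else has to change.
[build-system]
requires = ["setuptools>=40.8", "wheel"]
build-backend = "setuptools.build_meta"
```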
Nowadays I am usually using micromamba inside Docker. I honestly never had any issues. I need docker anyway for deployment. And mamba/conda is almost needed for scientific stuff (working with some super niche weather stuff which is only on conda).
IMO the most important thing is to stay flexible; all of them have their tradeoffs, so pick one that works and move to a different one if it becomes painful.
Docker is very very useful though. Can you imagine the joy I felt when I spun up a 6 month old very complex project and it immediately worked.
I don't know why the PyPA seems to be stubbornly ignoring Poetry, most Python devs I know have switched to it and are very unlikely to switch again to something else. It does what you need it to do, it's stable and reliable these days, and the ergonomics are good.
I kinda wish PEP-621 had gone more down the pyproject route that poetry took, their version looks so much cleaner and more readable at a glance than the PEP-621 version.
Also Poetry integrates well with pip/venv and vice-versa, so you can transition to it step-by-step if your project consists of multiple packages across a multirepo or a monorepo. You don't have to switch everything all at once.
I tried using PDM at my new work place and gave up. I had weird issues and since I had used poetry before, I switched to it. Poetry just works out of the box these days.
For packaging you need more than that. Poetry and Hatch both wrap or replace a bunch of small modular components: Pip, Venv, Build, Twine, Setuptools, Pip Compile (for lockfiles), and a task runner like Tox, Nox, or Invoke. You can (and people do) use that "stack" instead of Poetry or Hatch, if you prefer a more Unix-style approach of modularity and composability among several single-purpose tools.
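If it helps, that modular stack looks roughly like this end to end (tool choices per the list above; file names are the conventional ones, not the only option):

```
python -m venv .venv && . .venv/bin/activate      # environment (venv)
pip-compile requirements.in -o requirements.txt   # lock (pip-tools)
pip-sync requirements.txt                         # install exactly the lock (pip-tools)
python -m build                                   # build sdist + wheel (build)
twine upload dist/*                               # publish (twine)
tox                                               # run the task matrix (tox)
```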
There is nothing to maintain. I bump Python in the Dockerfile if I need to for some reason; otherwise there is no reason to change anything, and it indeed just works. It takes like 2 minutes of work once a year.
I can rebuild an image from 5 years ago and deploy it on multiple OSes and it will probably still work.
It's actually easier on Mac and Windows, because it runs Docker inside QEMU and it installs the whole toolchain in a single app bundle. I'd pay good money for a Linux equivalent, setting it all up manually is a pain.