I used Python to write a small utility that I wanted to share with a coworker, and ended up having to rewrite it in Go because I spent 3 hours trying to compile the Python version to an exe so that it would be easier to run on their machine.
I wonder why isn't there an easy way to just package the interpreter inside an EXE for people that don't care about binary size just so that it would make it seamless to package up utilities written in python.
It's kind of frustrating to deal with Python stuff as a user. I wish it didn't have to be like this, because there's a lot of interesting stuff written in Python that's nearly impossible to get to run.
There are too many environment systems (conda, venv, I think I'm forgetting one more), not all of them build or ship on all systems, then you need to manage them, and sometimes you need to have the exact correct Python version installed, because god help you if you're on 3.12 and not 3.11; and it has to be set on PATH for pip to find the correct dependencies, but you need to do that before you set up your venv and run pip, otherwise you need to tear that down and start over. Sometimes the dependencies build, and then break, because some package is missing from your system.
With luck, `uv tool install` [1] will solve most of the frustrations with Python programs on Linux/Mac/Windows,
since uv combines the functionality of pipx and pyenv and doesn't itself depend on Python.
It is a single binary that manages Python interpreters and project environments.
uv is very actively developed and seems to be rapidly gaining adoption.
It makes me optimistic about its future (and, correspondingly, pessimistic about the future of Python tooling that competes with it).
I already have a success story of helping a friend install ArchiveBox [2] with uv.
ArchiveBox required an older version of Python than their system had.
The quickest and easiest solution I could think of was:
Yes, just like this. Since 2015, when I started working professionally with Python, every year a new project shows some kind of "messiah complex" around the Python software distribution problem... They usually say in some part of the README: "This time we will be saved! We have the right tool here that does this and that".
No, it's not solved! And it probably will not be solved either.
Obviously, things are getting better; honestly, 2015 was far worse than now, but currently it's still very far from perfect.
For instance, Go static builds are much superior. Or even Rust: with all the language issues, at least the tooling is good and the software distribution works great.
I've come to realize that putting everything into a container is the only viable way to share a python program.
I'll certainly check out PEX. I think that distribution of a binary is likely the largest problem for Python right now. Java solved it with the JAR file, and most compiled languages solved it with statically compiled binaries.
At Google PAR files have been quite a useful way to handle this for at least 18 years now, hope this can get reasonably solved everywhere.
I would say that it's not just a Python problem; Node.js and Ruby seem to have almost the same problems. I'd say Docker exists because packaging C/C++ dependencies is an impossible problem that hasn't been solved in 5 decades. The build tools absolutely suck.
It isn't just the build tools. Unix, as the platform for Worse Is Better, has decided to never solve this problem. Docker solves the problem of having an underpowered, broken dynamic linker.
> because packaging C/C++ dependencies is an impossible problem that hasn't been solved in 5 decades
C/C++ build systems suck. But deploying a C++ program is much much easier than Python. And Linux is waaaay worse than Windows for both building and deploying imho.
JARs still require a JRE to run, and the runtime needs to be invoked. The equivalent would probably be a Python zipapp (which are just Python files zipped into a .pyz the interpreter can run).
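For anyone who hasn't seen a zipapp, here is a minimal sketch using the standard library's `zipapp` module. The directory name `hello_app` and the archive name `hello.pyz` are made up for illustration, and the target machine still needs a Python interpreter, same as a JAR needs a JRE.

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path


def build_and_run_zipapp(workdir: Path) -> str:
    """Build a tiny .pyz from a source directory and run it with this interpreter."""
    source = workdir / "hello_app"  # hypothetical application directory
    source.mkdir()
    # A zipapp needs a __main__.py at its root; it is the entry point.
    (source / "__main__.py").write_text(
        'print("hello from a zipapp")\n', encoding="utf-8"
    )

    archive = workdir / "hello.pyz"
    # The interpreter argument only adds a shebang line so the file can be
    # executed directly on POSIX systems; it does not bundle Python.
    zipapp.create_archive(source, target=archive, interpreter="/usr/bin/env python3")

    result = subprocess.run(
        [sys.executable, str(archive)], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as tmp:
    output = build_and_run_zipapp(Path(tmp))
print(output)
```

The same archive can be produced from the command line with `python -m zipapp`, but third-party dependencies have to be copied into the source directory first, which is the part tools like pex and shiv automate.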
Static binaries are one advantage that languages like Go and Rust have though, yeah.
I also end up avoiding Python, not because it's a bad language, but because it's so much more convenient to work with compiled artifacts than to worry about shipping around a bunch of source or object files and a full compiler. Not to mention the speed of using Go and the ability to write concurrent code without a lot of effort.
The only time I really use python is when I need a quick and dirty script.
But yea I agree the problem with python is there isn’t a tool like deno that just does everything that has to do with packaging, locking, venvs, and building a cross platform binary.
As of a few months ago, pex supports including an interpreter, either directly inline or lazily downloaded (and cached): https://docs.pex-tool.org/scie.html
It's probably on the list that you've tried, but I've had most luck with Nuitka (in standalone mode, not onefile). Unlike pyinstaller, it really is the executable it claims to be, not just an elaborate zip file containing python.exe and your source files.
That often happens when a program is just a self-extractor that puts the real executable in a temp directory and runs that. That's how most Python packagers work (including PyInstaller), but Nuitka only does that in "onefile" mode (it self-extracts the files that make up "standalone" mode).
In "standalone" mode, the executable really is just a normal program. It's actually the various fragments of the Python bytecode interpreter unrolled according to the Python bytecode of your program and the packages it uses. All the C extension modules are shared libraries in the same directory (this means you can use LGPL C extension modules like PyQt BTW) and any package data files are alongside in that directory too. This makes it seem a bit messy but in reality it's cleaner than what onefile does under the hood.
(This is what I was alluding to in my parent comment but I didn't explain myself.)
No it's not, in standalone mode. Even in onefile mode, which is what you're referring to, the contents on the inside are the same as standalone mode, so it's still more than just Python and your source files. It does include the Python interpreter as a library so it can handle calls to `exec()`, but almost all code is unrolled to C and compiled directly.
I've always been curious why Nim never took off as the solution for use cases like this. It combines the simplicity and clarity of Python-like syntax with the performance of a natively compiled language, which looks like the best of both worlds.
3rd party libraries don't spring from nowhere. And no language starts having one in abundance. People have to be motivated enough to write all those libraries in the first place and a lot of them are written just to use python syntax over C Code.
I'm not saying it's the only reason to choose python now but it's definitely among the biggest reasons.
Why not just call the C code you've already written in C ? Because they would rather use python (or python like) syntax.
I don't think we actually disagree here. Even your point about the better C-API doesn't indicate that syntax wasn't a deciding factor, just that one of several options had better compatibility.
My friend is building a tool to do something like this using the actually portable Python from cosmopolitan python: https://github.com/metaist/cosmofy
You run one command and it generates a single executable that can run on Mac, Linux, and Windows alike. Pretty nice for just deploying simple Python scripts.
Seems like I have it in my browsing history. I remember not being able to run the executable it produced, because it couldn't find a library it was supposed to have. This was just using the default "pyinstaller your_program.py", and I was frustrated enough to not go deeper into why that was. Will definitely give it a try again in the future.
PyInstaller-made executables also used to have a habit of getting flagged by security software as malicious (maybe that's why you couldn't run it?) -- apparently, so many malware writers used it that it ruined the party for everyone.
Fortunately, that was only the 32-bit version of Python 2.7. Using 64-bit versions or Python 3 was enough to not get flagged as malicious. I figured that out when I decided I didn't want to teach myself Go just then to deploy something that had worked the day before.
Using the spec files for persistent, readable configuration also goes a long way. If you treat PyInstaller as a Python module, you can automate the whole thing with just Python, including the spec files, since it executes them as Python scripts.
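A rough sketch of what driving PyInstaller from Python can look like, assuming PyInstaller is installed in the build environment. `PyInstaller.__main__.run` takes the same arguments as the command line; the helper names `pyinstaller_args` and `build` are made up for this example.

```python
def pyinstaller_args(script, name, onefile=True):
    """Assemble the argument list that would otherwise be typed on the command line."""
    args = [script, "--name", name, "--noconfirm"]
    if onefile:
        # Bundle everything into a single self-extracting executable.
        args.append("--onefile")
    return args


def build(script, name, onefile=True):
    """Run PyInstaller in-process with the assembled arguments."""
    # Imported lazily so the helper above works even where PyInstaller
    # is not installed (for example on a machine that only reviews the config).
    import PyInstaller.__main__

    PyInstaller.__main__.run(pyinstaller_args(script, name, onefile))
```

Calling `build("your_program.py", "your_program")` from a build script would then be roughly equivalent to running `pyinstaller your_program.py --name your_program --noconfirm --onefile` by hand.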
Yep I used it at my last job and it worked great! Startup times were horrible, but that didn't matter so much to us and it solved tons of problems we had with people messing up their python environments. Takes some tweaking to get certain modules (like scikit-rf) to work, but never found an issue that couldn't be solved.
Can you provide any more specifics? I've done the same and had good luck with Rust across most Linux targets, but found Go easier for musl, armv7, and FreeBSD.
It's been around for years now, is super battle tested, and user tooling continues to get easier. As a bonus, it works for not only python but other tooling as well.
Depends on which part the coworker needed. If it was passing around a reproducible environment, sure, Docker works. But if they needed "here's a thing I can double-click and it'll just work" then Docker has no real advantages.
I'm always on the lookout for ways of packaging Python programs/script-piles into single-file executables, so thank you for posting this!
The GitHub README says it builds on the Python .ZIP App format (PEP441), which in turn says you can put your app in a .ZIP file and Python will run it as an app.
I think I went this route with PyInstaller's one-file output option. But I found it to be too slow because (1) my app was a CLI app and I needed it to start, run, and end quickly, (2) I imported something from Django because it was useful (which ballooned the size), and (3) the single-file .ZIP needed to be extracted into a temporary directory, run, and then deleted every time (!!!).
Does anyone know how Pex / PEP441 deals with this? Is the extract-run-delete thing normal/standard? Is there a way to cache the extracted app?
My guess is that without operating system support or equivalent, there is not an easy way to avoid extracting the zip file. (You could patch syscalls in the Python interpreter to read from the zip file for certain paths, but that might be hacky.)
I did find https://github.com/google/mount-zip which might be useful, but you would likely have to still mount the zip manually. However, you don’t have to worry as much about cleaning up after yourself.
You might find interesting looking at XAR[0] which works by mounting a squashfs filesystem of the "archive", instead of using a zip file. The squashfs is mounted/unmounted "lazily" meaning if you run it a few times in a row it will only mount it one time.
I think the easiest way to do this now is with uv and their support for inline metadata. At this point you can just have the user install uv with a single command and then have the user run the script with `uv run example.py`.
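For reference, inline metadata is the comment block standardized in PEP 723. Below is a sketch that embeds an example script as a string and pulls the metadata block out of it, using the regular expression from the PEP's reference implementation. The dependency `requests<3` and the URL are only placeholders, and `extract_script_metadata` is a name invented for this example, not part of uv.

```python
import re
from typing import Optional

# Regular expression from the PEP 723 reference implementation.
REGEX = r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"

# What a script with inline metadata looks like (kept as a string here,
# so nothing in it is imported or executed by this example).
EXAMPLE_SCRIPT = '''\
# /// script
# requires-python = ">=3.11"
# dependencies = [
#     "requests<3",
# ]
# ///
import requests

print(requests.get("https://example.org").status_code)
'''


def extract_script_metadata(source: str) -> Optional[str]:
    """Return the TOML text of the `script` metadata block, or None if absent."""
    for match in re.finditer(REGEX, source):
        if match.group("type") != "script":
            continue
        # Strip the leading "# " (or bare "#") from every metadata line.
        return "".join(
            line[2:] if line.startswith("# ") else line[1:]
            for line in match.group("content").splitlines(keepends=True)
        )
    return None


metadata = extract_script_metadata(EXAMPLE_SCRIPT)
print(metadata)
```

A runner that understands this block can read the declared Python version and dependencies and prepare a matching environment before executing the file, which is why the user only needs the single script.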
I know it's not very nice to post that here, but given it's hard to get good info about python because the ecosystem is so huge, I'd rather give it straight:
Pex doesn't work on Windows and is slower than shiv, and shiv solves the resource files problem that you will encounter with dependencies like Django. To achieve this, shiv unzips the whole zipapp into a cache on disk and creates a transparent proxy to it.
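To make the "unzip once into a cache" idea concrete, here is a minimal sketch of the general technique. It is not shiv's or pex's actual implementation; the digest-based directory naming and the function names are assumptions chosen for the example.

```python
import hashlib
import shutil
import tempfile
import zipfile
from pathlib import Path


def archive_digest(archive: Path) -> str:
    """Return the SHA-256 hex digest of the archive's bytes."""
    return hashlib.sha256(archive.read_bytes()).hexdigest()


def ensure_extracted(archive: Path, cache_root: Path):
    """Extract `archive` into a digest-named cache directory, at most once.

    Returns (directory, extracted_now).
    """
    target = cache_root / archive_digest(archive)[:16]
    if target.is_dir():
        return target, False  # cache hit: nothing to unpack

    # Unpack into a staging directory first, then rename, so an interrupted
    # run never leaves a half-populated cache entry behind.
    staging = cache_root / (target.name + ".partial")
    shutil.rmtree(staging, ignore_errors=True)
    staging.mkdir(parents=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(staging)
    staging.rename(target)
    return target, True


# Demo with a throwaway archive in a temporary directory.
workdir = Path(tempfile.mkdtemp())
demo_archive = workdir / "app.pyz"
with zipfile.ZipFile(demo_archive, "w") as zf:
    zf.writestr("__main__.py", 'print("hi")\n')

first_dir, first_extracted = ensure_extracted(demo_archive, workdir / "cache")
second_dir, second_extracted = ensure_extracted(demo_archive, workdir / "cache")
print(first_extracted, second_extracted, first_dir == second_dir)
```

Because the cache key comes from the archive contents, a rebuilt archive gets a fresh directory, while repeated runs of the same archive skip the extraction cost entirely.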
Still, I wish zipapps had better tooling honestly. Using either pex or shiv is more complex than it needs to be.
I hope uv will eventually give you something like "uv zipapp" and make it good and simple.
Right now, to bundle all deps for shiv you first have to download all the wheels, then figure out the proper incantation to include them. And if you use WSGI, you probably want some kind of dedicated entry point.
The tech is cool, but it deserves maturing to be a true ".war for python".
FWIW, pex now also has options to unzip the archive to a cache directory on startup (I believe this happens by default now, but am not at a computer to confirm), to side step the zipapp limitations that you reference.
While we're posting not-nice things, I'll throw my current favorite out: pipx ( https://github.com/pypa/pipx ). I use Python's Poetry to build a wheel and then pipx to install it. Super-easy, barely an inconvenience :)
I'm not a pipx expert (I'm a "I got it to work for my use case and then stopped" person :) ) but I was able to get my project working after installing a local wheel file (so i don't think you need access to the Internet to install things).
I thought the wheel file has all the dependencies since (my understanding of) pipx uses a venv for each individual wheel file/app. Certainly I'm using a bunch of packages that I needed to pip install into my development environment, but it's entirely possible that my pipx-managed wheels are using those installs, too.
I think the last two points are accurate, and I've never tried to use APIs like wsgi with it.
Also pipx isn't reproducible - it re-resolves dependencies so you may end up with different versions over time or in different places, eventually causing something to break.
If you have a shiv working it stays working, assuming you have a solution to distribute the required interpreter version.
In the early 2000’s open source hubris, next year was always the year of the Linux desktop. Since then consumer windows matured into a totally ok OS and one with the best support for graphics and C++ development at that.
Not supporting Windows nowadays is a fairly strong signal of a non-serious software offering if there is no obvious reason for it. And that’s totally fine, hobby tools developed by enthusiasts rock - but they are not industrial in scope as such.
A lot of serious software offerings are only concerned with the server use case, modern servers run Linux unless there's a good reason not to, and modern windows has more than one acceptable way to run Linux binaries if you absolutely have to.
> modern servers run Linux unless there's a good reason not to
I think many people on HN would be surprised at how many orgs are using Windows servers heavily, because of their familiarity and comfort with Microsoft, or because some application requires it.
Of course they are non-tech companies using the servers internally for enterprise applications, not web servers, but there is absolutely a lot of windows server usage in corporate environments.
> I think many people on HN would be surprised at how many orgs are using Windows servers heavily, because of their familiarity and comfort with Microsoft, or because some application requires it.
I think most HN readers are well aware that there are a lot of Windows servers out there, especially in the sorts of environments where it's "The Server".
That doesn't change the fact that there are orders of magnitude more Linux servers in the world, and as the post you replied to said Linux is the default assumption. Basically every container and the vast majority of VM guests are Linux. I'd be willing to bet that more Linux servers have been deployed in the time it took me to type this post than Windows servers will be deployed this week.
I'm not trying to make the point that there is a comparable number of Windows servers to Linux servers, and I believe that going off "number of servers" is of limited usefulness. There are of course going to be far more Linux servers because they are far easier to provision, and the people managing them are generally going to be much more apt at orchestrating large fleets of small servers than the people managing Windows servers, which will tend towards larger servers with multiple purposes.
I'm simply trying to say that Linux is not the default assumption in many contexts where a lot of money gets spent on licenses and services for server software.
There are of course large contexts (tech companies and web servers) where Linux is the default assumption.
It would today, because macos requires signing, and windows has "the mark of the web".
They don't apply to scripts, so Python is fine, but they do apply to executables.
Having to sign your zip every time you produce it would nullify the benefit of the concept. And because the checksum of the executable file changes according to the content of the zip, you can't just sign the cosmopolitan part.
- The underlying Python code cannot use open() to access its own resources, but must use pkg_resources, and most packages on PyPI don't do that because most devs don't even know it exists or what problem it solves.
- `__file__` is not available, breaking a lot of things. zip_safe is a build option, but actual compliance with it is hard work and rare.
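A small sketch of the resource problem, using the standard library's `importlib.resources` as the loader-aware way to read package data (my substitution for illustration; the comment above names pkg_resources). The package name `demo_pkg` and the file `greeting.txt` are invented, and the exact error raised by the naive approach depends on the platform.

```python
import importlib
import importlib.resources
import os
import sys
import tempfile
import zipfile


def build_demo_zip(directory):
    """Create a zip holding a tiny package with one data file."""
    path = os.path.join(directory, "bundle.zip")
    with zipfile.ZipFile(path, "w") as zf:
        zf.writestr("demo_pkg/__init__.py", "")
        zf.writestr("demo_pkg/greeting.txt", "hello from inside the zip")
    return path


def read_with_open(package):
    """Naive approach: build a filesystem path next to the package and open() it."""
    resource = os.path.join(os.path.dirname(package.__file__), "greeting.txt")
    try:
        with open(resource, encoding="utf-8") as handle:
            return handle.read()
    except OSError:
        # The path points inside the zip file, so the OS cannot open it.
        return None


def read_with_importlib(package_name):
    """Loader-aware approach: works whether the package is on disk or in a zip."""
    resource = importlib.resources.files(package_name).joinpath("greeting.txt")
    return resource.read_text(encoding="utf-8")


# The temporary directory is left in place because the zip stays on sys.path.
workdir = tempfile.mkdtemp()
sys.path.insert(0, build_demo_zip(workdir))
importlib.invalidate_caches()
demo_pkg = importlib.import_module("demo_pkg")

open_result = read_with_open(demo_pkg)
resource_result = read_with_importlib("demo_pkg")
print(open_result, resource_result)
```

Any dependency that reads its data files the first way breaks when imported straight from a zip, which is why tools end up extracting the archive to disk instead.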
I was thinking more along the lines of using namespaces and FUSE. I'm aware that Python kinda treats them like folders transparently, but not completely for reasons (and more) that you described.
A 2016 comparison by the developer of WinFsp says WinFsp is several times faster in performance tests [1]. According to the readme for Dokany [2], Dokany has gotten faster since then.
I can see how .pex or shiv's zipapps live at the Pareto threshold where converting a Python app to an 80% (or less...) executable takes only 20% (or less!) of the effort that would be spent re-writing the app in eg C, C++, Rust, or Golang. Still, it breaks my heart to see all this work being done to keep developers employed as "Python developers", instead of leveling them up to a broader programming skillset. It seems as if the industry is investing rather heavily in a local maximum.
But, of course, the market has information that I don't.
> Still, it breaks my heart to see all this work being done to keep developers employed as "Python developers", instead of leveling them up to a broader programming skillset. It seems as if the industry is investing rather heavily in a local maximum.
this is a topic near to my heart; I have spent the last several years building my career around doing this sort of work (specifically writing tools to expand the capabilities of python code). the key idea is not that we "keep people employed as python developers", it is that python (you can also substitute ruby, javascript, php, or any other popular dynamic language) is a genuinely powerful and expressive language from the point of view of converting ideas to code, and having that code be readable, maintainable, and extensible by other humans.
where it falls down is on the "machine facing" end - you don't get the guarantees of static typing, the import system is a bit of a mess, it isn't as fast as lower level, statically-compiled-to-native-code languages, and (as the current discussion shows) packaging and distributing an end user binary is somewhere between a morass and a nightmare. however, none of these things are a problem with the language per se, and they can be fixed with some combination of improving the implementation, writing better tooling (cf valgrind from the c world), making it easier to write small bits of critical code in c/rust/zig/etc, and improving language features while not sacrificing what draws people to use python in the first place.
is python a local maximum? almost certainly; it would be foolish to think that nothing better could come along, and indeed pyret already looks distinctly better in terms of pure dynamic language design. but the answer to that is not to "level up" people who are happy and productive using python to some language like rust or c++ that imposes a heavier cognitive burden on the user to take care of low level details that their applications might not need, or one like golang that pushes the complexity out into the application where problems need to be solved via design patterns that people need to learn and copy, rather than just building them into the language or stdlib.
Thanks for the thoughtful response. I can appreciate that python "is a genuinely powerful and expressive language", and certainly no less so than rust or c++.
> none of these things are a problem with the language per se, and they can be fixed with some combination of improving the implementation, writing better tooling
As a JavaScript developer, I worked through a decade of this thinking, and it produced an oft-ridiculed morass from which we are still emerging. Or do I have that backwards? Were we too satisfied with our incrementally improved tooling all along the way?
> c++... imposes a heavier cognitive burden on the user to take care of low level details that their applications might not need; golang... pushes the complexity out into the application where problems need to be solved via design patterns that people need to learn and copy, rather than just building them into the language or stdlib.
These are intriguing critiques of the languages, and it follows from them that to package an executable, a Pythonista must either "take care of low level details," or "learn and copy... design patterns", all the while sacrificing the qualities of "readable, maintainable, and extensible by other humans". Steep costs, indeed! Arguably enough to justify the overhead of pretty much any Python portability hack.
I don't wholeheartedly agree, but I respect the logic.
It is often the case that the nifty Python thing you want to pass around uses one or more nifty Python libraries that have no C/C++/Rust/Golang equivalent (or no obvious equivalent), and so rewriting it becomes a herculean task.
Eh I wouldn't call PEX proper executables that'd replace the languages above as they still need python, but I still prefer python for things like simple Qt apps even if it means using e.g. PyInstaller
We've been using pex for a long while to package PySpark jobs with dependencies into a single file. Saves a ton of time vs. the old way of building/deploying docker images.
That's fair. Pex does support specifying a target platform though, which can package linux binary dependencies on a Mac for example. That being said, if a native dependency doesn't provide binaries (i.e. source only) we're out of luck. We solve this issue by precompiling and publishing those dependencies on a private package index. Thankfully there are only a handful of dependencies that need this.
We have been using pex to deploy code to running docker containers in ECS to avoid the cold start delay. Cuts down the iteration loop time for development significantly.
I just put everything inside a docker container. But those 9GB containers when you use pytorch and Nvidia cuda libraries. Has anyone tried using this for that type of packaging?
I thought this was going to be a post about plumbing and then I read further and was even more excited to say the phrase "python executable". Very interesting. Thanks!