I wrote this post: the title should be "Patching GCC to build Actually Portable Executables", because it refers to Cosmopolitan Libc and jart's Actually Portable Executable format.
With my gcc patch, you can now build software like vim, emacs, ninja, bash, git, gcc, etc. with Cosmopolitan Libc, via their usual autotools/cmake-style build systems. The built executables should run on Linux, FreeBSD, macOS, OpenBSD, NetBSD, and also Windows (although I haven't tested on Windows yet).
Will this patch of yours make it into GCC upstream? APE executables and cosmopolitan libc are incredibly cool tech, and it'd be great if they were easier to use.
Reading the GCC source code was fun and educational, and I'd love to find out if this patch could make it into GCC. I haven't discussed it with any of the GCC developers yet.
In the meantime, the Cosmopolitan Libc monorepo contains binaries of gcc-11 and binutils that have been built with my patch; you can use those for now :)
I haven't used Nim much, but I remember a repo on Github had set up a build script for compiling Nim with Cosmopolitan Libc. https://github.com/Yardanico/cosmonim
This gcc patch makes such build scripts simpler, because you will need to change less of your code -- let me know how it works!
Yes, I have gotten busybox to build successfully, but at present it's not as nice as bash or vim. I expect someone will write a better build script soon enough. Maybe it will be you!
The default build parameters in busybox make a lot of OS-related assumptions, so I'd recommend you use make menuconfig instead of make defconfig. In the TUI, you can disable all console and networking utilities, Linux modprobe utilities, some Linux system utilities, free, uptime, etc. Start with a minimal config and then slowly add things.
In terms of source code changes: modify `u_signal_names.c` to use a switch statement instead of the amazing "use SIGHUP as an array index" method it follows, use int32_t instead of the "smallint" type that busybox seems to prefer, and IIRC there's an enum in there somewhere that should be rewritten as a #define.
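Roughly, the first change looks like this — a sketch of the kind of rewrite I mean (the function and string names here are made up for illustration, not busybox's actual code):

```c
#include <signal.h>
#include <stddef.h>

/* Instead of a table initialized with [SIGHUP] = "HUP", ... (which
 * requires the signal macros to be compile-time constants), use a
 * switch -- my gcc patch knows how to lower a switch like this into
 * if-else comparisons that read the values at runtime. */
static const char *signal_name(int sig)
{
    switch (sig) {
    case SIGHUP:  return "HUP";
    case SIGINT:  return "INT";
    case SIGQUIT: return "QUIT";
    case SIGILL:  return "ILL";
    default:      return NULL;
    }
}
```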
That should get you pretty close to a successful build.
I'm the Lobsters admin and I enforced this ban. You misread the modlog: only jart's invitee i2 was sockpuppeting. As jart is here claiming that this is all mistaken sockpuppeting, let me explain what happened.
Lobsters has had serious problems with users who try to exploit the site for self-promotion, and it's a more serious problem for us than for HN because a few votes are enough to move something onto our homepage. As part of addressing it, we have a bright-line rule prohibiting new users from submitting stories from new, unseen domains in their first 70 days on the site.
Five hours after jart invited woodrush, at 2022-01-12 02:26, they tried to violate the rule and got this error message:
> woodrush.github.io is an unseen domain from a new user. We restrict this to discourage self-promotion and give you time to learn about topicality. Skirting this with a URL shortener or tweet or something will probably earn a ban.
At 2022-01-12 02:40, jart submitted the exact same link to break the restriction, and woodrush submitted several posts from that domain. After that, jart started submitting stories from all of her invitees to circumvent this rule, which anyone can verify by comparing timestamps on her submitted stories at https://lobste.rs/newest/jart to creation dates for her invitees at https://lobste.rs/u#jart.
On top of repeatedly breaking a rule that has a big red background and explicitly warns of a ban, two of jart's invitees were manually warned and banned for other inappropriate self-promotion. Finally, when another of her invitees started obvious sockpuppeting, I looked at the invite tree, saw this pattern of abuse, and cleaned up after it.
To rebut another claim repeated a couple times elsewhere in the comment tree, all banned users get an email notifying them of a ban with the Reply-To header set to my email: https://github.com/lobsters/lobsters/blob/master/app/mailers... None of the users banned in this action have clicked reply to explain how I've made a mistake enforcing an unambiguous rule of ours.
jart is a sociopathic manipulator and abuser. They recently stole code for llama.cpp and claimed it was theirs, were unapologetic when caught, and tried to gaslight the llama.cpp creator into feeling guilty and reinstating their access. Prior to that they were trying to install themselves as an important and powerful member of that community. https://news.ycombinator.com/context?id=36245385 and https://news.ycombinator.com/threads?id=IAmNotACellist#36251...
I took a peek at the moderation log and the banning of jart and their descendants was a real shitshow[1]. I’m sure a lot of legit folks lost their accounts, and most of the descendants weren’t even posting anything related to APE or jart’s other projects.
It's because I'm an insomniac. When normies are sleeping, I'm running around setting up franchises. One of my other sock puppet accounts that got banned by Lobsters is Paul Kulchenko, who isn't actually the genius who built ZeroBrane Studio https://studio.zerobrane.com/, but is just me pretending to be a Windows developer when I can't fall asleep. Lobsters also banned Hikaru Ikuta (woodrush), who isn't actually a PhD at the University of Tokyo who discovered the lambda calculus expression for LISP https://woodrush.github.io/blog/lambdalisp.html, he's clearly just another one of my sockpuppet accounts I created due to being an insomniac. Those are the sorts of stories we've been sharing with the Lobsters community, and we'd been received very positively until now. I'm disheartened by how the recent unexpected actions of this rogue moderator will impact the wonderful community that had been nothing but kind to us.
This is a terrible misrepresentation of the mod logs. Your comment seems to be creating a strawman. The accounts you named in your comment were not banned for being sockpuppets; they were banned for spamming (whatever that means). Totally different things.
Nobody said they were sockpuppets. Only you are saying that in an attempt to create a strawman and mislead unsuspecting readers. We are not idiots.
The two accounts that were banned for sock-puppeteering were 'i2' and 'thatworkshop'. And you have conveniently omitted these two accounts from your comment. What do you have to say about them? You invited them. And they were sock-puppetting? How do you explain that?
i2 got banned from my community a long time ago so it doesn't surprise me that Lobsters would ban him too. I usually err on the side of trusting people, and sometimes that turns out to be a mistake. I take appropriate action when I need to. If anything I'm surprised it took Lobsters this long. Lobsters gave me no ability to rescind invites, so they shouldn't have held me and my friends responsible for what he did. Frankly I don't even know what that is. Lobsters also made no attempt to contact me or offer feedback. The mass ban came as a total surprise. I have no idea what I did and I wasn't granted any opportunity to make them happy.
I'm no fan of Lobsters, so I've no skin in the game. But it is one thing to say that those people you wrote about in your comment were banned for spamming and it is a totally different thing to say that they were banned for sockpuppetting (when they were not). The former is a correct representation of what's there in the mod logs. The latter is a blatant misrepresentation of the mod logs, and misleading to unsuspecting readers.
After seeing such wilful blatant misrepresentation, forgive me if I'm having difficulty just taking your word for it.
I’m left speechless after reading the first article, as well as seeing the misrepresentation of the comment thread (she was replying to me, not blueflow…)
I must say, I respect her a good deal less than I used to.
> But it is one thing to say that those people you wrote about in your comment were banned for spamming and it is a totally different thing to say that they were banned for sockpuppetting (when they were not).
How can I be legitimately banned for spamming when I made zero comments or posts? I rarely post or comment, so I didn't get a chance, but I was glad to join the community.
I don't know the specifics, but it looks very much like collateral damage. Sorry if that's the case. It looks like her whole invite tree became suspect, and the mod removed the whole tree. Not justifying what they did, but it seems like a mistake. Talking to the mod should resolve it.
This is not the case. (For context, I am the admin responsible for the bans.) I left a longer comment in response to the original misunderstanding to explain what happened here.
Sockpuppets or voting rings may not be posting related posts/comments at the moment, but they could in the future. And they could be silently +1-ing each other's posts/comments, which you can't see but mods can.
I've done moderating work back when phpbb forums had their heyday. Clever sockpuppet accounts or voting rings don't start posting about the projects of their main account right from the first day. The clever ones remain dormant, mostly idle, participating in other discussions. They look just like normal accounts.
But much, much later, when the main account posts a "show project" post, the sockpuppets or voting rings add +1 or post a few words of praise, and then move on to commenting on and +1-ing other posts.
And voting rings become more difficult to detect when they don't post anything related, post unrelated stuff but quietly upvote the main account's posts/comments.
Only 2 of all the banned accounts have been flagged as sock-puppeteers. You can't see the sockpuppetry anymore because the mod deletes their posts/comments. But the evidence is still there as negative points. Those profiles have negative points, so they must have posted something at some point, got downvoted, got flagged by users, and got detected as sockpuppets.
Sockpuppets are only one possibility. Another possibility is legit accounts forming a voting ring. You and I can't see voting actions. Mods can.
The mods have access to server access logs. Mods can see who +1-ed whose posts/comments. So if you need evidence for sock-puppetry or voting-ringery, try asking the mods about it. We can only speculate here. I'm only describing what I've found in the wild from my days of moderating (small) forums.
I mean to say that just because you can't see the other accounts posting related posts/comments does not make them any more or less likely to be sockpuppets or a voting ring. They may be legit, or they may be a ring. Only the mods with the access logs can tell.
Yeah I found out I got banned when attempting to comment about this blog post. I rarely comment on online forums anyway, because I find I don't know enough to comment about most topics.
IIRC over the last year, I've only commented once on lobste.rs, when it came to something about Python's import system, and only upvoted my own blog posts about Cosmopolitan Libc and related comments + maybe one or two other posts. I disagree with the ban, but I can see why my account would get banned as a spammer (one of the accounts banned alongside mine had a double account I think).
lobste.rs is a wonderful community, and I'm happy to see that someone there at least mentioned this blog post. Hopefully I can get un-banned at some point.
Right now I use the gcc, git, vim, and ninja APE binaries on my local system for daily work, and so far I have seen no visible regressions.
I agree with you though -- an extensive testing setup will reveal more things to improve, be it in my patch or in the libc itself. At the moment I only have direct access to Debian 11/Fedora 35/FreeBSD 13, but I expect we will soon find a nice way to run tests across the different operating systems automatically. One of my ideas was to set up a separate drive or folder with all the executables, accessible from the different operating systems/VMs, and then run the tests from each. What do you think?
Might need to update the build scripts a little to handle the latest updates in Rust and Cosmo (we'll need gcc-11 now), but I expect to get Rust working again soon enough.
I suppose this is all quite neat, but I wonder about the theory behind it. If you're building a new OS target, the logical thing to do is to define new constants. Rather than passing through each underlying OS's constants, you would translate them to a single set that is the same everywhere. Then, use of constants as array indexes and in case statements would work normally. This might incur more overhead than the approach described in the article, but it seems a great deal easier, and simpler to use for whoever decides to compile a new program against cosmopolitan libc.
New constants do make compilation a lot easier, but my personal opinion is that the overhead to convert to/from the old constants during runtime is too much.
Every time I used any of these constants, I'd have to load a whole bunch of them into my binary as a large lookup table, and go through that table every time I needed a check in my program. It might not be that slow, but I believe it would definitely be noticeable.
My goal was to make porting easier without changing a lot of source code in either the libc or in the software I was trying to port, and still produce binaries that are close or better in performance. Under those constraints, this gcc patch seemed like the best way to simplify the process.
If I run into enough codebases where SIGHUP is used as an array index initializer, I will probably attempt your suggestion just to measure the tradeoffs. Or you could try it out and let me know if a separate set of constants is better.
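For concreteness, here's a minimal sketch of that alternative as I understand it -- the COSMO_* names and cosmo_map_errno are hypothetical, not real Cosmopolitan APIs:

```c
#include <errno.h>

/* Hypothetical platform-neutral constants, fixed at compile time: */
enum { COSMO_EINVAL = 1, COSMO_ENOENT = 2, COSMO_EAGAIN = 3 };

/* Translate the running OS's errno value into the neutral constant.
 * Under Cosmopolitan, EINVAL etc. are only known at runtime, so this
 * can't be a constant-initialized table; it has to be a chain of
 * compares or a table built at startup -- and every errno check in
 * the program pays for a lookup like this. */
static int cosmo_map_errno(int err)
{
    if (err == EINVAL) return COSMO_EINVAL;
    if (err == ENOENT) return COSMO_ENOENT;
    if (err == EAGAIN) return COSMO_EAGAIN;
    return 0;
}

/* Callers could then switch on stable values:
 *     switch (cosmo_map_errno(errno)) { case COSMO_EINVAL: ... }
 */
```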
You can build software like vim, emacs, ninja, bash, git, gcc etc -- here's a list of software I got to build with this technique: https://github.com/ahgamut/superconfigure
The superconfigure script is just a wrapper around the usual configure script used to build your software, supplying flags like --enable-static.
I use an APE executable as an agent for communicating with remote hosts in the Logfile Navigator (https://lnav.org). While lnav itself is not built as an APE, the agent built into it is. That agent is transferred to the remote host when the user wants to read logs there. This way, there is no extra step to determine the OS type, and no need to build multiple versions of the executable. Here's a short blog post on the subject:
My biggest problem with APE: not portable enough, in the sense that when people adopt APE, they lock themselves into less portability than they might have supported otherwise. I live in an ARM world and because of APE you only offer Intel x64 binaries. All new Macs are ARM64, Raspberry Pi models are a mix of ARM32 and ARM64 (mine is ARM64), and I use exclusively Graviton instances in AWS. Running x64 binaries with QEMU/Rosetta emulation on an ARM system is a cute trick but not something I want to do for a piece of infrastructure. Raspberry Pis are barely fast enough to run native binaries. Consider offering APE for x64 but then still producing ARM binaries the old fashioned way. APE is pushing the idea that x64 is a good bytecode in a world that is moving from x64 to ARM. I'd like to see the project accept that ARM exists and produce an ARM version of APE, maybe with some jart flourish--can we make a cross-platform and cross-architecture fat binary?
> Consider offering APE for x64 but then still producing ARM binaries the old fashioned way.
Recent versions of cosmopolitan generate ARM binaries for Linux and MacOS (https://github.com/jart/cosmopolitan#arm; mode aarch64). There is also blink, which provides an x86-64 emulation layer for (APE and other) binaries on a variety of platforms (https://github.com/jart/blink).
I didn't know about that, thanks! It gives me hope that the final goal (fat binary, native ARM and native x64, all platforms on both architectures) is within reach. It looks like there are still some hiccups (the APE loader on M1) but they are much further along than I thought they were. Excellent!
Speaking of fat binaries: @jart posted this a day ago in the #ape channel on the project discord:
Just created my first ever fat ape executable. I ran `apelink.com -o fat-ape-binary.com o/tiny/examples/hello2.com.dbg o/aarch64-tiny/examples/hello2.com.dbg` to create an executable that runs on Linux+OpenBSD+NetBSD+FreeBSD on x86_64+aarch64.
> APE is pushing the idea that x64 is a good bytecode
And x86-64 is a particularly bad bytecode, for one main reason: its strong memory ordering rules, compared to nearly every other architecture. IIRC, simple things like a store in x86-64 carry an implicit release barrier, while other architectures can freely reorder stores unless there's an explicit barrier. This means that an x86-64 emulator on, for instance, ARM has to either be single-threaded (and would still have issues with shared mmap regions), rely on special hardware support for x86-compatible memory ordering (as found on recent Apple ARM CPUs, and used for their x86-64 emulation, but not common elsewhere), or add explicit barriers everywhere (which kills performance).
The only real advantage of x86-64 as a bytecode would be that it has fewer registers than most other architectures (only 16, while other 64-bit architectures usually have around 32), which allows a 1:1 mapping of the registers in the emulator while still leaving plenty of them free for temporary use.
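A quick sketch of the ordering point with C11 atomics (my summary, assuming I have the memory models right):

```c
#include <stdatomic.h>

int data;
atomic_int ready;

void producer(void)
{
    data = 42;
    /* On x86-64 this release store is just a plain MOV: every x86
     * store already has release semantics. On AArch64 the compiler
     * must emit a special store (stlr) or an explicit barrier. */
    atomic_store_explicit(&ready, 1, memory_order_release);
}

int consumer(void)
{
    /* An emulator translating x86-64 code to ARM can't tell which
     * plain stores are used for synchronization like this one, so it
     * has to put barriers around all of them (or stay single-threaded). */
    while (!atomic_load_explicit(&ready, memory_order_acquire))
        ;
    return data;
}
```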
Java/Clojure/Kotlin programmers. .Net programmers. JS, Python, Ruby, Erlang -- programmers of every VM-based language. Statically compiled programs are under the hood of everything, but at the application level they're in the minority.
APE is a big “Who needs a VM if an AOT compiled language as low level as C can compile once and run universally?”
Nah, terminal colors via escape sequences work fine on Windows, even on the "traditional" cmd.exe. I use that all the time in cross-platform Python and Typescript cmdline tools (cmd.exe is probably limited to 16 colors though, but that's just how it is).
(also Windows is probably the only non-POSIX OS that matters - and APE is obviously only portable to systems it specifically supports, it can't support systems it doesn't even know about, no matter if it is a "POSIX system" or not).
I can imagine it making it slightly easier to distribute executables from e.g. maven repositories (where having multiple artifacts for the same id/version combination is cumbersome). I think PyPI might have a similar issue?
Also cross-compiling is painful; a lot of small OSS projects that don't have a build farm with all of the different BSDs etc. to hand simply don't ship binaries for those. Even e.g. kubectl has a binary for Linux but not for any of the BSDs (unless you count OSX). So this seems like an easy win for those.
I'm trying to release future side projects that way. I even build a Redbean executable for any client-only SPAs. The idea of this stuff just working on so many systems is amazing.
Problem with this is that self-modifying code will likely trip security blocks on most platforms, which is probably more annoying for users than having to download separate executables.
If the person needs to run with "security blocks" up, it's statistically unlikely that the same person can randomly download the specific and potentially outdated version of a dependency you need for the code to run in the first place.
I have a shader compiler which would benefit from a platform agnostic, ready-to-run executable format.
It's too big to be built ad hoc on the user's machine (it contains various big 3rd-party C++ dependencies, and takes between 2 and 20 minutes to build, depending on the hardware).
Needs to run on macOS x86-64+ARM, Linux x86-64 (ARM would be nice too of course), and Windows x86-64.
I'm currently looking into WASM, but that needs an installed WASI runtime.
A long time ago I wanted an Ansible equivalent that worked by injecting a bootstrap VM, then sending configuration code to run on the remote node, rather than chatting back and forth over an SSH tunnel. A lot of the slowness of applying config was that chattiness, and it would have been a lot faster to avoid it.
I have no idea if there is a tool that actually works like this - I remember an exploit tool that did something similar, but it wasn't quite right - but making that initial bootstrap VM and a useful stdlib an APE executable would also have avoided the "there needs to be a working Python at the other end" issue which persisted in tripping me up for far longer than it should have.
Storing executables on file systems shared across operating systems or architectures gets a lot easier if you don't have to have special paths for each os+arch combination.
It's much more useful for greenfield software. Mad props to whoever wrote the gcc patch just to get something like this to work some of the time, but rebuilding existing software is probably a fool's errand. There's a good reason the rpm package format includes not only the kernel but the distro. Running on "Linux," BSD, Windows, or MacOS is one thing, but Linux is a bunch of different OSes, not one. A whole lot of software out there relies on assuming a particular DNS provider, or a filesystem hierarchy that may or may not be FHS-compliant. When I went down a rabbit hole a few years back trying to extend Linux from Scratch to include a bunch more packages, I found that GHC won't build if the file "/etc/os-release" doesn't exist. The file can be empty, but it has to exist.
Someone mentioned busybox elsewhere, which sure, will build with this if you disable a whole bunch of stuff, but busybox is supposed to be the minimal utilities needed to run something close to POSIX-compliant, which obviously assumes your system is a POSIX system. It also includes most of the stuff usually provided by util-linux, which obviously won't run except on Linux. If your software is doing something like trying to read from particular device files to get hardware info from the kernel, sometimes that will work on both Linux and BSD (and even Mac since it's kind of a BSD), but it definitely won't work on Windows.
For software that uses the network, how do you query DNS? If you're calling getaddrinfo, I guess Cosmo libc will do some magic for you to make sure that works everywhere, but a lot of software not written in C is doing something not quite that easily made universal, like just assuming /etc/resolv.conf exists and reading it, which definitely won't work on Windows, or using the native platform APIs, which will only work on one platform.
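For C code, the portable path is plain old POSIX getaddrinfo, roughly like this (nothing Cosmo-specific, just the standard API):

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>

/* Resolve a hostname without touching /etc/resolv.conf or any native
 * platform API directly; the libc decides how to reach DNS on each OS. */
int resolve(const char *host, struct addrinfo **out)
{
    struct addrinfo hints;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;
    return getaddrinfo(host, "443", &hints, out); /* 0 on success */
}
```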
If you distribute desktop software, I would say good luck. Are you going to comply with the XDG spec? Well, now I guess it will "work" on Windows, but you're not complying with Windows conventions, where you're supposed to put stuff in App\RoamingLow or whatever the hell it is that I can never remember without looking it up. Don't try storing data in the registry, because then it will only work on Windows.
Does your software? It better be present on all systems then. Do you query any environment variables? I don't think LC_TIME and LC_COLLATE are provided anywhere but Linux usually. The XDG_ variables are only Linux. Does Windows have PAGER? I don't even know. I'm pretty sure PATH, USER, and HOME work everywhere.
A truly actually portable executable (TAPE) would require far more than an executable file format that can successfully load into memory and execute its first instruction on any platform. There's a reason vendors just write software that uses a browser's JavaScript engine as a platform instead of the OS's native runtime.
EDIT: I don't know if this is ironic, but I guess consider your choice of programming language, too. Go became popular in part because of the static linking, single executable thing, but it inherently can't be cross-platform since it doesn't use the C library and tries to make all calls into the kernel on its own by keeping track of the syscall table provided by every OS it supports. Well, they're still different on every OS, so an executable that works on Linux will not work anywhere else and vice versa.
> I'm pretty sure PATH, USER, and HOME work everywhere.
Out of those 3 only PATH works on Windows.
Windows has a USERNAME environment variable, but depending on your needs you may need to combine it with the USERDOMAIN variable to get the complete username, %USERDOMAIN%\%USERNAME%.
Windows has a HOMEPATH environment variable but it contains only the directory for the home folder. Another variable, HOMEDRIVE, contains the drive the home directory exists on, so the complete home location is %HOMEDRIVE%\%HOMEPATH%. In powershell a read-only HOME variable is set pointing to %HOMEDRIVE%\%HOMEPATH%. The home directory is supposed to store user files. User specific applications, OS components, customization and configuration data should be stored in the USERPROFILE directory. This is very frequently but not always the same as %HOMEDRIVE%\%HOMEPATH%.
They differ when users' home folders are redirected to a network share. In that circumstance it is also possible that the user's home directory might be HOMESHARE\HOMEPATH instead of HOMEDRIVE\HOMEPATH; this will be the case if the network share isn't mapped to a drive letter. Really, HOMESHARE should be checked for first, and HOMEDRIVE used only if it doesn't exist. I should have pointed that out in my original comment, but it was made pre-coffee.
Folder redirection is used often in conjunction with remote desktop services, virtual desktops or just to centralize file storage for ease of backup.
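Putting this thread's advice together, a rough lookup order in C -- the precedence is my reading of the comments above, not an official Windows rule:

```c
#include <stdio.h>
#include <stdlib.h>

/* Resolve the user's home directory: HOME (POSIX), then USERPROFILE,
 * then the Windows HOMESHARE-or-HOMEDRIVE plus HOMEPATH pair.
 * Returns 1 on success, 0 if nothing usable was found. */
static int home_dir(char *buf, size_t n)
{
    const char *home = getenv("HOME");       /* POSIX systems */
    if (!home)
        home = getenv("USERPROFILE");        /* typical on Windows */
    if (home)
        return snprintf(buf, n, "%s", home) < (int)n;

    const char *base = getenv("HOMESHARE");  /* redirected folders first */
    if (!base)
        base = getenv("HOMEDRIVE");
    const char *path = getenv("HOMEPATH");
    if (base && path)
        return snprintf(buf, n, "%s%s", base, path) < (int)n;
    return 0;
}
```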
The switch-to-if thing is pretty inelegant. It seems like you could fix this by mapping to a platform-neutral set of constants, so you could dereference the runtime symbol with a function -- `switch(CosmoMappErrno(EINVAL))` and `case COSMO_EINVAL:` and so avoid turning the code into goto mush.
But that won't help with compiling unmodified code. Nobody should worry about gotos being created inside the compiler. It is doing that all over the place anyway.
Unless of course you mean that the compiler should be recognizing switches like this, and instead of rewriting them into if trees, it should keep them as switches, changing the labels to use special cosmo-specific constants for each of the values, and wrapping the input value of the switch in a call to a function that maps the current runtime platform's values over to the corresponding cosmo-specific constants (and lets other values pass through unchanged).
That... actually might be a simpler compiler transformation to implement. There would be complexity in needing to recognize which of the multiple sets of constants is being used, and applying the right call or calls to the switch input to map them. It also requires complexity on the library side (creating these mappings). Not sure if the author would want to implement it, given they have a working implementation of the other transformation.
Lastly, such a mapping approach would risk mapping inappropriately, or trickiness like having to invert the input before mapping, for code that does the negative-errno-value thing.
Dereferencing the runtime symbol with a function sounds interesting! If you can show me an example of where it works, I'd be happy to try it out. I like the if-else-goto arrangement because it fits in perfectly with the other parts of gcc -- if you look at my patch[1], you will find that I had to change very little of gcc's existing code to add this capability.
I don't see why you are so surprised by code that indexes using SIGxxx. The signal numbers are well-defined and known to be mostly consecutive and start from 1.
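For anyone who hasn't seen the idiom, a minimal reconstruction (not busybox's actual table):

```c
#include <signal.h>

/* Designated initializers using signal numbers as array indexes. This
 * only compiles while SIGHUP etc. are integer constant expressions --
 * exactly what stops being true once the libc defers them to runtime. */
static const char *const signal_names[] = {
    [SIGHUP]  = "HUP",   /* 1 on Linux */
    [SIGINT]  = "INT",   /* 2 */
    [SIGQUIT] = "QUIT",  /* 3 */
    [SIGILL]  = "ILL",   /* 4 */
};
```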
I had tried to build a bunch of different codebases (git, bash, gcc, curl etc.) before trying busybox, and I'd never seen code like that before. Plus that error came alongside a compiler error in my patch that took a long time to debug, so it was a memorable surprise :)
Would you consider a code pattern like that to be common? What other codebases apart from busybox have it? If there are many examples, I might spend some time trying to update my patch to handle patterns of that kind.
The issue _was_ the magic macros, I believe. Normally, the macros expand to a constant, but now they don't, which breaks some things. std::error_code doesn't have the same issue, since it's not a constant?
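A tiny example of the breakage (assuming EINVAL stops being a constant expression, as the article describes):

```c
#include <errno.h>

int classify(int err)
{
    switch (err) {
    /* A case label must be an integer constant expression. If the
     * libc defines EINVAL as an extern variable rather than a literal,
     * this line no longer compiles -- which is what the gcc patch
     * works around by lowering the switch into if-else comparisons. */
    case EINVAL:
        return 1;
    default:
        return 0;
    }
}
```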
Well if they need to replace those integer constants with function calls, clearly that breaks everything.
std::error_code is just a wrapper for several different enums. I should have probably just given the posix one, std::errc.
You can't just make the various enumerations of an enum be function calls.
The right implementation would have been to convert error codes from/to posix, rather than just give native error codes and make looking up the reference value a runtime operation.
> Of course, my patch isn’t perfect. It can’t handle some anonymous structs, enums, const ints, or amazing things like using SIGILL as an array index [...]
The SIGILL signal is raised when an attempt is made to execute an invalid, privileged, or ill-formed instruction. SIGILL is usually caused by a program error that overlays code with data or by a call to a function that is not linked into the program load module.
Another common cause is trying to execute instructions that the CPU doesn't support, e.g. trying to run AVX2 code on a machine that only supports SSE4.
I too thought it was a typo when I caused it to occur while building BLIS, but then I found out it was because I had added some AVX thing to the config that my local computer did not have.
Some compilers (like gcc) emit an 'ud' instruction (https://www.felixcloutier.com/x86/ud) in some situations where the compiler detects undefined behaviour. IIRC this instruction triggers SIGILL on Linux.
I ran into this a few days ago when I was running someone else's old code in C, where a function that was supposed to return a value didn't return one on every path. The program would crash with a core dump, and the debugger wasn't very clear either. It's been a while since I've needed to dig into the disassembly view.
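The pattern was something like this (contrived; whether you actually get a ud2 depends on the compiler, language, and flags):

```c
/* Falling off the end of a value-returning function is undefined
 * behaviour (in C, if the caller uses the value). With optimization
 * enabled, compilers sometimes emit a ud2 trap on that path instead
 * of a ret, so the call dies with SIGILL at runtime. */
int lookup(int x)
{
    if (x > 0)
        return x * 2;
    /* no return on this path */
}

int main(void)
{
    return lookup(-1); /* may execute the trap and raise SIGILL */
}
```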
I don't understand why the compiler wouldn't just error out if it generates code that will never actually succeed, but I'm sure there's some C programmer's explanation for it.
I'm a C programmer and I don't understand it either to be honest. The compiler obviously knows that it encounters UB in this case because it specifically inserts an ud instruction.
It could at least warn about it (and if it doesn't have the information to generate a useful warning or error at that point where the ud instruction is inserted, then that's obviously a problem that needs fixing).
An array of, say, strings, or functions, indexed by signal numbers. So array[SIGHUP] would be the same as array[1], and array[SIGILL] would be array[4], the 5th element of the array.
Looks like they changed the story headline to Actually Portable Executables (which is the original "correct" name). Other than an audio format relatively few use, I suppose APE isn't the most clashy file format name.
It seems that the term "portable" in PE refers to portability to different hardware whereas the same term used in APE refers to portability to different operating systems.
> On Windows NT operating systems, PE currently supports the x86-32, x86-64 (AMD64/Intel 64), IA-64, ARM and ARM64 instruction set architectures (ISAs). Prior to Windows 2000, Windows NT (and thus PE) supported the MIPS, Alpha, and PowerPC ISAs. Because PE is used on Windows CE, it continues to support several variants of the MIPS, ARM (including Thumb), and SuperH ISAs.
It is portable in comparison to its main predecessor, NE, which was x86-only.
It also is used by more than just Windows - EFI uses it too. There is no reason in principle why some non-Windows OS couldn’t use it as the native executable format.
And it was a mistake lobbied for by Microsoft. UEFI should have used a smaller format like TE everywhere. There's no reason for files that don't need imports or similar services from the loader to have the complexity of PE. At first it hindered toolchain support on non-Windows operating systems, which was probably the real intention.
TE (Terse Executable) is a simplified version of PE defined in some of the UEFI specs. To quote [0]:
> The Terse Executable (TE) image format was created as a mechanism to reduce the overhead of the PE/COFF headers in PE32/PE32+ images, resulting in a corresponding reduction of image sizes for executables running in the PI Architecture environment. Reducing image size provides an opportunity for use of a smaller system flash part.
> TE images, both drivers and applications, are created as PE32 (or PE32+) executables. PE32 is a generic executable image format that is intended to support multiple target systems, processors, and operating systems. As a result, the headers in the image contain information that is not necessarily applicable to all target systems. In an effort to reduce image size, a new executable image header (TE) was created that includes only those fields from the PE/COFF headers required for execution under the PI Architecture. Since this header contains the information required for execution of the image, it can replace the PE/COFF headers from the original image. This specification defines the TE header, the fields in the header, and how they are used in the PI Architecture’s execution environment.
Given it is a modified version of PE, I think it still counts as a member of the PE family of executable formats. It isn't something with a different heritage, such as ELF. Although ELF and PE are ultimately cousins – AT&T invented COFF, and then they invented ELF as a successor to COFF to fix its flaws; other vendors, such as IBM, SGI and DEC, decided to extend COFF to remedy those flaws instead of adopting ELF; and then Microsoft took AT&T COFF and made their own modifications to it to produce PE.
The assumption that constants are constants is a sound assumption, and I highly doubt many would be willing to rewrite the world to make some "clever hack" work.
> I highly doubt many would be willing to rewrite the world to make some "clever hack" work.
I specifically wanted to avoid rewriting the world in order to port established codebases like git/curl to Cosmopolitan Libc.
My goal with this gcc patch was to answer the question: "what is the minimum amount of source code I would have to manually change in order to port software to Cosmopolitan Libc?" As it turns out, sometimes we don't have to change anything to compile code into an Actually Portable Executable -- just do the usual ./configure && make, plus a small `objcopy` command at the end to create the APE.
Ah, right, so while you and I don't care what numerical value in the enum EINVAL has, if I take that numerical value and stuff the binary into another OS then it means something totally different.
Right, it's not that it *isn't* a constant, it's that it's not the *same* constant everywhere?
If you want to build gcc using Cosmopolitan Libc -- try out this repo: https://github.com/ahgamut/musl-cross-make/tree/gccbuild