I wrote this post: the title should be "Patching GCC to build Actually Portable Executables", because it refers to Cosmopolitan Libc and jart's Actually Portable Executable format.
With my gcc patch, you can now build software like vim, emacs, ninja, bash, git, gcc, etc. with Cosmopolitan Libc, via their usual autotools/cmake-style build systems. The built executables should run on Linux, FreeBSD, macOS, OpenBSD, NetBSD, and also Windows (although I haven't tested on Windows yet).
Will this patch of yours make it into GCC upstream? APE executables and cosmopolitan libc are incredibly cool tech, and it'd be great if they were easier to use.
Reading the GCC source code was fun and educational, and I'd love to find out if this patch could make it into GCC. I haven't discussed it with any of the GCC developers yet.
In the meantime, the Cosmopolitan Libc monorepo contains binaries of gcc-11 and binutils that have been built with my patch; you can use those for now :)
I haven't used Nim much, but I remember a repo on Github had set up a build script for compiling Nim with Cosmopolitan Libc. https://github.com/Yardanico/cosmonim
This gcc patch makes such build scripts simpler, because you will need to change less of your code -- let me know how it works!
Yes, I have gotten busybox to build successfully, but at present it's not as nice as bash or vim. I expect someone will write a better build script soon enough. Maybe it will be you!
The default build parameters in busybox make a lot of OS-related assumptions, so I'd recommend you use make menuconfig instead of make defconfig. In the TUI, you can disable all console and networking utilities, Linux modprobe utilities, some Linux system utilities, free, uptime, etc. Start with a minimal config and then slowly add things.
In terms of source code changes: modify `u_signal_names.c` to use a switch statement instead of the amazing "use SIGHUP as an array index" method it follows, use int32_t instead of the "smallint" type that busybox seems to prefer, and IIRC there's an enum in there somewhere that should be rewritten as a #define.
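Roughly, the first change looks like this — a sketch of the kind of rewrite I mean (the function and string names here are made up for illustration, not busybox's actual code):

```c
#include <signal.h>
#include <stddef.h>

/* Instead of a table initialized with [SIGHUP] = "HUP", ... (which
 * requires the signal macros to be compile-time constants), use a
 * switch -- my gcc patch knows how to lower a switch like this into
 * if-else comparisons that read the values at runtime. */
static const char *signal_name(int sig)
{
    switch (sig) {
    case SIGHUP:  return "HUP";
    case SIGINT:  return "INT";
    case SIGQUIT: return "QUIT";
    case SIGILL:  return "ILL";
    default:      return NULL;
    }
}
```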
That should get you pretty close to a successful build.
I'm the Lobsters admin and I enforced this ban. You misread the modlog: only jart's invitee i2 was sockpuppeting. As jart is here claiming that this is all mistaken sockpuppeting, let me explain what happened.
Lobsters has had serious problems with users who try to exploit the site for self-promotion, and it's a more serious problem for us than for HN because a few votes are enough to move something onto our homepage. As part of addressing it, we have a bright-line rule prohibiting new users from submitting stories from new, unseen domains in their first 70 days on the site.
Five hours after jart invited woodrush, at 2022-01-12 02:26, they tried to violate the rule and got this error message:
> woodrush.github.io is an unseen domain from a new user. We restrict this to discourage self-promotion and give you time to learn about topicality. Skirting this with a URL shortener or tweet or something will probably earn a ban.
At 2022-01-12 02:40, jart submitted the exact same link to break the restriction, and woodrush submitted several posts from that domain. After that, jart started submitting stories from all of her invitees to circumvent this rule, which anyone can verify by comparing timestamps on her submitted stories at https://lobste.rs/newest/jart to creation dates for her invitees at https://lobste.rs/u#jart.
On top of repeatedly breaking a rule that has a big red background and explicitly warns of a ban, two of jart's invitees were manually warned and banned for other inappropriate self-promotion. Finally, when another of her invitees started obvious sockpuppeting, I looked at the invite tree, saw this pattern of abuse, and cleaned up after it.
To rebut another claim repeated a couple times elsewhere in the comment tree, all banned users get an email notifying them of a ban with the Reply-To header set to my email: https://github.com/lobsters/lobsters/blob/master/app/mailers... None of the users banned in this action have clicked reply to explain how I've made a mistake enforcing an unambiguous rule of ours.
jart is a sociopathic manipulator and abuser. They recently stole code for llama.cpp and claimed it was theirs, were unapologetic when caught, and tried to gaslight the llama.cpp creator into feeling guilty and reinstating their access. Prior to that they were trying to install themselves as an important and powerful member of that community. https://news.ycombinator.com/context?id=36245385 and https://news.ycombinator.com/threads?id=IAmNotACellist#36251...
I took a peek at the moderation log and the banning of jart and their descendants was a real shitshow[1]. I’m sure a lot of legit folks lost their accounts, and most of the descendants weren’t even posting anything related to APE or jart’s other projects.
It's because I'm an insomniac. When normies are sleeping, I'm running around setting up franchises. One of my other sock puppet accounts that got banned by Lobsters is Paul Kulchenko, who isn't actually the genius who built ZeroBrane Studio https://studio.zerobrane.com/, but is just me pretending to be a Windows developer when I can't fall asleep. Lobsters also banned Hikaru Ikuta (woodrush), who isn't actually a PhD at the University of Tokyo who discovered the lambda calculus expression for LISP https://woodrush.github.io/blog/lambdalisp.html, he's clearly just another one of my sockpuppet accounts I created due to being an insomniac. Those are the sorts of stories we've been sharing with the Lobsters community, and we'd been received very positively until now. I'm disheartened by how the recent unexpected actions of this rogue moderator will impact the wonderful community that had been nothing but kind to us.
This is a terrible misrepresentation of the mod logs. Your comment seems to be creating a strawman. The accounts you named in your comment were not banned for being sockpuppets; they were banned for spamming (whatever that means). Totally different things.
Nobody said they were sockpuppets. Only you are saying that in an attempt to create a strawman and mislead unsuspecting readers. We are not idiots.
The two accounts that were banned for sock-puppeteering were 'i2' and 'thatworkshop'. And you have conveniently omitted these two accounts from your comment. What do you have to say about them? You invited them. And they were sock-puppetting? How do you explain that?
i2 got banned from my community a long time ago so it doesn't surprise me that Lobsters would ban him too. I usually err on the side of trusting people, and sometimes that turns out to be a mistake. I take appropriate action when I need to. If anything I'm surprised it took Lobsters this long. Lobsters gave me no ability to rescind invites, so they shouldn't have held me and my friends responsible for what he did. Frankly I don't even know what that is. Lobsters also made no attempt to contact me or offer feedback. The mass ban came as a total surprise. I have no idea what I did and I wasn't granted any opportunity to make them happy.
I'm no fan of Lobsters, so I've no skin in the game. But it is one thing to say that those people you wrote about in your comment were banned for spamming and it is a totally different thing to say that they were banned for sockpuppetting (when they were not). The former is a correct representation of what's there in the mod logs. The latter is a blatant misrepresentation of the mod logs, and misleading to unsuspecting readers.
After seeing such wilful blatant misrepresentation, forgive me if I'm having difficulty just taking your word for it.
I’m left speechless after reading the first article, as well as seeing the misrepresentation of the comment thread (she was replying to me, not blueflow…)
I must say, I respect her a good deal less than I used to.
> But it is one thing to say that those people you wrote about in your comment were banned for spamming and it is a totally different thing to say that they were banned for sockpuppetting (when they were not).
How can I be legitimately banned for spamming when I made zero comments or posts? I rarely post or comment, so I didn't get a chance, but I was glad to join the community.
I don't know the specifics, but it looks very much like collateral damage. Sorry if that's the case. It looks like her whole invite tree became suspect, and the mod removed the whole tree. Not justifying what they did, but it seems like a mistake. Talking to the mod should resolve it.
This is not the case. (For context, I am the admin responsible for the bans.) I left a longer comment in response to the original misunderstanding to explain what happened here.
Sockpuppets or voting rings may not be posting related posts/comments at the moment, but they could in the future. And they could be silently +1-ing each other's posts/comments, which you can't see but mods can.
I've done moderating work back when phpbb forums had their heyday. Clever sockpuppet accounts or voting rings don't start posting about the projects of their main account right from the first day. The clever ones remain dormant, mostly idle, participating in other discussions. They look just like normal accounts.
But much, much later, when the main account posts a "show project" post, the sockpuppets or voting rings add +1 or post a few words of praise, and then move on to commenting on and +1-ing other posts.
And voting rings become more difficult to detect when they don't post anything related, post unrelated stuff but quietly upvote the main account's posts/comments.
Only 2 of all the banned accounts have been flagged as sock-puppeteers. You can't see the sockpuppetry anymore because the mod deletes their posts/comments. But the evidence is still there as negative points. Those profiles have negative points, so they must have posted something at some point, got downvoted, got flagged by users, and got detected as sockpuppets.
Sockpuppets are only one possibility. Another possibility is legit accounts forming a voting ring. You and I can't see voting actions. Mods can.
The mods have access to server access logs. Mods can see who +1-ed whose posts/comments. So if you need evidence for sock-puppetry or voting-ringery, try asking the mods about it. We can only speculate here. I'm only describing what I've found in the wild from my days of moderating (small) forums.
I mean to say that just because you can't see the other accounts posting related posts/comments does not make them any more or less likely to be sockpuppets or a voting ring. They may be legit, or they may be a ring. Only the mods with the access logs can tell.
Yeah I found out I got banned when attempting to comment about this blog post. I rarely comment on online forums anyway, because I find I don't know enough to comment about most topics.
IIRC over the last year, I've only commented once on lobste.rs, when it came to something about Python's import system, and only upvoted my own blog posts about Cosmopolitan Libc and related comments + maybe one or two other posts. I disagree with the ban, but I can see why my account would get banned as a spammer (one of the accounts banned alongside mine had a double account I think).
lobste.rs is a wonderful community, and I'm happy to see that someone there at least mentioned this blog post. Hopefully I can get un-banned at some point.
Right now I use the gcc, git, vim, and ninja APE binaries on my local system for daily work, and so far I have seen no visible regressions.
I agree with you though -- an extensive testing setup will reveal more things to improve, be it in my patch or in the libc itself. At the moment I only have direct access to Debian 11/Fedora 35/FreeBSD 13, but I expect we will soon find a nice way to run tests across the different operating systems automatically. One of my ideas was to set up a separate drive or folder with all the executables, accessible from the different operating systems/VMs, and then run the tests from each. What do you think?
Might need to update the build scripts a little to handle the latest updates in Rust and Cosmo (we'll need gcc-11 now), but I expect to get Rust working again soon enough.
I suppose this is all quite neat, but I wonder about the theory behind it. If you're building a new OS target, the logical thing to do is to define new constants. Rather than passing through each underlying OS's constants, you would translate them to a single set that is the same everywhere. Then, use of constants as array indexes and in case statements would work normally. This might incur more overhead than the approach described in the article, but it seems a great deal easier, and simpler to use for whoever decides to compile a new program against cosmopolitan libc.
New constants do make compilation a lot easier, but my personal opinion is that the overhead to convert to/from the old constants during runtime is too much.
Every time I used any of these constants, I'd have to load a whole bunch of them into my binary as a large lookup table, and go through that table every time I needed a check in my program. It might not be that slow, but I believe it would definitely be noticeable.
My goal was to make porting easier without changing a lot of source code in either the libc or in the software I was trying to port, and still produce binaries that are close or better in performance. Under those constraints, this gcc patch seemed like the best way to simplify the process.
If I run into enough codebases where SIGHUP is used as an array index initializer, I will probably attempt your suggestion just to measure the tradeoffs. Or you could try it out and let me know if a separate set of constants is better.
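For concreteness, here's a minimal sketch of that alternative as I understand it -- the COSMO_* names and cosmo_map_errno are hypothetical, not real Cosmopolitan APIs:

```c
#include <errno.h>

/* Hypothetical platform-neutral constants, fixed at compile time: */
enum { COSMO_EINVAL = 1, COSMO_ENOENT = 2, COSMO_EAGAIN = 3 };

/* Translate the running OS's errno value into the neutral constant.
 * Under Cosmopolitan, EINVAL etc. are only known at runtime, so this
 * can't be a constant-initialized table; it has to be a chain of
 * compares or a table built at startup -- and every errno check in
 * the program pays for a lookup like this. */
static int cosmo_map_errno(int err)
{
    if (err == EINVAL) return COSMO_EINVAL;
    if (err == ENOENT) return COSMO_ENOENT;
    if (err == EAGAIN) return COSMO_EAGAIN;
    return 0;
}

/* Callers could then switch on stable values:
 *     switch (cosmo_map_errno(errno)) { case COSMO_EINVAL: ... }
 */
```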
You can build software like vim, emacs, ninja, bash, git, gcc etc -- here's a list of software I got to build with this technique: https://github.com/ahgamut/superconfigure
The superconfigure script is just a wrapper around the usual configure script used to build your software, supplying flags like --enable-static.
I use an APE executable as an agent for communicating with remote hosts in the Logfile Navigator (https://lnav.org). While lnav itself is not built as an APE, the agent built into it is. That agent is transferred to the remote host when the user wants to read logs there. This way, there is no extra step to determine the OS type, and no need to build multiple versions of the executable. Here's a short blog post on the subject:
My biggest problem with APE: not portable enough, in the sense that when people adopt APE, they lock themselves into less portability than they might have supported otherwise. I live in an ARM world and because of APE you only offer Intel x64 binaries. All new Macs are ARM64, Raspberry Pi models are a mix of ARM32 and ARM64 (mine is ARM64), and I use exclusively Graviton instances in AWS. Running x64 binaries with QEMU/Rosetta emulation on an ARM system is a cute trick but not something I want to do for a piece of infrastructure. Raspberry Pis are barely fast enough to run native binaries. Consider offering APE for x64 but then still producing ARM binaries the old fashioned way. APE is pushing the idea that x64 is a good bytecode in a world that is moving from x64 to ARM. I'd like to see the project accept that ARM exists and produce an ARM version of APE, maybe with some jart flourish--can we make a cross-platform and cross-architecture fat binary?
> Consider offering APE for x64 but then still producing ARM binaries the old fashioned way.
Recent versions of cosmopolitan generate ARM binaries for Linux and MacOS (https://github.com/jart/cosmopolitan#arm; mode aarch64). There is also blink, which provides an x86-64 emulation layer for (APE and other) binaries on a variety of platforms (https://github.com/jart/blink).
I didn't know about that, thanks! It gives me hope that the final goal (fat binary, native ARM and native x64, all platforms on both architectures) is within reach. It looks like there are still some hiccups (the APE loader on M1) but they are much further along than I thought they were. Excellent!
Speaking of fat binaries: @jart posted this a day ago in the #ape channel on the project discord:
Just created my first ever fat ape executable. I ran `apelink.com -o fat-ape-binary.com o/tiny/examples/hello2.com.dbg o/aarch64-tiny/examples/hello2.com.dbg` to create an executable that runs on Linux+OpenBSD+NetBSD+FreeBSD on x86_64+aarch64.
> APE is pushing the idea that x64 is a good bytecode
And x86-64 is a particularly bad bytecode, for one main reason: its strong memory ordering rules, compared to nearly every other architecture. IIRC, simple things like a store in x86-64 carry an implicit release barrier, while other architectures can freely reorder stores unless there's an explicit barrier. This means that an x86-64 emulator on, for instance, ARM has to either be single-threaded (and would still have issues with shared mmap regions), rely on special hardware support for x86-compatible memory ordering (as found on recent Apple ARM CPUs, and used for their x86-64 emulation, but not common elsewhere), or add explicit barriers everywhere (which kills performance).
The only real advantage of x86-64 as a bytecode would be that it has fewer registers than most other architectures (only 16, while other 64-bit architectures usually have around 32), which allows a 1:1 mapping of the registers in the emulator while still leaving plenty of them free for temporary use.
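A quick sketch of the ordering point with C11 atomics (my summary, assuming I have the memory models right):

```c
#include <stdatomic.h>

int data;
atomic_int ready;

void producer(void)
{
    data = 42;
    /* On x86-64 this release store is just a plain MOV: every x86
     * store already has release semantics. On AArch64 the compiler
     * must emit a special store (stlr) or an explicit barrier. */
    atomic_store_explicit(&ready, 1, memory_order_release);
}

int consumer(void)
{
    /* An emulator translating x86-64 code to ARM can't tell which
     * plain stores are used for synchronization like this one, so it
     * has to put barriers around all of them (or stay single-threaded). */
    while (!atomic_load_explicit(&ready, memory_order_acquire))
        ;
    return data;
}
```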
Java/Clojure/Kotlin programmers. .Net programmers. JS, Python, Ruby, Erlang -- programmers of every VM-based language. Statically compiled programs are under the hood of everything, but at the application level they're in the minority.
APE is a big “Who needs a VM if an AOT compiled language as low level as C can compile once and run universally?”
Nah, terminal colors via escape sequences work fine on Windows, even on the "traditional" cmd.exe. I use that all the time in cross-platform Python and Typescript cmdline tools (cmd.exe is probably limited to 16 colors though, but that's just how it is).
(also Windows is probably the only non-POSIX OS that matters - and APE is obviously only portable to systems it specifically supports, it can't support systems it doesn't even know about, no matter if it is a "POSIX system" or not).
I can imagine it making it slightly easier to distribute executables from e.g. maven repositories (where having multiple artifacts for the same id/version combination is cumbersome). I think PyPI might have a similar issue?
Also cross-compiling is painful; a lot of small OSS projects that don't have a build farm with all of the different BSDs etc. to hand simply don't ship binaries for those. Even e.g. kubectl has a binary for Linux but not for any of the BSDs (unless you count OSX). So this seems like an easy win for those.
I'm trying to release future side projects that way. I even build a Redbean executable for any client-only SPAs. The idea of this stuff just working on so many systems is amazing.
Problem with this is that self-modifying code will likely trip security blocks on most platforms, which is probably more annoying for users than having to download separate executables.
If the person needs to run with "security blocks" up, it's statistically unlikely that the same person can randomly download the specific and potentially outdated version of a dependency you need for the code to run in the first place.
I have a shader compiler which would benefit from a platform agnostic, ready-to-run executable format.
It's too big to be built ad hoc on the user's machine (it contains various big 3rd-party C++ dependencies, and takes between 2 and 20 minutes to build, depending on the hardware).
Needs to run on macOS x86-64+ARM, Linux x86-64 (ARM would be nice too of course), and Windows x86-64.
I'm currently looking into WASM, but that needs an installed WASI runtime.
A long time ago I wanted an Ansible equivalent that worked by injecting a bootstrap VM, then sending configuration code to run on the remote node, rather than chatting back and forth over an SSH tunnel. A lot of the slowness of applying config was that chattiness, and it would have been a lot faster to avoid it.
I have no idea if there is a tool that actually works like this - I remember an exploit tool that did something similar, but it wasn't quite right - but making that initial bootstrap VM and a useful stdlib an APE executable would also have avoided the "there needs to be a working Python at the other end" issue which persisted in tripping me up for far longer than it should have.
Storing executables on file systems shared across operating systems or architectures gets a lot easier if you don't have to have special paths for each os+arch combination.
It's much more useful for greenfield software. Mad props to whoever wrote the gcc patch just to get something like this to work some of the time, but rebuilding existing software is probably a fool's errand. There's a good reason the rpm package format includes not only the kernel but the distro. Running on "Linux," BSD, Windows, or MacOS is one thing, but Linux is a bunch of different OSes, not one. A whole lot of software out there relies on assuming a particular DNS provider, or a filesystem hierarchy that may or may not be FHS-compliant. When I went down a rabbit hole a few years back trying to extend Linux from Scratch to include a bunch more packages, I found that GHC won't build if the file "/etc/os-release" doesn't exist. The file can be empty, but it has to exist.
Someone mentioned busybox elsewhere, which sure, will build with this if you disable a whole bunch of stuff, but busybox is supposed to be the minimal utilities needed to run something close to POSIX-compliant, which obviously assumes your system is a POSIX system. It also includes most of the stuff usually provided by util-linux, which obviously won't run except on Linux. If your software is doing something like trying to read from particular device files to get hardware info from the kernel, sometimes that will work on both Linux and BSD (and even Mac since it's kind of a BSD), but it definitely won't work on Windows.
For software that uses the network, how do you query DNS? If you're calling getaddrinfo, I guess Cosmo libc will do some magic for you to make sure that works everywhere, but a lot of software not written in C is doing something not quite that easily made universal, like just assuming /etc/resolv.conf exists and reading it, which definitely won't work on Windows, or using the native platform APIs, which will only work on one platform.
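For C code, the portable path is plain old POSIX getaddrinfo, roughly like this (nothing Cosmo-specific, just the standard API):

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>

/* Resolve a hostname without touching /etc/resolv.conf or any native
 * platform API directly; the libc decides how to reach DNS on each OS. */
int resolve(const char *host, struct addrinfo **out)
{
    struct addrinfo hints;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;
    return getaddrinfo(host, "443", &hints, out); /* 0 on success */
}
```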
If you distribute desktop software, I would say good luck. Are you going to comply with the XDG spec? Well, now I guess it will "work" on Windows, but you're not complying with Windows conventions, where you're supposed to put stuff in App\RoamingLow or whatever the hell it is that I can never remember without looking it up. Don't try storing data in the registry, because then it will only work on Windows.
Does your software? It better be present on all systems then. Do you query any environment variables? I don't think LC_TIME and LC_COLLATE are provided anywhere but Linux usually. The XDG_ variables are only Linux. Does Windows have PAGER? I don't even know. I'm pretty sure PATH, USER, and HOME work everywhere.
A truly actually portable executable (TAPE) would require far more than an executable file format that can successfully load into memory and execute its first instruction on any platform. There's a reason vendors just write software that uses a browser's JavaScript engine as a platform instead of the OS's native runtime.
EDIT: I don't know if this is ironic, but I guess consider your choice of programming language, too. Go became popular in part because of the static linking, single executable thing, but it inherently can't be cross-platform since it doesn't use the C library and tries to make all calls into the kernel on its own by keeping track of the syscall table provided by every OS it supports. Well, they're still different on every OS, so an executable that works on Linux will not work anywhere else and vice versa.
> I'm pretty sure PATH, USER, and HOME work everywhere.
Out of those 3 only PATH works on Windows.
Windows has a USERNAME environment variable, but depending on your needs you may need to combine it with the USERDOMAIN variable to get the complete username, %USERDOMAIN%\%USERNAME%.
Windows has a HOMEPATH environment variable but it contains only the directory for the home folder. Another variable, HOMEDRIVE, contains the drive the home directory exists on, so the complete home location is %HOMEDRIVE%\%HOMEPATH%. In powershell a read-only HOME variable is set pointing to %HOMEDRIVE%\%HOMEPATH%. The home directory is supposed to store user files. User specific applications, OS components, customization and configuration data should be stored in the USERPROFILE directory. This is very frequently but not always the same as %HOMEDRIVE%\%HOMEPATH%.
They differ when users' home folders are redirected to a network share. In that circumstance it is also possible that the user's home directory might be HOMESHARE\HOMEPATH instead of HOMEDRIVE\HOMEPATH; this will be the case if the network share isn't mapped to a drive letter. Really, HOMESHARE should be checked for first, and HOMEDRIVE used only if it doesn't exist. I should have pointed that out in my original comment, but it was made pre-coffee.
Folder redirection is used often in conjunction with remote desktop services, virtual desktops or just to centralize file storage for ease of backup.
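Putting this thread's advice together, a rough lookup order in C -- the precedence is my reading of the comments above, not an official Windows rule:

```c
#include <stdio.h>
#include <stdlib.h>

/* Resolve the user's home directory: HOME (POSIX), then USERPROFILE,
 * then the Windows HOMESHARE-or-HOMEDRIVE plus HOMEPATH pair.
 * Returns 1 on success, 0 if nothing usable was found. */
static int home_dir(char *buf, size_t n)
{
    const char *home = getenv("HOME");       /* POSIX systems */
    if (!home)
        home = getenv("USERPROFILE");        /* typical on Windows */
    if (home)
        return snprintf(buf, n, "%s", home) < (int)n;

    const char *base = getenv("HOMESHARE");  /* redirected folders first */
    if (!base)
        base = getenv("HOMEDRIVE");
    const char *path = getenv("HOMEPATH");
    if (base && path)
        return snprintf(buf, n, "%s%s", base, path) < (int)n;
    return 0;
}
```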
The switch-to-if thing is pretty inelegant. It seems like you could fix this by mapping to a platform-neutral set of constants, so you could dereference the runtime symbol with a function -- `switch(CosmoMappErrno(EINVAL))` and `case COSMO_EINVAL:` and so avoid turning the code into goto mush.
But that won't help with compiling unmodified code. Nobody should worry about gotos being created inside the compiler. It is doing that all over the place anyway.
Unless of course you mean that the compiler should be recognizing switches like this, and instead of rewriting them into if trees, it should keep them as switches, changing the labels to use special cosmo-specific constants for each of the values, and wrapping the input value of the switch in a call to a function that maps the current runtime platform's values over to the corresponding cosmo-specific constants (and lets other values pass through unchanged).
That... actually might be a simpler compiler transformation to implement. There would be complexity in needing to recognize which of the multiple sets of constants is being used, and applying the right call or calls to the switch input to map them. It also requires complexity on the library side (creating these mappings). Not sure if the author would want to implement it, given they have a working implementation of the other transformation.
Lastly, such a mapping approach would risk mapping inappropriately, or trickiness like having to invert the input before mapping, for code that does the negative-errno-value thing.
Dereferencing the runtime symbol with a function sounds interesting! If you can show me an example of where it works, I'd be happy to try it out. I like the if-else-goto arrangement because it fits in perfectly with the other parts of gcc -- if you look at my patch[1], you will find that I had to change very little of gcc's existing code to add this capability.
I don't see why you are so surprised by code that indexes using SIGxxx. The signal numbers are well-defined and known to be mostly consecutive and start from 1.
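For anyone who hasn't seen the idiom, a minimal reconstruction (not busybox's actual table):

```c
#include <signal.h>

/* Designated initializers using signal numbers as array indexes. This
 * only compiles while SIGHUP etc. are integer constant expressions --
 * exactly what stops being true once the libc defers them to runtime. */
static const char *const signal_names[] = {
    [SIGHUP]  = "HUP",   /* 1 on Linux */
    [SIGINT]  = "INT",   /* 2 */
    [SIGQUIT] = "QUIT",  /* 3 */
    [SIGILL]  = "ILL",   /* 4 */
};
```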
I had tried to build a bunch of different codebases (git, bash, gcc, curl etc.) before trying busybox, and I'd never seen code like that before. Plus that error came alongside a compiler error in my patch that took a long time to debug, so it was a memorable surprise :)
Would you consider a code pattern like that to be common? What other codebases apart from busybox have it? If there are many examples, I might spend some time trying to update my patch to handle patterns of that kind.
The issue _was_ the magic macros, I believe. Normally, the macros expand to a constant, but now they don't, which breaks some things. std::error_code doesn't have the same issue, since it's not a constant?
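A tiny example of the breakage (assuming EINVAL stops being a constant expression, as the article describes):

```c
#include <errno.h>

int classify(int err)
{
    switch (err) {
    /* A case label must be an integer constant expression. If the
     * libc defines EINVAL as an extern variable rather than a literal,
     * this line no longer compiles -- which is what the gcc patch
     * works around by lowering the switch into if-else comparisons. */
    case EINVAL:
        return 1;
    default:
        return 0;
    }
}
```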
Well if they need to replace those integer constants with function calls, clearly that breaks everything.
std::error_code is just a wrapper for several different enums. I should have probably just given the posix one, std::errc.
You can't just make the various enumerations of an enum be function calls.
The right implementation would have been to convert error codes from/to posix, rather than just give native error codes and make looking up the reference value a runtime operation.
> Of course, my patch isn’t perfect. It can’t handle some anonymous structs, enums, const ints, or amazing things like using SIGILL as an array index [...]
The SIGILL signal is raised when an attempt is made to execute an invalid, privileged, or ill-formed instruction. SIGILL is usually caused by a program error that overlays code with data or by a call to a function that is not linked into the program load module.
Another common cause is trying to execute instructions that the CPU doesn't support, e.g. trying to run AVX2 code on a machine that only supports SSE4.
I too thought it was a typo when I caused it to occur while building BLIS, but then I found out it was because I had added some AVX thing to the config that my local computer did not have.
Some compilers (like gcc) emit an 'ud' instruction (https://www.felixcloutier.com/x86/ud) in some situations where the compiler detects undefined behaviour. IIRC this instruction triggers SIGILL on Linux.
I ran into this a few days ago when I was running someone else's old code in C, where a function that was supposed to return a value didn't return one on every path. The program would crash with a core dump, and the debugger wasn't very clear either. It's been a while since I've needed to dig into the disassembly view.
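The pattern was something like this (contrived; whether you actually get a ud2 depends on the compiler, language, and flags):

```c
/* Falling off the end of a value-returning function is undefined
 * behaviour (in C, if the caller uses the value). With optimization
 * enabled, compilers sometimes emit a ud2 trap on that path instead
 * of a ret, so the call dies with SIGILL at runtime. */
int lookup(int x)
{
    if (x > 0)
        return x * 2;
    /* no return on this path */
}

int main(void)
{
    return lookup(-1); /* may execute the trap and raise SIGILL */
}
```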
I don't understand why the compiler wouldn't just error out if it generates code that will never actually succeed, but I'm sure there's some C programmer's explanation for it.
I'm a C programmer and I don't understand it either to be honest. The compiler obviously knows that it encounters UB in this case because it specifically inserts an ud instruction.
It could at least warn about it (and if it doesn't have the information to generate a useful warning or error at that point where the ud instruction is inserted, then that's obviously a problem that needs fixing).
An array of, say, strings, or functions, indexed by signal numbers. So array[SIGHUP] would be the same as array[1], and array[SIGILL] would be array[4], the 5th element of the array.
Looks like they changed the story headline to Actually Portable Executables (which is the original "correct" name). Other than an audio format relatively few use, I suppose APE isn't the most clashy file format name.
It seems that the term "portable" in PE refers to portability to different hardware whereas the same term used in APE refers to portability to different operating systems.
> On Windows NT operating systems, PE currently supports the x86-32, x86-64 (AMD64/Intel 64), IA-64, ARM and ARM64 instruction set architectures (ISAs). Prior to Windows 2000, Windows NT (and thus PE) supported the MIPS, Alpha, and PowerPC ISAs. Because PE is used on Windows CE, it continues to support several variants of the MIPS, ARM (including Thumb), and SuperH ISAs.
It is portable in comparison to its main predecessor, NE, which was x86-only.
It also is used by more than just Windows - EFI uses it too. There is no reason in principle why some non-Windows OS couldn’t use it as the native executable format.
And it was a mistake lobbied for by Microsoft. UEFI should have used a smaller format like TE everywhere. There's no reason for files that don't need imports or similar services from the loader to have the complexity of PE. At first it hindered toolchain support on non-Windows operating systems, which was probably the real intention.
TE (Terse Executable) is a simplified version of PE defined in some of the UEFI specs. To quote [0]:
> The Terse Executable (TE) image format was created as a mechanism to reduce the overhead of the PE/COFF headers in PE32/PE32+ images, resulting in a corresponding reduction of image sizes for executables running in the PI Architecture environment. Reducing image size provides an opportunity for use of a smaller system flash part.
> TE images, both drivers and applications, are created as PE32 (or PE32+) executables. PE32 is a generic executable image format that is intended to support multiple target systems, processors, and operating systems. As a result, the headers in the image contain information that is not necessarily applicable to all target systems. In an effort to reduce image size, a new executable image header (TE) was created that includes only those fields from the PE/COFF headers required for execution under the PI Architecture. Since this header contains the information required for execution of the image, it can replace the PE/COFF headers from the original image. This specification defines the TE header, the fields in the header, and how they are used in the PI Architecture’s execution environment.
Given it is a modified version of PE, I think it still counts as a member of the PE family of executable formats. It isn't something with a different heritage, such as ELF. Although ELF and PE are ultimately cousins – AT&T invented COFF, and then they invented ELF as a successor to COFF to fix its flaws; other vendors, such as IBM, SGI and DEC, decided to extend COFF to remedy those flaws instead of adopting ELF; and then Microsoft took AT&T COFF and made their own modifications to it to produce PE.
The assumption that constants are constants is a sound assumption, and I highly doubt many would be willing to rewrite the world to make some "clever hack" work.
> I highly doubt many would be willing to rewrite the world to make some "clever hack" work.
I specifically wanted to avoid rewriting the world in order to port established codebases like git/curl to Cosmopolitan Libc.
My goal with this gcc patch was to answer the question: "what is the minimum amount of source code I would have to manually change in order to port software to Cosmopolitan Libc?" As it turns out, sometimes we don't have to change anything to compile code into an Actually Portable Executable -- just do the usual ./configure && make, plus a small `objcopy` command at the end to create the APE.
Ah, right, so while you and I don't care what numerical value in the enum EINVAL has, if I take that numerical value and stuff the binary into another OS then it means something totally different.
Right, it's not that it *isn't* a constant, it's that it's not the *same* constant everywhere?
If you want to build gcc using Cosmopolitan Libc -- try out this repo: https://github.com/ahgamut/musl-cross-make/tree/gccbuild