This article is just confused and wrong. Some examples:
"The Socket API, local IPC, and shared memory pretty much assume two programs"
There's no truth to this whatsoever - it's an utterly absurd statement because:
* The socket API as applied to networking is empirically designed for arbitrary numbers of peers -- there are endless real-world examples (see the sketch after this list). The socket API as applied to local IPC is of course similarly capable.
* Shared memory - mmap() is about as generic of a memory sharing interface as one could hope for. It certainly works just fine with arbitrary numbers of clients. There are again countless examples of using shared memory with arbitrary participants -- for example all the various local file databases: sqlite, leveldb, berkeley db, etc.
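The sketch promised in the first bullet -- a minimal many-peer loop using nothing but POSIX calls: one listening socket, poll(), as many peers as the table holds. The port number and table size are made up, and error handling is elided:

```c
/* Sketch: one listening socket, arbitrary peers, plain POSIX poll().
 * Port 9000 and MAX_PEERS are arbitrary; error handling elided. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_PEERS 1024

int main(void) {
    struct sockaddr_in addr = {0};
    struct pollfd fds[MAX_PEERS + 1];
    int nfds = 1;
    char buf[4096];

    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(9000);
    bind(lfd, (struct sockaddr *)&addr, sizeof addr);
    listen(lfd, SOMAXCONN);

    fds[0].fd = lfd;
    fds[0].events = POLLIN;

    for (;;) {
        poll(fds, nfds, -1);
        if ((fds[0].revents & POLLIN) && nfds < MAX_PEERS + 1) {
            fds[nfds].fd = accept(lfd, NULL, NULL);  /* new peer */
            fds[nfds].events = POLLIN;
            nfds++;
        }
        for (int i = 1; i < nfds; i++) {
            if (!(fds[i].revents & POLLIN))
                continue;
            if (read(fds[i].fd, buf, sizeof buf) <= 0) {
                close(fds[i].fd);       /* peer hung up */
                fds[i] = fds[--nfds];   /* compact the table */
                i--;                    /* recheck the swapped-in slot */
            }
        }
    }
}
```

Swap AF_INET for AF_UNIX and the same loop serves local IPC unchanged.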
"We should be able to assume that the data we want exists in main memory without having to keep telling the system to load more of it."
Yes, this is exactly what mmap() already does. mmap() is part of POSIX. Use it!
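A minimal sketch of what I mean (the filename is a placeholder, error handling elided): map the file once, then treat it as ordinary memory, with no further read() calls in sight:

```c
/* Sketch: map a whole file and read it as plain memory.
 * "data.bin" is a placeholder; error handling elided. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    int fd = open("data.bin", O_RDONLY);
    struct stat st;
    fstat(fd, &st);

    const char *p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                     /* the mapping outlives the fd */

    long sum = 0;
    for (off_t i = 0; i < st.st_size; i++)
        sum += p[i];               /* no explicit read() calls */
    printf("%ld\n", sum);

    munmap((void *)p, st.st_size);
    return 0;
}
```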
"Kode Vicious, known to mere mortals as George V. Neville-Neil,"
> * The socket API as applied to networking is empirically designed for arbitrary numbers of peers -- there are unlimited real world examples. The socket API as applied to local IPC is of course similarly capable.
I think "the [BSD] socket API" was designed for a thousand or so peers (FD_SETSIZE). That's why the API evolved. Now (2023) there are a lot of APIs and different choices you can make (blocking, polling readiness, sigio, aio, iocp, iouring), none of which is best at everything, but the non-POSIX apis are much faster than the POSIX api -- especially the ones that are harder to mix with the POSIX api.
> "We should be able to assume that the data we want exists in main memory without having to keep telling the system to load more of it."
> Yes, this is exactly what mmap() already does. mmap() is part of POSIX. Use it!
This is not what mmap() does, but the opposite. mmap() sets up page-tables for demand-paging. Those little page-faults actually trigger a check (demand) to see if the page is "in memory" (this is what Linux calls the "block cache"), an update to the page-tables to point to where it is and returns (or kick off a read IO operation to the backing store). These page-faults add up. What the author is looking for in Linux is mremap() and Darwin is mach_vm_remap() and definitely not in POSIX.
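For the Linux case, a sketch of the call I mean (Linux-specific, not POSIX; assumes glibc and _GNU_SOURCE, error handling elided) -- grow a mapping in place rather than tearing it down and faulting a new one back in:

```c
/* Sketch (Linux, not POSIX): grow a mapping with mremap() instead of
 * unmapping and re-faulting. Error handling elided. */
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void) {
    size_t old_len = 1 << 20, new_len = 16 << 20;

    char *p = mmap(NULL, old_len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    memset(p, 'x', old_len);

    /* The kernel rewires the page tables; nothing is copied
     * byte-by-byte by the caller. MREMAP_MAYMOVE lets it relocate. */
    char *q = mremap(p, old_len, new_len, MREMAP_MAYMOVE);
    printf("%c\n", q[old_len - 1]);    /* old contents still there */

    munmap(q, new_len);
    return 0;
}
```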
> * Shared memory - mmap() is about as generic of a memory sharing interface as one could hope for. It certainly works just fine with arbitrary numbers of clients. There are again countless examples of using shared memory with arbitrary participants -- for example all the various local file databases: sqlite, leveldb, berkeley db, etc.
But none of those "local file databases" can handle hundreds of thousands of clients, and aren't great even for the low hundreds(!), that's why most "big" database vendors avoid "just" using mmap(), and instead have to perform complex contortions involving things like mach_vm_remap/mremap/O_DIRECT/sigaction+SIGSEGV. userfaultfd and memfd are other examples of recent evolutions in these APIs.
You are looking for truth? Evolving APIs are evidence that people are unhappy with these interfaces, and these newer APIs are better at some things than the old, demonstrating that the old APIs are not ideal.
So we have evidence our (programming) model is not ideal, are we to be like Copernicus and look for a better model (better APIs)? Or are we to emulate Tolosani and Ingoli?
> You are looking for truth? Evolving APIs are evidence that people are unhappy with these interfaces,
That tracks, but this does not necessarily follow:
> demonstrating that the old APIs are not ideal.
I've been around the tech industry long enough to see several cycles of practitioners going back and forth between technologies and approaches. E.g. strong type systems (C++, Java) -> loose type systems (Python, JS) -> strong type systems (Rust, TypeScript). And each time the tide shifts there are always plenty of arguments trying to show why the previous approach was objectively worse and the new one is better.
That's not to say that nothing is moving forward, but the fact that people are unhappy with a technology doesn't mean that technology is objectively inferior to a proposed replacement. Sometimes it's just fashion.
> That's not to say that nothing is moving forward, but the fact that people are unhappy with a technology doesn't mean that technology is objectively inferior to a proposed replacement. Sometimes it's just fashion.
I apologise for writing so much that you couldn't read all of it, but this part, whilst buried at the end of the first paragraph, was really important:
> > the non-POSIX apis are much faster than the POSIX api -- especially the ones that are harder to mix with the POSIX api.
That they are better in this objective way is why this particular kind of unhappiness demonstrates what I say, even if you're not otherwise interested in people being happy.
" think "the [BSD] socket API" was designed for a thousand or so peers "
I think a more accurate framing is that in the 80s POSIX was designed for a thousand or so peers, and later versions have expanded and scaled accordingly. Literally every interface has these types of growth patterns so this shouldn't be a surprise.
As I noted below, we have semi-standardized extensions from POSIX. We don't really need POSIX to codify a new event system because we have libev/libevent. This is another point the parent article gets wrong when it says we need to get POSIX "off our necks." Nowhere does POSIX compliance stand in the way of using libev, or even using mremap(). I don't agree with your conclusion around mremap() above but it doesn't matter because the salient point is that mremap() works absolutely fine within the paradigm offered by mmap() and POSIX. libev works just fine within the paradigm offered by the file/socket api.
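To illustrate, the canonical libev pattern (a sketch, assuming libev 4.x): the same loop runs on epoll, kqueue, or plain poll depending on what the host provides, all without POSIX ever having codified any of it:

```c
/* Sketch: the standard libev readiness loop (assumes libev 4.x).
 * libev picks the best backend (epoll/kqueue/poll) at runtime. */
#include <ev.h>
#include <stdio.h>
#include <unistd.h>

static void stdin_cb(EV_P_ ev_io *w, int revents) {
    char buf[256];
    ssize_t n = read(w->fd, buf, sizeof buf);
    (void)revents;
    if (n <= 0)
        ev_break(EV_A_ EVBREAK_ALL);   /* EOF or error: stop the loop */
    else
        fwrite(buf, 1, n, stdout);
}

int main(void) {
    struct ev_loop *loop = EV_DEFAULT;
    ev_io stdin_watcher;

    ev_io_init(&stdin_watcher, stdin_cb, /* fd */ 0, EV_READ);
    ev_io_start(loop, &stdin_watcher);
    ev_run(loop, 0);
    return 0;
}
```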
"But none of those "local file databases" can handle hundreds of thousands of clients"
This is absolutely not true. Those systems work just fine with even a million reader threads.
Maybe you mean the databases don't perform well under parallel writes, but this has nothing to do with the shared memory interface (how could it?). It's purely due to the design and intent of each system as primarily read-focused systems. mmap will work just as well as any other shared memory interface when it comes to reading and writing. Pick nearly any write-performant database - it's using mmap.
"Evolving APIs are evidence that people are unhappy with these interfaces"
Not quite. The actual truth is that POSIX plays very well with extensions to its interfaces and this is yet another dimension in which the above article is demonstrated to be very, very silly.
> This is absolutely not true. Those systems work just fine with even a million reader threads.
I suppose it depends how you define a "client". If you define it as a query (or an HTTP request or something like that), then I suppose you're right, but you're also not mmap()ing all the time, which is what we're actually talking about here. But if you define it the way databases traditionally do, it is itself a full-featured application, so 100 clients doing 1m queries a second is 100m queries per second, and no, mmap() isn't fast enough to do that, and since we're talking about mmap() I didn't think I needed to explain that.
I must be reading a different comment, because I don't understand exactly what that has to do with anything, but I think if you're not reading "every word" of the article, maybe you're not reading "every word" of my comments either. I was only under the impression you thought the sockets API and mmap() were ideal interfaces.
What position do you think I have that you think I need to change?
> Not quite. The actual truth is that POSIX plays very well with extensions to its interfaces and this is yet another dimension in which the above article is demonstrated to be very, very silly.
I don't understand. That POSIX is so big is one of the first complaints in the article, given by implementors of embedded systems, and the "extension-like" nature of POSIX works directly against that. Then, once you have this nice "extension" like iouring with its provided buffers and files -- which is basically a whole io-state-machine operating system in zero system calls -- maybe it can become part of the POSIX standardization process and make POSIX better, or maybe it just becomes popular enough that there's a kevent ring extension for the BSDs and a libiocp4 that unifies the differences, and you think that's all basically POSIX anyway, so what's the point?
Well, the point is that shouldn't stop people from imagining a system that didn't do anything else -- how much smaller the code would be, how much cheaper the hardware (less RAM, less ROM), and maybe, if you can organise the kernel with this model in mind, how much faster it would be. These sorts of ideas you can really only explore once you've mentally prepared yourself for there not being a POSIX.
The article defines client as another thread or process - the argument is that these interfaces only support two processes and not many. There isn't any functional limit on how many processes can map the same region of memory.
"but you're also not mmap()ing all the time, which is what we're actually talking about here"
No, I don't think that's at all what we're talking about. We're talking about the ability of the architecture to scale -- not the efficiency of calling one specific syscall.
"I was only under the impression you thought the sockets API and mmap() were ideal interfaces."
I haven't said that they're ideal, only that they're demonstrably capable of enabling more than just two processes to interact.
"and you think that's all basically POSIX anyway, so what's the point?"
The point is that the article's conclusion that POSIX is preventing progress is nonsense.
As you say, standards emerge.
"how much smaller the code would be, and cheaper the hardware (less ram, less rom), and maybe if you can organise the kernel with this model in mind, how much faster it would be."
These aren't points raised by the article -- in fact so far we've been discussing the opposite (extensions and features, not cutting features).
Could things be cut from POSIX? Sure. But is implementing stuff like select() really a "yoke?" It's not exactly difficult, and I don't think you or anyone has articulated how it might hamper the inclusion of more advanced interfaces. The resources needed to implement POSIX are not significant.
What specifically do you think POSIX's architecture is preventing us from implementing?
> The article defines client as another thread or process
It does no such thing. The word "client" appears once in the text, and little is said except that it isn't a thread, because thread support is fraught with peril and often poorly provided for by programming languages.
> I haven't said that they're ideal, only that they're demonstrably capable of enabling more than just two processes to interact.
So was the carrier pigeon, but sometimes we want to send messages faster than that.
> These aren't points raised by the article -- in fact so far we've been discussing the opposite (extensions and features, not cutting features).
>... about the ability of the architecture to scale -- not the efficiency of calling one specific syscall.
> What specifically do you think POSIX's architecture is preventing us from implementing?
Erm, they are the points raised by the article, that's kind-of why I think you still haven't read it yet. Here's the paragraph right after the one where it defines a client as not-a-thread (and after talking about one specific syscall):
The death of any of these systems is when a user asks, "What about Posix? How can I port my XXX program to run on your system?" Providing Posix-like semantics in these systems is their death knell, because the Posix way of thinking is so narrow and providing its varied illusions requires so many hidden changes and adherences to age-old assumptions that there is no way to have a flexible innovative system and serve the Posix elephant. This means that no matter how innovative and clever a system is, once it is tainted by Posix support, it becomes just another Posix system.
> But is implementing stuff like select() really a "yoke?" It's not exactly difficult, and I don't think you or anyone has articulated how it might hamper the inclusion of more advanced interfaces
Could you show me? I don't think it's possible to implement a select() that works with Linux's fancy new iouring that uses the provided files and buffers features easily, but I'd like to be proven wrong.
It does. It says quite clearly, quote "If you look at most inter-process communication mechanisms, you can see that they are most often used by only two programs — usually a client and server." Two processes on a local machine, using shared memory or other IPC. IPC, not network calls.
"So was carrier pigeon, but sometimes we want to send messages faster than that."
Don't be disingenuous. You haven't articulated a faster mechanism and in fact there isn't one.
"Erm, they are the points raised by the article,"
No, they are not. Based on your replies it seems you may fundamentally not understand the points raised by the article.
"Could you show me? I don't think it's possible to implement a select() that works with Linux's fancy new iouring that uses the provided files and buffers features easily, but I'd like to be proven wrong."
You've already provided an example. The Linux kernel provides both select, a posix interface, and io_uring. Nothing about providing posix interfaces precludes providing specialized interfaces as well.
> It says quite clearly, quote "If you look at most inter-process communication mechanisms, you can see that they are most often used by only two programs — usually a client and server." Two processes on a local machine, using shared memory or other IPC. IPC, not network calls.
Two processes! Not two threads!
> Don't be disingenuous.
How would you know? You still haven't read the article, so you're arguing with a straw-man!
> You haven't articulated a faster mechanism and in fact there isn't one.
"la la la my fingers are in my ears I can't hear you!" is the best you got? You seem to admit iouring exists, isn't POSIX, but what? You just don't believe it's faster now? Bull shit.
> You've already provided an example. The Linux kernel provides both select, a posix interface, and io_uring. Nothing about providing posix interfaces precludes providing specialized interfaces as well.
If you think a reimplementation of Linux is "easy" you need to have your head examined: Microsoft has near-infinite money and eventually gave up, deciding POSIX was too hard, and just implemented another hypervisor.
And I wrote "thread or process" because, architecturally regarding IPC, the distinction is unimportant.
"How would you know?"
Due to the inaccuracy of your analogy, with regard to actual systems interfaces.
"You seem to admit iouring exists, isn't POSIX, but what? You just don't believe it's faster now? Bull shit."
io_uring is not faster than shared memory via mmap.
"If you think a reimplementation of Linux is "easy" you need to have your head examined"
I think now we're touching on the limits of your experience and expertise. Implementing select(2) on Linux is indeed quite simple. Have you ever read through select.c?
Have you ever even implemented a posix subsystem? I have.
"Microsoft has near-infinite money and eventually gave up, deciding POSIX was too hard, and just implemented another hypervisor. "
This isn't how things work on Windows.
As an aside, I've noticed that your ignorance has shifted into rudeness. Don't bother to respond again. This conversation is over.
So POSIX as a technology may be lacking, but honestly what's impressive about it to me is that it's also a standard -- not perfect there either, of course.
I think any conversation about replacing POSIX must include an effort to update or replace the standard. Deprecate the parts that no longer work well and add things that other systems have de-facto agreed upon, at least.
Kids today: want to come along, throw away everything they don't understand, and rebuild what was already there but worse. Accomplishing negative impact.
I like how the Chesterton's Fence example is actually a strong argument for not unquestioningly leaving the fence in the middle of the road.
If there is no documentation on it, and no one around to proactively explain why it's there, then the most economical way to see if it had any positive value is to stash it and see what happens.
That doesn't mean it's always the most rational thing to do, of course. Throwing POSIX away doesn't qualify as a situation where documentation is lacking or an obvious purpose is missing.
The fallacy is well confined in these two sentences:
>However, before they decide to remove it, they must figure out why it exists in the first place. If they do not do this, they are likely to do more harm than good with its removal.
Actually, assuming you don't know the consequences of removing the fence, you also have to assume that you don't know the consequences of leaving the fence in place, which might just as well turn into a dramatic situation. What is for sure, if nothing around can explain its purpose, is that it was placed there by people too negligent to leave the right clues about it. So, who knows what level their laxity might have reached?
Remove the fence unless someone knows why it was put up in the first place, and the rationale still holds.
Story time: at my previous job we had an old server called "the black box" which hosted some old cronjobs and various services from about a decade earlier. There was no documentation and everyone who had worked on it had left years earlier. The ops team dutifully kept it online but otherwise didn't touch it, and no other team owned it. In short, it was the ideal candidate for your "reverse Chesterton's fence" strategy. At some point we had to delve into what was actually on it, and it turned out that both company payroll and a service responsible for about 50% of company revenue were on that box. Simply removing it to see what happened would have easily cost a few million in revenue, even if it could be reinstated from backup a day later (and I'm doubtful we could have restored it at all without serious data loss). Now obviously having such a server is a bad practice in the first place, but if you find yourself in such a position "just kill it and see what happens" is definitely not the most economical option in all cases.
In the case of POSIX it might hold, since probably the worst case is that some volunteers throw away a few years of their life hacking away on yet another OS that will never gain traction. Humanity spends a lot more time on a lot less worthy endeavors.
Around here, on HN, Chesterton's fence pops up so many times as some type of ageless nugget of wisdom.
But another way to look at it is that it's just the lazy guy's excuse to allow bit rot. Because, let's be honest, we only care about Chesterton's fence in the context of software development.
Who among us has not worked with software with dark corners, with files with no commits in the last 10 years, with playbooks that people just go through mechanically, because there's no one from the "old guard" who remembers why things are the way they are?
Who would not love to live in an alternate reality, where each Chesterton's fence was removed at the first opportunity, rather than left there to burden future generations of unlucky toilers? Where code is clean, documentation is up to date, tests complete, colleagues who know what's going on? Well, that alternate reality does not exist, of course.
But at least let's stop thinking we are wise whenever we postpone deprecating something we don't understand, just because some semi-famous essayist came up with a nice figure of speech one hundred years ago.
It is completely crazy to expect people to "pro-actively explain why it's there" -- how do you envision this? A plaque at every fence with an explanation? A person standing by the fence every day, just in case someone wants to tear it down? Do you expect each shared object to be labeled and documented?
Whoever wants to tear the fence down has to do basic research -- visit city hall for the plans, talk to old-timers, check the old newspapers, etc... If they do that, it qualifies as "understanding". And yes, maybe the fence was put up by Crazy Old Bill who was paranoid about the neighbour's goats... then yes, tear it down. But make sure it really was him first, and not some better reason.
> assuming you don't know the consequences of removing the fence, you also have to assume that you don't know the consequences of leaving the fence in place,
Note that at some moment there was no fence, and yet someone expended non-trivial effort to put the fence up. This means someone had a reason to do so, so whoever that past person was, they knew more than you do right now. Therefore, unless you believe that you are smarter than most other people, you should defer to their past decision.
(And if you believe that you are smarter than most other people, and yet want to tear down the fences you don't understand... I have bad news for you :) )
Well, for every fence in the middle of a path that has, for example, commemorative value, a plaque seems a good start. And "warning: do not remove" is a rather common thing in places where danger is otherwise expected.
>Whoever wants to tear the fence down has to do basic research -- visit city hall for the plans, talk to old-timers, check the old newspapers, etc...
I think that qualifies as "someone knows why it was put up in the first place, and the rationale still holds".
>Therefore, unless you believe that you are smarter than most other people, you should defer to their past decision.
I don't think smartness matters that much here. It's all about whether you blindly rely on everything, invariably, in a cargo-cult fashion, or keep critical thought alive when it seems possibly relevant. Someone can be extremely smart and still rigorously follow the cult of the day.
-----------
Some thought experiment now. :)
Say you won a new house, congratulations!
In the middle of the garden, crossed by a path, there is a fence with no door, easy to circumvent, serving no apparent purpose. The plans you were given with the house, duplicates of those from the city hall, don't mention it. Now, there might be many explanations for this situation. One possible rationale would be that the fence used to be larger, possibly even forming an enclosure some time in the past. But no one knows. Will you leave the fence disrupting the path in the middle of the garden? Will you call in some expert to diagnose the soil, just in case something might collapse should someone remove the fence?
Also, in the middle of the living room there is a thick wall up to the ceiling, with large spaces to pass on both sides. It's not mentioned in any plan. Having a unified big living room would be far nicer for you. Will you destroy the wall without delay? Or first call an expert to determine whether this is a load-bearing wall, without which the house would collapse?
So this is really largely dependent on context, as usual.
> Actually, assuming you don't know the consequences of removing the fence, you also have to assume that you don't know the consequences of leaving the fence in place, which might just as well turn into a dramatic situation.
That assumption reads as false to me. There should generally be historical data around the costs and consequences of maintaining the fence. E.g. if you're shutting down a server, it's generally easy to figure out how much the hardware costs, how many man-hours you spend on maintaining it, etc. None of that requires knowing why the fence exists.
Those costs and consequences tend to be small, because large costs have to be continually validated as a good use of resources so they have a living memory of sorts.
> What is for sure, if nothing around can explain its purpose, is that it was placed there by people too negligent to leave the right clues about it.
Again, I don't think this is sure. I've seen lots of cases where docs probably did exist at some point, but the writers left and successors lost the knowledge of where the docs are or failed to migrate them to a new system or etc. They're lost to attrition, not to negligence.
> Remove the fence unless someone knows why it was put up in the first place, and the rationale still holds.
It doesn't, because it's trading known risks for unknown and unbounded risks. That's usually a bad trade, especially when the known costs tend to be small. Nobody wins a promotion for causing a million dollar outage while trying to get rid of a $50/month maintenance cost.
Ok, these are all good points, thank you for sharing them. Just like the ones by WJW.
More broadly, my underlying point was more like "don't blindly follow the herd".
If there is a small black box in a multi-million-dollar generator stream process, of course no one wants to be the guy unplugging it just to see what happens.
On the other hand, if really no one knows the real impact of such a black box, then given the resources at play, risk management should prompt an audit and possibly a managed replacement. To my mind that sounds very different from "no one knows why this thing is there, and those who dare to ask will be chastised with a burden of proof to carry alone".
Why are you putting the onus on someone else to provide clues that you understand?
You still don't escape the requirement to understand the problem before fixing the problem. Even if the explanation for the current solution is lacking, you still need to understand enough to articulate why it's lacking.
Part of the process of erecting your own identity, both individually and as a generation, involves flatly refusing and denying the practices of your immediate predecessors (eg: your parents' and their generation of men, source code written by programmers before your time, etc.).
It's stupid and almost always leads to unintended (and usually negative) consequences, but it's probably something strongly ingrained in our instincts because it rears its ugly head every time there's a generational turnover.
If you do not understand why this happens, do not call it stupid. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to call it stupid.
It's stupid in the sense that it's a waste of time and resources.
Sooner or later after you've had your fun refusing and denying your predecessors, you realize why they do things the way they do, become wiser, and go back and undo and redo all the fuckery you caused with your refusing and denying.
It would be nice if we could get that realization without first having to fuck everything up, y'know?
Even in this description you use the words "you realize" and "become wiser" and then describe it as "a waste". It's not a waste, it's just the cost of education.
As the saying goes: "good decisions come from experience, and experience comes from bad decisions". It would be great if we could all come out of the womb with enough education and experience to function as responsible adults in modern society, but we don't. Society (rightfully) allows people some leeway in doing dumb shit so that they can learn firsthand why it's dumb.
The experience is absolutely not a waste, don't get me wrong. What I'm calling a waste are the time and resources spent undoing and redoing everything.
It's all cute if this just happens during our childhood years, where most actions hold no real consequence. It's when this extends into the real world with very real consequences that it gets really wasteful.
Well we do this for many situations where it does actually matter. Drivers license requirements force people to "do it right" the first time, you can't (legally) fuck around driving a car on public roads until you eventually figure it out. Airline pilots and nuclear power plant operators have even stricter requirements.
Licensing and training has a very real cost as well of course, and for many topics the cost of forcing everyone to get a license would outstrip the benefits. So you either have to spend on training, or you have to spend to fix the mistakes of untrained people, or a mix of both. There is no way to bring the waste to zero.
Surely if you don't understand why it's stupid, you need to go away and think about why it's stupid, once you do see why it's stupid, I may allow you to challenge people calling it stupid.
The problem with making a shell more like a "real" interpreter REPL is the purpose of a shell: Running arbitrary programs not known to the author of the shell, such that any unknown word in a shell script is probably the name of a program as opposed to a misspelled reserved word or variable name. Those programs having arbitrary command-line arguments and having to operate on files with arbitrary names is another complexity.
It's interesting to see how so many things grow into each other. A shell is what you say, but there are many efforts to change that. Bash has its magic autocomplete which knows about options of many, many programs. Powershell has an object-model with knowledge of many of its commands.
But why? Storage is much cheaper today than it was in the 80s. While I myself don’t like huge Electron apps, trying to save every byte of storage is simply not worth it.
Many embedded systems hardly have any shell, and even when they do, it isn't bash (which isn't POSIX anyway), rather a tiny subset with the basic tooling for maintenance.
I think it depends on your experience and what you have gone through. In my case, many of the decisions I made that looked excellent to me back then turned out to be ok-ish at most.
It's probably worth noting that the author is in his 50s and is fairly familiar with POSIX (he has been involved in the FreeBSD developer community for decades).
(smirk) You've noticed that there are lots of kiddies, getting all hot under the collar, who haven't noticed that a counter-example does not refute the global mindset that is Posix ;)
It's fun to build stuff. It's not nearly as fun to use and maintain what other people have built. Why should all the fun be for the people who just so happen to have been born decades earlier? The point of life is not to maximize efficiency in all respects, and people make emotional choices to achieve personal fulfillment.
There's a difference between (re-)building stuff for fun, for learning purposes, or even to use in production in your own projects, and Dunning-Kruger-style demanding (to exaggerate) that existing solutions providing stable APIs across multiple OSes be thrown out, causing a massive amount of forced work (i.e., not fun) for others while getting a worse end result.
Also, it can be fun to explore existing ideas and build upon those too - nobody forces you to maintain anything existing "for fun".
My problem with this is that I feel like we (people coming along later) have three options when it comes to designing systems given the existing state of posix / linux / the web / etc:
1. Build inside the ecosystem as it was designed. Eg, make a static webpage using HTML. Make a linux binary using the standard linux tooling + apt (or whatever). This is almost always a good idea when you can, but sometimes the ecosystem is a bit shonky, or misaligned with what you're building. Eg, I want my program to have a fixed execution environment and instead I have debian and macos. I want a react-like API but the browser doesn't provide that. Etc.
2. Build a new thing on top of the ecosystem as it exists. Eg, build docker on top of linux. Build web frameworks. Invent the web browser, or couchdb views, or electron and build your software on top of that. The problem with this approach is that if each generation layers their own rubbish on top of the underlying layers then software will keep getting slower, more complex and more buggy over time.
3. Change the underlying system to work how you want it to work. Instead of using docker on top of ubuntu, replace ubuntu with nixos. Instead of using dpdk for higher storage performance, add io-uring to the linux kernel.
I think I would prefer it if more people did option 3 rather than option 2. I think there's lots of ways modern computing systems could be better. I'd rather if linux gained the ability to run binaries from the web (for example) than we invent a new kind of binary in userland (docker containers) and run those. I don't want people in 100 years to still be using linux (with all the bad decisions it contains), but with 10 more layers of different generations' ideas stacked on top. (Oh, its linux - but with docker, and then most people run this specific container to run languageX, and that lets us run electo2050 apps, which in turn you can use to run IRC. It only uses 50 gigs of ram. So small and light!)
Option 3 may require us to throw out or deprecate parts of the POSIX API. That may be overdue.
You present 3 independent choices, but surely these are stages.
1. Can you do X with existing tooling?
If not
2. Can you add a tool to your set to do X?
If not
3. Get a new tool set.
2 should also serve as testing, to show the utility and iron out the kinks of things that make it to 3. Jumping straight to 3 likely means you end up with an ill-thought-out solution that soon needs another 3.
Sorry; that’s not quite the point I’m making. The difference I want to emphasise between 2 and 3 is whether the new tool is layered on top of the existing tool, or made by modifying the existing tool. If you want a new approach to storing data, do you make it by writing a new kernel module, or by doing some new thing in userland on top of the filesystem?
All of docker’s features could be built into the Linux kernel. And it would probably be better as a result, and have ancillary benefits to existing Linux programs. But docker was built as a separate userland tool instead.
Building a new thing is usually easier, since you don’t need to dive into the politics of an existing community. But if each new generation of software engineers builds new software by layering on top of the old stuff, then we end up with layers on layers of cruft and our computers run slower and are more buggy as a result.
The POSIX apis aren’t very good in lots of ways. With a lot of complex effort in userland I can make a database with atomic write semantics. But maybe we should just add atomic file operations to Linux instead. In the long run I think it would be safer and result in faster programs in most cases.
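For example, the userland contortion for a single atomic file replace looks roughly like this today (a sketch; names are placeholders, error paths abbreviated):

```c
/* Sketch: "atomic" file update under POSIX -- write a temp file,
 * fsync it, then rename() over the target. Names are placeholders. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int atomic_replace(const char *path, const char *data, size_t len) {
    char tmp[4096];
    snprintf(tmp, sizeof tmp, "%s.tmp", path);

    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;
    if (write(fd, data, len) != (ssize_t)len) { close(fd); return -1; }
    if (fsync(fd) != 0) { close(fd); return -1; }  /* data is durable */
    close(fd);

    /* rename() is atomic with respect to other observers of `path`.
     * A fully careful version also fsyncs the containing directory. */
    return rename(tmp, path);
}
```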
Yes. Docker has proved there's utility in what it does. Assuming kinks in the implementation have been ironed out, now is the time to start thinking about making it a kernel module.
If you just put every half thought out, poorly implemented idea straight in the kernel, you'd have a 5 petabyte kernel.
This opinion piece is the epitome of a negative impact work. It's easy to accuse random things of being all wrong, but it's far harder to actually present something you feel is right.
Not every criticism of something has to be accompanied by a better solution. Pointing out that something has shortcomings is the first step. It’s perfectly fine to say, “That person’s singing is awful,” without also adding, “They need to work on their breath control.” You don’t need to be a great singer to know when someone’s doing a bad job at it. (And many people who are great at what they do are terrible at teaching others to do it, anyway.)
"This food is terrible" merely needs to be true, and anyone can say it, even people who aren't chefs. This is a ridiculous barr that only sorta kinda sounds reasonable if you just want an excuse not to like the message or the messenger.
> "This food is terrible" merely needs to be true, and anyone can say it, even people who aren't chefs.
If you bitch about how "this food is terrible" but you do nothing to make it more palatable or order something else and instead just stand there eating while complaining and reordering and asking for seconds... The food is not terrible, and you just enjoy bitching.
I would have warned you about the pot hole that destroyed my car instead of letting you get nailed by it too, but that would have just been bitching about something without fixing it, and I know you don't have any use for that.
Then why do companies constantly ask me to "score this phone call", "score this driver", "was this person helpful?", "what did you think of this recommendation?", "could you spend some time to answer this questionnaire about what you thought of that widget you bought"?
So to me it seems like criticism without suggestions is very valuable, it is called data. Suggestions from amateurs are usually bad and misleading, just highlighting a problem so that the expert can come up with a solution is the most efficient way to work in most cases.
I suspect a lot of that data provides very little value.
If you don’t know why the customer was unhappy, what do you change? You can fire the customer service rep, but then what do you do when the next one gets reviews just as bad because they are following the same script?
> Pointing out that something has shortcomings is the first step.
Talk is cheap. If even that same critic feels that his solution to his problem is not worth pursuing, the criticism adds nothing and creates no value. At best it's nitpicking about stuff that falls in Eisenhower's "not important"/"not urgent" box.
and the old folk that created all these things don't bother to explain them, then get uppity when someone new comes along and doesn't think their creations are perfect.
It's not that hard; it's actually easy. Very easy to get into a devolved state where the shortest path back to sanity is to clone the whole repo from scratch.
It's easy if you always do exactly the right thing in the right order and never make mistakes, but if you for example want to undo a git merge[1], god have mercy.
When in doubt, I create a new branch from the main one and test how it goes when merging the one holding the changes. If something goes wrong, at least the main branch is unaffected.
Of course, you must already have some intuition that something could go wrong. :D
Yeah, KV is GNN's vehicle for excessively controversial (i.e., what I might call "bad") takes.
I don't think he's wrong that the POSIX synchronous IO model isn't a great fit for modern hardware, though. He doesn't really go into it much but (Windows) NT's async-everything IOCP model really seems to be the winner, as far as generic abstractions that have stood the test of time.
Concurrency is hard. If you don't have anything better to do but wait on a single file operation (which, in a CLI tool, you might not), then a synchronous call is just fine. If you do have multiple I/O operations to issue at once, that can still be synchronous with writev().
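For example, a writev() sketch (error handling elided) -- several buffers gathered into one synchronous syscall rather than N separate write() calls:

```c
/* Sketch: one synchronous gather-write with writev(2). */
#include <stdio.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

int main(void) {
    const char *hdr  = "HTTP/1.1 200 OK\r\n\r\n";
    const char *body = "hello\n";

    struct iovec iov[2] = {
        { .iov_base = (void *)hdr,  .iov_len = strlen(hdr)  },
        { .iov_base = (void *)body, .iov_len = strlen(body) },
    };

    ssize_t n = writev(STDOUT_FILENO, iov, 2);  /* one call, two buffers */
    fprintf(stderr, "wrote %zd bytes\n", n);
    return 0;
}
```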
Most forms of async programming are also unstructured, which is bad for correctness, but also for performance since it can lead to priority inversions.
Yeah, I agree that's an area that needs work. I admit I didn't bother to read every word, but I did search for "async" to see if he mentions aio(7) to complain about it and I saw that he didn't.
Don't get me wrong, there are lots of room for improvements in POSIX IO interfaces. For example, POSIX doesn't define any modern event systems (epoll, kqueue, etc). But what's the result? We use libevent/libev.
This sort of seems like the result he's asking for in diverging from POSIX, which we ... already have. There are a lot of pretty great non-posix standard interfaces available too!
> I don't think he's wrong that the POSIX synchronous IO model isn't a great fit for modern hardware, though.
Opinions don't really matter. Solutions to real-world problems do.
To me, the fact that no one up to this day felt that the issues in this area were significant enough to warrant a fix or an alternative tells me that it's noise about nothing.
> To me, the fact that no one up to this day felt that the issues in this area were significant enough to warrant a fix or an alternative tells me that it's noise about nothing.
Do you not view io_uring or Windows IOCP as alternatives? Or the plethora of other non-POSIX IO extensions in Linux, such as splice and sendfile? SPDK and DPDK? There's also Google's Fuchsia operating system, which is not Posix. I think it's pretty clear that people are investing in a variety of alternative approaches.
I mean if what they want is a better standard, so there can be a dozen compliant operating systems we can all compile our stuff for, then maybe.
To look at things from another angle, what if having had POSIX all these years actually made things more uniform, in the sense that we didn't have variance between systems as large as e.g. between a BSD Unix and a Lisp machine? What if that were our choices today. As it is the largest mainstream departure is maybe Windows.
> To me, the fact that no one up to this day felt that the issues in this area were significant enough to warrant a fix or an alternative tells me that it's noise about nothing.
yes, let's just ignore the mountains of people who end up literally reimplementing complete network stacks in userspace to handle events in a modern way
People/projects can and do implement userspace networking stacks. For example, gVisor implements its own userspace networking stack [1] isolated from the kernel networking stack.
Because usually they are implementing a subset of the network stack that meets their needs for communication within their internal network. Porting the entire kernel stack that actually handles all of the complexities and quirks of the open internet is a larger task than they needed to tackle.
> NT's async-everything IOCP model really seems to be the winner
The author talks about the plumbing-to-data-processing ratio. Let's take a hundred average programmers, let them plumb IOCP, and see how many of them can even get to the "data" part.
I really disagree with the sockets bit. Sockets are designed for networking, with a particular focus on IP. Not IPC.
I for one think that the 'absurd thing' is that IPC is not built into the OS as a core feature. That, and process isolation. Both sockets and shared memory are quite problematic, and the challenge of 'true IPC' that works nicely with threads etc. is real.
POSIX and its derivatives build IPC into the OS as a core feature. In particular, POSIX is built around memory mappings and file descriptor inheritance, which means it is extraordinarily easy to make processes communicate.
I honestly have no idea what you mean by this statement.
Unix domain sockets are (1) fast, (2) primitive (just a file), (3) widely available, and (4) can be used between multiple processes extremely easily (use SOCK_DGRAM).
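A sketch of point (4) -- the receiving side of a datagram Unix socket that any number of local processes can talk to (the socket path is a placeholder, error handling elided):

```c
/* Sketch: an AF_UNIX SOCK_DGRAM receiver; any number of peers can
 * sendto() the same path. "/tmp/demo.sock" is a placeholder. */
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

int main(void) {
    struct sockaddr_un addr = {0};
    char buf[4096];

    int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, "/tmp/demo.sock", sizeof addr.sun_path - 1);
    unlink(addr.sun_path);                 /* clear any stale socket */
    bind(fd, (struct sockaddr *)&addr, sizeof addr);

    for (;;) {
        ssize_t n = recv(fd, buf, sizeof buf, 0);  /* one whole message */
        if (n > 0)
            fwrite(buf, 1, n, stdout);
    }
}
```

A sender is just socket(AF_UNIX, SOCK_DGRAM, 0) plus sendto() with the same path -- no per-peer connection bookkeeping.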
I honestly have no idea why someone would think that sockets, which are designed for IP networking, are an ideal solution for IPC, or why someone would think that 'shared memory' is sufficient for sharing information when it's really only part of a solution.
If it were easy then everyone would be doing it the same way and there wouldn't be a discussion about it.
With shared memory in particular, the issue gets tricky around signalling and locking aka indicating to other processes when new info is available, and when another process is accessing the data.
If anyone took just a moment to design an API that was user/developer centric, and worked back from there, it would look materially different from what is provided today.
Sockets aren't designed for IP networking. They date back to a time when it wasn't even clear TCP/IP would win vs. IPX/SPX. Or DECnet. AF_UNIX sockets have things that IP sockets can't even dream of doing (the very poorly named SCM_RIGHTS). And sockets don't really cover IP networking all that well, e.g. MPTCP and SCTP shenanigans to get multi-address & multi-stream connections.
That said, both sockets and mmap are low-level APIs. A higher level "user/developer centric" API indeed looks different.
> With shared memory in particular, the issue gets tricky around signalling and locking aka indicating to other processes when new info is available, and when another process is accessing the data.
POSIX has mutex and condition variable, is this not enough to do those things?
As far as I know, a mutex is not a signal that data has been received, which would require some kind of threading / interrupt.
Moreover, it's a bit of a secondary thing.
The API would look something like a call function with a receive function/lambda on the receiving side, with memory possibly being 'handed over' or whatever. And so the level of abstraction would be slightly higher.
And it would be a no-brainer, well-established thing. Search for how to do that today and it's always complicated.
I’m afraid I don’t understand. If you want one process to signal to another that data is available, you have the receiver block on a condition variable. When the sender writes its data it signals the condition variable. This is all handled by the kernel, no signal/interrupt is required. Or if you don’t want blocking, you can have a shared ring buffer, with the two processes interacting with it atomically.
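A sketch of that first option, using the (optional) POSIX PTHREAD_PROCESS_SHARED feature over a shared anonymous mapping inherited across fork(); strict POSIX would use shm_open() where I use MAP_ANONYMOUS. Link with -pthread, error handling elided:

```c
/* Sketch: a condition variable shared between two processes via one
 * shared anonymous mapping inherited across fork(). */
#include <pthread.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

struct shared {
    pthread_mutex_t mu;
    pthread_cond_t  cv;
    int             ready;
};

int main(void) {
    struct shared *s = mmap(NULL, sizeof *s, PROT_READ | PROT_WRITE,
                            MAP_SHARED | MAP_ANONYMOUS, -1, 0);

    pthread_mutexattr_t ma;
    pthread_condattr_t ca;
    pthread_mutexattr_init(&ma);
    pthread_mutexattr_setpshared(&ma, PTHREAD_PROCESS_SHARED);
    pthread_condattr_init(&ca);
    pthread_condattr_setpshared(&ca, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(&s->mu, &ma);
    pthread_cond_init(&s->cv, &ca);
    s->ready = 0;

    if (fork() == 0) {                    /* child: the receiver */
        pthread_mutex_lock(&s->mu);
        while (!s->ready)
            pthread_cond_wait(&s->cv, &s->mu);
        pthread_mutex_unlock(&s->mu);
        printf("child: data is ready\n");
        _exit(0);
    }

    pthread_mutex_lock(&s->mu);           /* parent: the sender */
    s->ready = 1;
    pthread_cond_signal(&s->cv);
    pthread_mutex_unlock(&s->mu);
    wait(NULL);
    return 0;
}
```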
Completely agree. In particular, POSIX is built around the inheritance of file descriptors by their children, which means that it is extraordinarily easy to have sockets going between multiple processes. Moreover, it's entirely possible to send file descriptors over other sockets (SCM_RIGHTS). POSIX has robust IPC. I'm currently messing around with IPC on Windows... and wow, at the end of the day, even in 2023, UNIX et al are simply more advanced than Windows. It's unfortunate there's been absolutely zero groundbreaking discoveries or inventions in this field (OS dev), but the idea that we should throw away the state of the art simply because is just silly.
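For the curious, the SCM_RIGHTS send side looks roughly like this (a sketch; `sock` is assumed to be an already-connected AF_UNIX socket, error handling trimmed). The kernel duplicates the descriptor into the receiving process:

```c
/* Sketch: pass an open fd to another process over an AF_UNIX socket.
 * `sock` is an assumed connected Unix-domain socket. */
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

int send_fd(int sock, int fd_to_send) {
    char byte = 0;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    char cbuf[CMSG_SPACE(sizeof(int))];
    memset(cbuf, 0, sizeof cbuf);

    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = cbuf, .msg_controllen = sizeof cbuf,
    };

    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type  = SCM_RIGHTS;               /* "pass these fds" */
    cm->cmsg_len   = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &fd_to_send, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}
```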
POSIX IPC has withstood the test of time. There is no other system that offers as rich a set of primitives.
Since windowing is built into the NT kernel, it also has support for powerful clipboard operations as one crucial type of IPC in a graphical environment (see 1st link).
I’ve been developing professionally on Linux for >15 years, and while I do like its simple aesthetic, the consistency and power of NT kernel APIs are something I miss.
There are some legitimate issues here, and some ranting.
First, memory models. The author seems to be arguing for some way to talk about data independent of where it's stored. The general idea is that data is addressed not with some big integer address, but with something that looks more like a pathname. That's been tried, from Burroughs systems to LISP machines to the IBM System 38, but never really caught on.
All of those systems date from the era when disks were orders of magnitude slower than main memory, and loading, or page faulting, took milliseconds. Now that there are non-volatile devices maybe 10x slower than main memory, architectures like that may be worth looking at again. Intel tried with their Optane products, which were discontinued last year. It can certainly be done, but it does not currently sell.
The elephant in the room on this is not POSIX. It's C. C assumes that all data is represented by a unique integer, a "pointer". Trying to use C with a machine that does not support a flat memory model is all uphill.
Second, interprocess communication. Now, this is a Unix/Linux/Posix problem. Unix started out with almost no interprocess communication other than pipes, and has improved only slightly since. System V type IPC came and went. QNX type interprocess calls came and went. Mach type interprocess calls came and went. Now we have Android-type shared memory support.
Multiple threads on multiple CPUS in the same address space work fine, if you work in a language that supports it well. Go and Rust do; most other popular languages are terrible at it. They're either unsafe, or slow at locking, or both.
Partially shared memory multiprocessors are quite buildable but tough to program. The PS3's Cell worked that way. That was so hard to program that their games were a year late. The PS4 went back to a vanilla architecture. Some supercomputers use partially shared memory, but I'm not familiar with that space.
So those are the two big problems. So far, nobody has come up with a solution to them good enough to displace vanilla flat shared memory. Both require drastically different software, so there has to be a big improvement. We might see that from the machine learning community, which runs relatively simple code on huge amounts of data. But what they need looks more like a GPU-type engine with huge numbers of specialized compute units.
Related to this is the DLL problem. Originally, DLLs were just a way of storing shared code. But they turned into a kind of big object, with an API and state of their own. DLLs often ought to be in a different protection domain than the caller, but they rarely are. 32-bit x86 machines had hardware support, "call gates", for that sort of thing, but it was rarely used. Call gates and rings of protection have mostly died out.
That's sort of where we are in architecture. The only mainstream thing that's come along since big flat memory machines is the GPU.
Few if any programs actually used segmented memory in MS-DOS as segmented memory. C compilers of the era combined both fields into 20-bit pointers. Pointer arithmetic was kind of messy but had compiler support.
> Some supercomputers use partially shared memory, but I'm not familiar with that space.
Some supercomputers do have some shared memory architecture, but a whole lot more use Message Passing Interface (MPI) for distributed memory architecture. Shared memory starts to make less sense when your data can be terabytes or larger in size. It is a lot more scalable to just avoid a shared memory architecture and assume a distributed memory one. It becomes easier to program assuming that each thread just does not have access to the entire data set and has to send data back and forth between threads (pass messages).
>> Multiple threads on multiple CPUS in the same address space work fine, if you work in a language that supports it well. Go and Rust do; most other popular languages are terrible at it.
I find OpenMP for C and C++ to be simple and effective. You do have to write functions that are safe to run in parallel, and Rust will help enforce that. But you can write pure function in C++ too and dropping a #pragma to use all your cores is trivial after that.
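Something like this, a sketch of the "drop a #pragma" step (compile with -fopenmp; the loop body is pure, so the split across cores is safe):

```c
/* Sketch: parallelizing a pure loop with one OpenMP pragma.
 * Build with: cc -fopenmp sum.c */
#include <omp.h>
#include <stdio.h>

int main(void) {
    enum { N = 1 << 20 };
    static double a[N];
    double sum = 0.0;

    for (int i = 0; i < N; i++)
        a[i] = i * 0.5;

    /* Iterations are independent; the reduction clause handles the
     * one shared accumulator safely across all cores. */
    #pragma omp parallel for reduction(+ : sum)
    for (int i = 0; i < N; i++)
        sum += a[i];

    printf("%f (max threads: %d)\n", sum, omp_get_max_threads());
    return 0;
}
```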
So this author is (rightfully) getting a lot of hate. But, "What if we replaced POSIX?" is an interesting question to me. Most people interact with POSIX through the "good parts." But, once you need to write C, which happens when you need to make low-level calls to the OS, it starts to get a little annoying. The biggest annoyance is related to memory management. A lot of the OS APIs have manual free functions, e.g. you call `create_some_os_struct` and then `free_some_os_struct`. Similarly, you end up writing a lot of your own call/free functions. This is because you need to write C to talk to the OS, but your other code is probably not in C and may not have access to the same libc your C uses. So, you need to provide an escape hatch back into your C code to free any allocated memory.
Another annoyance is that passing data between C and another language is hard. For instance, if you want to pass a Swift string in to C, you need to be careful that Swift doesn't free the string while C is using it. The "solution" to this is to have explicit methods in Swift that take a closure which guarantee the data stays alive for the duration of that closure. On the C side, you need to copy the data so that Swift can free the string if you need to keep it longer than that one block. Going from C to Swift is also a pain.
A cool thought is: what if the OS provided better memory management? What if it had a type of higher level primitive so that memory could be retained across languages? For instance, if I pass a string from C to Go, why do I need to copy it on the Go side? Why can I not ask the OS to retain the memory for me? Perhaps we need retain / release instead of malloc and free. Anyway, just a random thought.
The problem with this thought is that malloc/free are not OS primitives, they are strictly concepts that make sense to your own program. Languages like Swift and Go never use these calls at all, for example. When Swift "frees" a string that was still being referenced from C, it's very likely not the OS that will mess with it, but other parts of the Swift program.
The way programs actually interact with the OS for memory allocation is using sbrk() or mmap() (or VirtualAlloc() in the case of Windows) to get a larger piece of memory, and then managing themselves at the process level.
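Roughly like this (a sketch, error handling elided): the kernel hands out one big anonymous mapping, and everything the program calls "malloc" happens inside it without the kernel's involvement:

```c
/* Sketch: how a runtime gets memory from the kernel -- one anonymous
 * mapping, carved up by the program's own allocator. */
#include <stdio.h>
#include <sys/mman.h>

int main(void) {
    size_t arena_len = 64 << 20;    /* a 64 MiB program-managed arena */

    char *arena = mmap(NULL, arena_len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (arena == MAP_FAILED)
        return 1;

    /* A trivial bump "malloc": the kernel never sees these calls. */
    size_t used = 0;
    char *p = arena + used; used += 4096;
    char *q = arena + used; used += 4096;
    p[0] = 'a'; q[0] = 'b';
    printf("%c %c, used %zu bytes of the arena\n", p[0], q[0], used);

    munmap(arena, arena_len);
    return 0;
}
```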
And having the OS expose a more advanced memory management subsystem is a no-go in practice because each language has its own notions of what capabilites are needed.
In my opinion, given that the C-ABI is pretty much the only cross-language interface for transferring in-memory data between components written in different languages, it is a pretty big blind spot not to include in a new language a C-safe way of transferring memory.
Yes, you can open a socket or file and transfer your data that way -- so someone already had to implement the memory-data-protection mechanisms for the language. Just expose them!
I don't think I suggested otherwise. I don't personally know of any non-esoteric language that doesn't support C interop - from Haskell to Common Lisp to Go to Java.
What I'm pointing out is that the kernel can't help with C interop in any way, since process memory is by-and-large not managed by the kernel - certainly not at the level of individual objects. The problem of sending a Swift string to C and making sure it doesn't get "freed" by Swift is that the Swift runtime might overwrite that string if it is "freed", even though it's still being used by the C code, not that the kernel might mess with it.
Even more, the risk when sharing memory with another language runtime is not even limited to freeing. Many common languages have GCs which move in-use objects around in memory. This obviously can't work if that object is actually shared with a different runtime. In those languages, the object must not only be protected from being freed, but also from being moved around -- hence the common concept of pinning it in memory.
I think this makes it even more obvious that the OS can't help with this task - unless we want the OS to standardize on a GC that it imposes on all programs, and essentially no one wants that.
So POSIX is both the libraries as well as the general system design. If you want to eschew all the POSIX libraries on most *NIX systems today (at least the open source ones), you can simply ... do that. In particular, the Linux kernel (and the BSDs are similar) make no assumption as to how you're managing user memory. You can call mmap to map pages and allocate memory as you like.
In fact, languages such as Go widely disregard libc (IIUC) and just roll their own thing. They still benefit from the POSIX semantics built in to the kernels that go programs run on.
At the end of the day, the main interface between POSIX kernels and userspace is a 32-bit integer (the file descriptor).
The letter isn't coherent, and I think people are reacting to that. The thesis of the letter is that "POSIX is the reason your code has annoying low-level plumbing and knobs needing attention," but doesn't explain how removing POSIX will help with that stuff. E.g. "new schedulers have to be built to handle the fact that memory is not all one thing." What does that even mean? Maybe he has a vision in his head, but it's not well articulated.
Also, the letter ends with this:
> If we are to write programs for such machines, it is imperative to get the Posix elephant off our necks and create systems that express in software the richness of modern hardware.
But such a system could still have annoying low-level knobs that need turning.
> The letter isn't coherent, and I think people are reacting to that.
I don't agree about the quality of the article, but it's the "hate" (your word) I was reacting to when I made my comment. Some of the reactions here are genuinely over the top. Add in the fact that the author of the ACM piece knows the POSIX subject matter a lot better than the people commenting, and it's just... classic late stage hn, I guess.
I’m sure he does know POSIX better than most, but that doesn’t mean we should laud any drivel he produces.
I think people are reacting to the provocative idea combined with lack of substance in the response. But the letter itself was probably an off-the-cuff remark and wasn’t intended to be consumed and debated analytically.
To those bashing the author as uninformed -- this is George V. Neville-Neil. Member of FreeBSD Core Team who wrote the book on FreeBSD. He might know a thing or two about POSIX! [1]
It's a bad article because it's too vague and doesn't clearly relate to the questioner's problem, not because the author doesn't have the proper pedigree.
One could certainly write good articles about why the POSIX API is too limiting. For example: the filesystem API is awful in many ways. I'll try to be a bit more specific (despite having only a few minutes to write this):
* AFAICT, it has very few documented guarantees. It doesn't say sector writes are atomic, which would be very useful [1]. (Or even that they are linear as described in that SQLite page, but the SQLite people assume it anyway, and they're cautious folks, so that's saying a lot.) And even the guarantees I think its language does make, like fsync ensuring that all previously written data for that file has reached permanent storage, systems such as Linux [2] and macOS have failed to provide [3].
* It doesn't provide a good async API. io_uring is my first real hope for this, but it isn't in POSIX. (See the sketch after this list.)
* IO operations are typically uninterruptible (NFS mounted with a particular mount option being a rare exception). Among other problems, it means that a process that accesses a bad sector can get stuck until reboot!
* It doesn't have a way to plumb through properties you'd want for a distributed filesystem, such as deadlines and trace ids.
* It provides just numeric error codes, when I'd like to get much richer stuff back. Lots of context in distributed filesystem cases. Even in local cases, something like where exactly path traversal failed. I actually saw once (but can't find in a very quick search attempt) a library that attempted to explain POSIX errors by doing a bunch of additional operations after the fact to narrow the cause down. Besides being inherently racy, it just shouldn't be necessary. We should get good error messages by default.
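Back to the async point: a rough liburing sketch of a single asynchronous read (Linux-only, not POSIX; error handling omitted, and the path is just an example):

    #include <fcntl.h>
    #include <liburing.h>
    #include <stdio.h>

    int main(void) {
        struct io_uring ring;
        io_uring_queue_init(8, &ring, 0);

        int fd = open("/etc/hostname", O_RDONLY);
        char buf[256];

        /* queue the read and submit it; no thread blocks on the IO */
        struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
        io_uring_prep_read(sqe, fd, buf, sizeof buf, 0);
        io_uring_submit(&ring);

        /* reap the completion whenever we like */
        struct io_uring_cqe *cqe;
        io_uring_wait_cqe(&ring, &cqe);
        printf("read %d bytes\n", cqe->res);
        io_uring_cqe_seen(&ring, cqe);

        io_uring_queue_exit(&ring);
        return 0;
    }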
It's a great article, and it raises many major issues with our current model of computing. But it's obviously triggering, and lots of people are rushing to defend their comfort zone.
Think outside the box people ... "files", what a charming but antiquated concept; "processes" and thus "IPC", how quaint!
- a windows-first programmer who sees interoperability and composition as an encumbrance that satisfies “nerds who like to do weird shit that i don’t understand in bash”
reading from stdin isn’t challenging, nor writing to stdout. if someone can’t imagine why that might be useful, then i’d argue their journey as a software engineer is either at its end, or right at its beginning.
at the cost of potentially sounding inflammatory, “get good”.
> reading from stdin isn’t challenging, nor writing to stdout
Every time I dabble in C, I need to look up what method I need to use these days. getline? scanf? Do I need to allocate a buffer? What about freeing it, is it safe to do so from another thread? What about Unicode support, can I just use a char array or do I need a string library for proper support? What's a wchar_t again and why is it listed in this example I found online? How do I use strtok to parse a string again?
Sure, these things become trivial with experience, but they're not easy. Other languages make them easier so we know it can be done, yet the POSIX APIs insist on using the more difficult version of everything for the sake of compatibility and programmer choice.
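For what it's worth, the least-bad answer to the stdin questions above is probably getline() (POSIX.1-2008), which manages the buffer for you - a sketch:

    #define _POSIX_C_SOURCE 200809L
    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        char *line = NULL;   /* getline() allocates and grows this */
        size_t cap = 0;
        ssize_t n;

        while ((n = getline(&line, &cap, stdin)) != -1)
            printf("got %zd bytes: %s", n, line);

        /* the buffer comes from malloc, so any thread may free it */
        free(line);
        return 0;
    }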
(Modern) C++ makes the entire process easier, but there are still archaic leftovers you need to deal with on *nix if you want to interact with APIs outside what the C++ standard provides. At that point, you're back to POSIX APIs, *nix magic file paths, and *nix ioctls. Gone are your exceptions, your unique_ptrs, and your std::string; back are errno and raw pointers.
Obviously an exaggeration, but when's the last time you checked the return value of printf? I know I don't. And that's not even a memory safety bug, just basic logic. I hope nobody trusts those guys around malloc and free :)
All of which is perfectly compatible with his being incompetent, or wrong about this in particular. (For what it's worth, I don't think he is incompetent.) But what he easily demonstrably isn't is "Windows-first", and I suggest that any mental process that led you to that conclusion needs reexamining.
I'm confused as to what exactly is wrong with the notion of _jobs_ and _files_? Scheduling is hard, but modern operating systems are definitely set up to do it. I think we could probably do a better job of using realtime features and maintaining update-latency benchmarks, but so many of the cycles on my PCs/mobiles are wasted doing god damned animations and updating the screen without interaction that I don't think this is really the main issue.
Programming is basically always just a matter of loading data, transforming data, and then putting it somewhere. The simplest record keeping systems do that, and the fanciest search algorithms do that. Decode/encode/repeat.
EDIT: The beauty of UNIX to me is the interoperability of the text stream. Small components working together. Darwinesque survival of the fittest command.
The beauty of UNIX to me is the interoperability of the text stream
What interoperability? Look at the man page for any simple Unix utility (such as `ls`), and count up how many of the listed command line flags are there only to structure the text stream for some other program. "Plain text" is just as interoperable as plain binary. "Just use plain text" is the Original Sin of Unix.
The Unix Philosophy, as stated by Peter Salus [1] is
1. Write programs that do one thing and do it well
2. Write programs that work together
3. Write programs to handle text streams because text is a universal interface.
The problem is that, in practice, you can only pick two of those. If you want to write programs that work together, and do so using plain text, then, in addition to doing its ostensible task, each program is going to have to provide a facility to format its text for other programs, and have a parser to read the input that other programs provide, contradicting the dictum to "do one thing and do it well".
If you want programs that do one thing and do it well, and programs that work together, then you have to abandon "plain text", and enforce some kind of common data format that programs are required to read and output. It might be JSON. Or it might be some kind of binary format (like what PowerShell uses). But there has to be some kind of structure that allows programs to interchange data without each program having to deal with the M x N problem of having to deal with every other program's idiosyncratic "plain text" output format.
Programmers tend to have a disproportionate affinity towards plain text, me included. But this is an intriguing argument so now I'm reconsidering. Maybe plain text is just someone else's unparsed junk.
This was a really insightful take on the Unix philosophy that I hadn't heard before but I intuitively agree with because of all the parsing code I've had to write.
There have been numerous projects which invent bespoke protocols for interoperability -- I give the examples of PowerShell, Elvish, and nushell.
(PowerShell doesn't use any kind of binary format AFAIK. I believe you are literally moving around .NET objects inside the CLR VM, and that representation is meaningless outside the CLR VM. This is crucial because it means that PowerShell must serialize its data structures for interoperability.)
As well as various Lisps.
The argument is how you interoperate between them. (Honest question -- please let me know.)
So ironically, trying to solve the interoperability problem in a smaller context CREATES it again in a bigger context (e.g. between different machines).
Bytes and text are fundamental because they reflect how disks, networks, and operating systems fundamentally work.
That does not mean we shouldn't have higher level layers on top of bytes and text, like JSON, HTML/XML, and TSV/CSV.
Those are structured data formats. You generally use parsing libraries for them, instead of writing the parser yourself.
Again, all of those formats ARE text, and that's a feature, not a bug!
The M x N interoperability problem is SOLVED BY building on top of bytes / plain text, not solved by moving AWAY from it!
I never argued that we should move away from plain text. Indeed, one of the examples I cited of an interoperability format, JSON, does exactly that: it takes plain text and adds structure to it to make it easily parseable by machines. Overall, I'm not sure what part of my post you're arguing against. I'm suggesting that instead of having many different ad-hoc formats, Unix utilities should agree on a few accepted serialization formats and pass information around using those. This would make our shell pipelines less complex and more robust because we wouldn't have to worry about e.g. random spaces or newlines breaking the ad-hoc parsers we write with `grep` and `cut`.
It seems like you agree with that, with the caveat that that serialization format should be built on top of plain text. That's fine. We can agree on JSON as a serialization format. It's not ideal, but it's better than the myriad of ad-hoc formats that we have now.
Sorry if I came off as "excited" -- the main thing I'm not clear on is the framing of M x N with respect to parsing.
From my post:
> I also claim that parsing is an O(M + N) problem, while types can create O(M × N) problems — and often do.
So the key point is that writing M + N amounts of code is tractable, but M x N is not. This was captured in the linked "Unix vs. Google" video as "coding the perimeter" vs. "coding the area".
This is not to say that SOME people won't be annoyed by writing M + N parsers! :) Or that writing parsers is easy. It's only claiming that it's tractable for global interoperability.
-----
Of course you could be referring to some different M x N issue, with respect to a specific application, but I don't really see it, and I'd claim it isn't the dominant issue with interoperability and POSIX.
To bring it back to the original subject, POSIX addresses the M applications x N hardware platforms issue (and again, the claim is that it's a compromise, not that it's optimal).
And of course POSIX APIs are text- / byte-oriented, unlike past and future operating systems.
It tames a code explosion and makes systems feasible. But yes, I definitely agree that we need more structure on top of text. My view is that we need a language with first-class support for serialized tables (TSV/CSV), records/objects (JSON), and documents (HTML/XML).
CSV in particular is very sloppy, and we should take at least as good care of our "languages for data" as we do languages for code (Go, Rust, Python, etc.)
The only thing I really take away from the UNIX philosophy nowadays (I used to be a dyed in the wool fan of UNIX/Linux) is #1) do one thing and do it well. I see #2 as an ideal goal to reach but not always required. And #3 is nowadays untenable for me. If we can agree on an object exchange format (something PowerShell seem to have solved in part), then we can do much much more than relying on text streams.
> The only thing I really take away from the UNIX philosophy nowadays (I used to be a dyed in the wool fan of UNIX/Linux) is #1) do one thing and do it well. I see #2 as an ideal goal to reach but not always required.
If you have a number of programs, each of which does one thing and does it well, those programs will need to exchange data between themselves in order for the overall system to be useful to the user. To go back to the `ls` example I used in my post above: `ls` should just list files. Why should `ls` have anything to do with sorting, when `sort` exists? The reason, as it stands right now, is that `ls`'s plain text output is too much of a pain to parse, and so it's more convenient to build sorting into `ls` itself. If you start with "do one thing and do it well", but ignore interoperability, then your program will inevitably grow additional options and subcommands until it does one thing well, and quite a lot of things mediocrely. Instead of a collection of small sharp specialized tools, you'll end up with, e.g. `find`.
Maybe it was written as a parody and we just don't get the joke? (Not POSIX, the article. POSIX survives fine outside of UNIX in embedded systems and somewhat on Mac OS and Windows.)
macOS at least is definitely POSIX compliant - not just somewhat. It was a big part of what was needed for Xserves to be competitive (alas, being shiny and aluminium was not :) ). While Xserves have died, macOS remains entirely POSIX compliant - complete with the horrific %n format specifier (although, IIRC, an environment variable is needed for it to be respected in non-read-only format strings).
It's arguable that macOS is actually POSIX compliant. I was frustrated that poll() simply refused to work on terminal devices in Mac OS X 10.1 and I had to use select() instead. And it's still not supported in macOS 13!
Can you claim POSIX compliance while actually not complying, by saying "BUG: currently doesn't comply" for more than 20 years?
The idea that your computer would be faster or easier to use if it didn't have any animations seems untrue. Providing physicality is good! Helps you understand how things are changing between two different states.
Plan 9 enhances all of these good Unix traits. Even the universal text stream: it adds support for arrays / lists, that is, streams with elements larger than one byte.
GNN once told me about how he had to work on Plan 9. It's an interesting topic. It's good to see how other people think about this CS topic and see if you can borrow some ideas.
It is hard to believe that this bunch of drivel was actually available from acm.org. If a 16 year old programmer came to me with this nonsense, I might take the time to gently point them in a few directions. Being on the acm.org site ... unforgivable.
And not even the obvious faults, pointed out by others here. There's the question of an apparent complete ignorance of OS research, for example systems that rely on h/w memory protection and so can put all tasks (and threads) into a single address space. But what's the actual take home from such research? The take home is that these ideas have, generally speaking, not succeeded, and that to whatever extent they do get adopted, it is incremental and often very partial. If you don't understand why most computing devices today do not run kernels or applications that look anything like the design dreams of 1990-2010 (to pick an arbitrary period, but a useful one), then you really don't understand enough about computers to even write a useless article like this one.
That's a pretty bizarre answer to what was a pretty reasonable question. I don't even see how the question and answer are related honestly. Surely the question is more about finding a better storage format for the initial ingestion or data storage and has little to nothing to do with POSIX.
Most languages have a way to slurp a file into memory in a single function call, after all. The fact that files exist shouldn't be a barrier here.
They’re using Python and C/C++ to speed up the slow bits. Can’t imagine anything more portable than that, without even having to know what POSIX is doing under the library abstractions.
When I was programming with Java EE, I was surprised how much effort had to be put in to produce the production artifacts: .war and .ear archives and so on. I had to use a build tool called Ant. I also had to write Java, JavaScript, CSS, and HTML.
On the C-side it is "make" or something similar. It means you must master multiple languages, some of which may be statically typed while others are not.
I assume integrating C with Python is similarly lots of overhead which in principle you shouldn't have to do. Why can't I just write everything in a single language and be done with it? Why do I have to write a program that transforms my source-code modules into an executable?
Software is so huge that it would take you a lifetime of programming from different perspectives to get a grip on what it really is. So we are all doomed to experience POSIX through whatever programming experience we end up getting deep in.
I feel the underlying problem with most software is that it's just too damn complex, so you can't fit enough of it in your head to design it how you think it should go. An average person can't go "Oh, I think the kernel should be able to do this" and then go whip it up and have an experiment running in a little bit. That's an esoteric corner full of tons of specialized and arcane knowledge that, truth be told, is completely invented. And half of that invention is workarounds for other bad inventions.
I dunno enough to just pronounce doom on POSIX, but I do feel like the rickety C way of doing things (everything centered around intricately-compiled machine code executables, incredibly dainty fragile eggshells that shatter and spill their entire guts of complexity on the world) underpins a ton of the problem.
The number of years you would need to just read, let alone grok all the hundreds of millions of lines of code that run on our system is just beyond human lifetimes now.
Well that was far less controversial than the comments here suggested.
I read a call for innovation, the general thrust of which is based on the (obvious?) argument that if you build every new system to be compatible with the old one you limit the capabilities of the new.
Of course, the customer requesting that compatibility gets what they want -- an easier time building/porting software.
Alternate reading: ~POSIX (or just having a standard) is so useful that everyone wants it everywhere all of the time and I want something better.
Yay - I too want something better. It will be initially difficult, completely incompatible. But I hope that, someday, only computer historians will discuss "files" and "ports".
> there could be better primitives for communication than a byte stream. Why can't I just send another process a chunk of memory?
shm APIs have made this possible for decades.
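As a minimal POSIX sketch (error handling omitted; older glibc needs -lrt), any number of processes can map the same named region:

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        /* any process that shm_open()s "/demo" sees the same pages */
        int fd = shm_open("/demo", O_CREAT | O_RDWR, 0600);
        ftruncate(fd, 4096);
        char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
        strcpy(p, "a chunk of memory shared by name");
        puts(p);
        shm_unlink("/demo");   /* drop the name; mappings stay valid */
        return 0;
    }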
> the posix byte stream API is bizarrely based on a flat buffer instead of a ring buffer, which is just the obviously wrong data structure IMO.
so obviously wrong that the terabytes of existing POSIX-compatible code is broken, or limited, or something?
> Too much of the threading API is preempted threads, when cooperative scheduling (aka yielding) is so much easier to implement and understand.
if you want/need co-routines use them, and leave threads for the domains of programming where preemption is a critical part of the model.
ps. this sounds a bit more personally critical than I intend. I'm only trying to point out flaws that I see with these 3 points, not trying to suggest anything about you as the person who made them.
> so obviously wrong that the terabytes of existing POSIX-compatible code is broken, or limited, or something?
You could say the same thing about the terabytes of existing Javascript code, and most people will agree that Javascript has more than its share of "obviously wrong".
> so obviously wrong that the terabytes of existing POSIX-compatible code is broken, or limited, or something?
it is just a really annoying API for a byte stream, when what you typically want to do with a byte stream is:
* read some number of bytes (lines, a parser, etc), incrementing the ring tail as you go.
* have the buffer asynchronously be filled, incrementing the (volatile) head as it goes
You might be able to just execute a single syscall "fill till eof" and the kernel would just keep your buffer full (through interrupts) without any other calls at all (although you would have to wait if the buffer became empty).
Similar mechanisms could be done for interprocess communication.
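A toy sketch of the data structure being described (single producer, single consumer; names are illustrative, and a real version needs atomics or futexes rather than volatile):

    #include <stdint.h>

    #define RING_SIZE 4096                 /* power of two */

    struct ring {
        uint8_t buf[RING_SIZE];
        volatile uint64_t head;            /* bytes ever filled in */
        volatile uint64_t tail;            /* bytes ever consumed */
    };

    /* consumer side: pop one byte, or -1 if the filler hasn't caught up */
    static int ring_pop(struct ring *r) {
        if (r->head == r->tail)
            return -1;
        uint8_t b = r->buf[r->tail % RING_SIZE];
        r->tail++;                         /* frees space for the filler */
        return b;
    }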
> - there could be better primitives for communication than a byte stream. Why can't I just send another process a chunk of memory?
Isn't this just
write(fd, buffer, size_of_buffer)?
I'm not sure why everyone talks about byte streams being the basis here. If fd is a socket in SOCK_DGRAM mode, then buffer is received wholesale, never split. Bytestreams are not the fundamental abstraction. Files and sockets are.
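A quick sketch of that boundary-preserving behavior with a datagram socketpair (error handling omitted):

    #include <stdio.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void) {
        int pair[2];
        socketpair(AF_UNIX, SOCK_DGRAM, 0, pair);

        const char msg[] = "one whole chunk";
        write(pair[0], msg, sizeof msg);   /* one datagram, never split */

        char buf[64];
        ssize_t n = read(pair[1], buf, sizeof buf);
        printf("received %zd bytes in a single read\n", n);
        return 0;
    }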
> - the posix byte stream API is bizarrely based on a flat buffer instead of a ring buffer, which is just the obviously wrong data structure IMO.
Once again, this depends on the mode of the socket. you can put a Unix socket into a mode where it starts dropping packets when the queue is full.
> - Too much of the threading API is preempted threads, when cooperative scheduling (aka yielding) is so much easier to implement and understand.
Cooperative scheduling was part of POSIX: setcontext, makecontext, getcontext, and swapcontext (marked obsolescent and dropped from POSIX.1-2008, but still shipped almost everywhere).
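A minimal sketch of those calls: main and a coroutine hand control back and forth explicitly, with no preemption involved (ucontext is deprecated on some platforms, e.g. macOS, so treat this as illustrative):

    #include <stdio.h>
    #include <ucontext.h>

    static ucontext_t main_ctx, coro_ctx;
    static char stack[64 * 1024];

    static void coro(void) {
        puts("coro: first slice");
        swapcontext(&coro_ctx, &main_ctx);   /* explicit yield */
        puts("coro: second slice");          /* returning follows uc_link */
    }

    int main(void) {
        getcontext(&coro_ctx);
        coro_ctx.uc_stack.ss_sp = stack;
        coro_ctx.uc_stack.ss_size = sizeof stack;
        coro_ctx.uc_link = &main_ctx;        /* where coro lands on return */
        makecontext(&coro_ctx, coro, 0);

        swapcontext(&main_ctx, &coro_ctx);   /* run coro until it yields */
        puts("main: between slices");
        swapcontext(&main_ctx, &coro_ctx);   /* resume coro to completion */
        puts("main: done");
        return 0;
    }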
It's time for these people to get off their high elephant and write or fund a substitute, instead of doing nothing and whinging constantly just for the sake of gathering imaginary internet points.
The author of this piece is a long time FreeBSD contributor[1]. Just maybe that person has relevant experience guiding their thoughts? He certainly is not "doing nothing and whinging constantly". Posix is not a religion - it's ok to look at other ideas, to listen to critiques. I mean Linux does it all the time - io_uring, all the ebpf stuff, a dozen forms of kernel bypass suggest to me that the posix interfaces aren't sufficient, or right for modern hardware, at least not in all cases.
and yet somehow, he barely manages to mention a single non-POSIX OS API in the entire piece, either existing or posited. He writes as if we're living under the spell of POSIX ("the elephant") when in fact it is perfectly clear that we are slowly, cautiously, prudently, incrementally adopting new models, some of which complement the existing POSIX API and some of which do not.
All the things I listed are ways one OS has gone away from POSIX for high-end server use cases. For some systems I work on, those have actually simplified a lot of stuff - I mean they are complex APIs solving hard problems, but it doesn't feel like fighting the OS to use them.
The author is writing in the context of people who use code very differently than I do - they think about it differently too. Perhaps there are different types of API for their use cases that would mean less "fighting with the OS" for them - or, more likely, for the tool makers who build the intermediate code/frameworks.
> For some systems I work on, those have actually simplified a lot of stuff - I mean they are complex apis solving hard problems, but it doesn't feel like fighting OS to use them.
I am entirely in favor of domain-specific APIs that "make hard problems easier to solve". But it is important to recognize two things about this:
1. the "complex APIs" that are right for solving one set of hard problems are frequently not the right ones for a different set of hard problems.
2. POSIX never made any claim to universality in the domain of being the right API for a given problem class. As someone else noted here in the comments, it was a bunch of people & organizations simply agreeing on the low level stuff they could agree on.
I'd make one additional point: unless the plan is to reimplement the kernel for every problem domain, which seems pretty crazy, the actual situation faced by many/most real world programmers even in a better world is going to be "APIs we want to use layered on top of APIs we (mostly) don't want to use". This is necessarily so, because the APIs that make it possible to implement the APIs that whiners like the author of TFA want to use, along with the APIs that domain specialists want, are generally APIs like POSIX that people don't seem to warm to very strongly.
1. Yes, that's something i recognized explicitly above.
2. Sure, but that provided enough foundation for other ideas like epoll/kqueue to spread between OSes. It happened because everyone saw the strange non-POSIX thing (I think at Sun?) and made a similar system. Even though they aren't directly API compatible, they were conceptually close enough that people ended up building nice layers on them (libuv, mio, etc). Maybe that would be enough.
Perhaps even, the right answer to several of the points raised is in fact io_uring like apis spreading to other OSes too, so that a reasonable cross platform abstraction layer can be built. Perhaps it's something different.
Perhaps it's creating a new standard, or a new POSIX version or addendum or whatever.
All of this aside, I'm not here to champion the specifics. I'm pointing out that it's a reasonable discussion to have, there's no need to be upset at someone for pointing it out.
On the other hand, I see quite a few projects challenging The Way Things Are Done (Rust, NeoVim, Fish, OilShell, etc) to which there is a lot of kicking and screaming that things are fine. No need to change.
> I see quite a few projects challenging The Way Things Are Done (Rust, NeoVim, Fish, OilShell, etc) to which there is a lot of kicking and screaming that things are fine.
What you often see are half-baked, trivial re-implementations of some parts of existing utilities and then the proclamation "Rust is here and it's so much better!".
A program that amounts to some undergrad's semester project isn't going to be taken seriously - and they often are not. The existing utilities are decades old, are very good at what they do, and rarely are the cause of the supposed bugs we're trying to avoid.
Shell replacements, such as OilShell, can gain traction on their own, because a shell is mostly system agnostic. The struggle here is reaching critical mass of users, while also supporting all of the millions of existing shell scripts out there in the wild.
No one is going to make your new wiz-bang shell the system default on a popular distro without it being completely compatible with existing shell scripts, documentation, examples, etc. It's just a non-starter...
> No one is going to make your new wiz-bang shell the system default on a popular distro without it being completely compatible with existing shell scripts, documentation, examples, etc. It's just a non-starter...
False. zsh is not POSIX nor fully compatible with bash, while being default login shell in popular distros like Kali and Deepin (and macOS).
Not sure if Garuda Linux passes the "popular" threshold for you but their default is fish.
Shell scripts typically lead with a shebang, making your login shell inconsequential as long as you have the target shell available.
> False. zsh is not POSIX nor fully compatible with bash, while being default login shell in popular distros like Kali and Deepin (and macOS).
Kali is a purpose-built distro that is not a daily driver for nearly anyone.
Deepin is a Chinese distro, and while it may be popular to some, it is not a popular distro at large.
My point remains standing.
> Shell scripts typically lead with a shebang, making your login shell inconsequential as long as you have the target shell available.
Of course - but having multiple shells installed is not common. What's the point of using another shell if all of your scripts have to use bash anyway? You are still effectively using Bash, and have unintentionally exposed the problem with all non-POSIX compliant shells - they don't allow people to get work done.
> Excuse me, how is that in any way relevant here? Fuck off with that.
What on earth are you talking about?
Deepin Technology is a wholly owned subsidiary of UnionTech, a Chinese company based in Wuhan China.
It is a Linux distro developed in China for a domestic audience, i.e. used predominantly in China by Chinese citizens. It is a Chinese Linux distro...
It may be popular in China, but it is not globally popular. Very few people would willingly choose to run an operating system developed in China for domestic consumption, for very obvious reasons.
Pretend to be offended by something else...
> Run 'chsh' and tell me how many options you see.
Ya... only bash available on my system... a default installation of Rocky Linux.
Usually people try to ensure they're correct before saying someone is wrong.
> Ya... only bash available on my system... a default installation of Rocky Linux.
Rocky Linux, as you know, is a RHEL clone. RHEL is a very conservative distro; their original market was refugees from Solaris and other commercial UNIX distros. And I say that as a diehard fan of a Red Hat/Fedora minimal installation as both a server and development platform.
Other less commercial distros (Debian and derivatives, Arch, Gentoo, etc) tend to offer more shell options out of the box, as do most of the BSDs. And other options are not hard to install from the Red Hat/Fedora repos, as you would know.
> What you often see are half-baked, trivial re-implementations of some parts of existing utilities and then the proclamation "Rust is here and it's so much better!".
Rust itself is an example. The concepts it's based on predate it by decades. When it comes to safety it's not as good as other languages, and yet...
One would argue that it's time to build something new. Just look at the massive popularity of VS Code - people clearly want the modularity and power of Vim but none of the cruft or baggage.
You don't have to do that. If you're writing against Linux specifically, and your use case could make great use of Linux-specific features, use them! I've certainly done that in the past.
But I've also written a whole lot of software which really couldn't care less about the OS. It needs some way to list directories, some way to read and write files, maybe some way to communicate over the network, but everything interesting happens in code that's not related to interacting with the OS. In those cases, I love that I can just write against one API and know that, if it doesn't already just work on all posixy systems, it's at least going to be easy to port.
Part of working on something is thinking it through. Part of working on something… as eminently social as interoperable systems with thousands of disparate components and many more thousands of contributors to some parts of that milieu… is socializing the thought process you’re taking to approach whatever part(s) of the Problems you’re trying to address.
Discussing these things earnestly and with considered and well reasoned inspection of what is, and why it is, and what could be… all of that is doing something. Maybe quite a lot in fact. Certainly it’s doing more than efforts like this, to quash or interrupt that public sharing of thought process.
“Doing” is an incredibly broad category of action. But it very seldom includes categorically dismissing ideas on the basis they haven’t already produced some other more concrete deliverable.
Don't disagree. I'm not dismissing the sentiment in general. But these kinds of articles have been posted here hundreds of times over the years, these thoughts are not new, and the fact that they fail to gain enough effective traction in the real world and are just posted here for brownie points is where it starts getting grating.
For every one person who posts one of these, there are 100 people cheering them on, and 10,000 developers in the real world who just learnt to deal with POSIX and its warts years ago, just want a paycheck, and will wait until a better API gains traction in the real world before they go asking their employer to bet the farm on it.
The thing I’m objecting to is framing the discussion as brownie points. For something or some things this complex, you either can’t or more likely should not proceed without a lot of talk. Here is a great place for productive conversation to start to form consensus. It can’t do that if planning and philosophizing and design and critique of ideas is precluded.
You're getting downvoted but you're not wrong. This article also ignores the fact that we had rich data-structured storage systems and transparent resource sharing, both built into the VMS system. POSIX-style systems absolutely destroyed the market for them, because at the end of the day they're relatively inflexible and "open, read, write, close" is sufficient to build whatever else you need.
Speaking as a working data scientist, even well-structured data usually needs heavy manipulation before it's suitable for use, and modern datasets absolutely do not fit into the system memory of a typical workstation. Secondary storage is now just an nvme cache for what we might as well call "tertiary storage," which is in the cloud, and just as unreliable as the old pallet full of tapes (not because the vendor is screwing up, but because there's a lot of network between here and there).
The problems haven't meaningfully changed in practice, they've just sort of shifted around so that different ones loom large. Unless you win the lottery and find a consistent, reliable source of clean data, POSIX isn't ever going to be the bottleneck.
EDIT: HN won't let me reply, so I'm editing this post in response to haunter's question below:
For the filesystem stuff, https://en.wikipedia.org/wiki/Files-11 and be sure to read the section about "Record Management Services" which sort of blurs the line between database and filesystem.
For resource sharing: https://en.wikipedia.org/wiki/VMScluster and remember that thanks to record-based file storage, locking can happen on the record level -- imagine instead of having to lock a whole JSON document for writing, you could lock just the key/value pair you were updating, and other programs could read/write the rest of the file safely!
POSIX was largely an attempt to get everyone to agree on the bits that everyone agreed on, which turned out to be not a lot beyond the basics.
There was no political will to standardize the layers above POSIX and the standards that attempted to do that are dead and buried (in some cases for good reasons).
20 years later there still isn't the political will towards standardization.
The original article is correct in that our programming models do need to be updated, but I think it's a stretch to blame it on POSIX.
Don't know about VMS, but I've used "data-structured systems": old Macs with resource forks, and Symbian with its databases.
Thank god they are gone.
I don't want to have to learn a special archival tool just for one system. I want to be able to take any data file and back it up, make a copy, send it to a different system, encrypt, archive, checksum, rename, delete, compare...
If you want a database instead of a file, there are plenty! An SQLite library running over a POSIX kernel is superior to an SQLite-like database in the kernel, especially in terms of compatibility and features. Your "record level locking" is already possible; it's called "a database", and it is fine with POSIX.
> For resource sharing: https://en.wikipedia.org/wiki/VMScluster and remember that thanks to record-based file storage, locking can happen on the record level -- imagine instead of having to lock a whole JSON document for writing, you could lock just the key/value pair you were updating, and other programs could read/write the rest of the file safely!
That could exist right now.
But either you want to think about the "plumbing" under the hood that makes that possible, or you don't. One way or another it has to be there. Even as hard as Apple has tried to pretend that their OS(es) don't have a BSD/POSIX/Mach kernel, most developers (certainly the ones who have to pay attention to cross-platform development) end up bumping into it somewhere along the way.
> Secondary storage is now just an nvme cache for what we might as well call "tertiary storage," which is in the cloud
This might be true for some sort of computing, but it is absolutely not the general rule, and I would wager that it never will be. There's nothing you can do on the cloud that you cannot do on local storage except provide remote access more simply.
Every single smartphone on the market uses internet-based storage as tertiary storage. You can absolutely do data manipulation on these data stores independently of client device operation, and most do (photo resolution downgrades being the most common I am aware of).
And I agree about the plumbing, but this is a discussion around the environment of primitives commonly available, and I'm merely pointing out that richer environments than posix have already failed out of the market once.
> we had rich data-structured storage systems and transparent resource sharing, both built into the VMS system. POSIX-style systems absolutely destroyed the market for them
Can you expand on that part? Sorry I’m not an expert but curious about this part of the OS history. Or is there an article (Wikipedia) about it?
More accurately, it was cheap commodity hardware (PCs), and cheap software (open Unix source code, leading to Linux) that rapidly became powerful and "good enough" over time that eventually destroyed both the expensive workstation market as well as most very expensive high-end computing architectures.
Take VMS, for example: there is an OpenVMS finally, but decades too late, and it's still kept behind a gate, ensuring its demise.
Only mainframes survived from the previous era, probably due to their vastly different requirements and customers. Banks/govts for example, have no patience for "move fast and break things" and plenty of money to spare.
The "Worse is Better" piece describes the process from a design/implementation standpoint while the book the "Innovator's Dilemma" explains the process from an economics one.
> POSIX won because it was first. Simple inertia keeps it there.
UNIX only started wide distribution in 1975. The first POSIX standards were not proposed until 1988. To pretend otherwise is to attempt to rewrite history. There were complaints about UNIX API's long before everyone standardised on POSIX. Not before, not during, not after, have the elephant riders really gotten down and gotten to work. And where they have, as others have pointed out, it's things like io_uring which have emanated from the Linux community itself.
Embedded APIs may be better in many respects, but that does not mean that any particular vendor's implementation should be the new basis for a standard - especially if they cannot or will not support the existing deployments out there. Software developers love a clean slate; solution providers and integrators need to be able to deploy something to their customers and be able to guarantee interoperability etc. It's a pipe dream neutered by the practical reality of the real world.
Nobody is arguing that they are completely wrong - it's just that it's been giving off a "young man shouts at cloud" kind of smell for a long time, and it's not really that useful other than to give other people who have been hurt by POSIX a place to vent before they go off for another scrum, kanban, retrospective, or long lunch and forget about it until the next article.
> POSIX won because it was first. Simple inertia keeps it there.
Not true. OS/360 was released years before the first Unix (OS/360 in 1966; Unix development started in 1969). Multics' development also preceded Unix (indeed, the developers of Unix started by working on Multics).
It's easy to whine, but Unix's interfaces also work for a wide variety of situations.
If you think you have a better idea, implement it & try to convince others to use it. That's how things get better.
Yeah, io_uring is an interesting step in the right direction. There's also stuff like SPDK, which doesn't really satisfy the original submitter's desire for "less plumbing," but is certainly better at some workloads than traditional POSIX IO. And of course, Windows NT got this right (or at least, a lot closer to right) back in the 90s.
It's the right attitude, but you can't just start writing an arbitrary replacement and succeed. You have to understand real needs, prioritize things to make the scope realistic, and have some vision, besides technical chops. Then you have to battle-test it.
I suppose that a real force of change would gather around some real projects with specific needs, which eventually would produce a common, shareable "better alternative". Examples: io_uring, ZFS, containers, HTTP/2, every successful programming language, etc.
But that would require effort, organising and building!
I come here for the snarky comments and the “walking away feeling good” feeling, not actual action! /s
This post feels like it falls into the XKCD trap of “soon: there are 15 standards”
If you remove POSIX, what will replace it? If the answer is "nothing", then how on earth do you compete with a "build once, run 99% everywhere else" mantra?
You’d end up with IE6 and ActiveX all over again, or HFS resource forks
It never works like that. First there were implementations, often several, then they were standardized: POSIX, Common Lisp, HTML, JavaScript, etc.
Whatever will supplant POSIX is going to be something that will have been proven to work already, ready to be standardized. If it's not there yet, POSIX can reign supreme, despite all its (real) flaws.
You don't have to use just the POSIX-based APIs. There are io_uring, XDP, BPF, and more provided by Linux, for instance, exposing many more capabilities to userspace beyond POSIX-compliant APIs. You don't have to use sockets to do networking, and you can get asynchronous local disk IO. What is the author waiting for, if the POSIX standard being too restrictive is the root of all of their problems?
Kind of meta, but wow, "get the elephant off our neck" might be the first example of mixing three metaphors together. The only mention of "neck" that I can find is in the title, and the article only mentions getting it off our "backs", which is mixed with the metaphor of "elephant in the room". The only explanation I can think of for why "neck" would be used is the "albatross around your neck" metaphor.
Based on the other responses here, it sounds like the author should keep in mind that you can't get your elephant off your neck and eat it too, but maybe I shouldn't open that can of worms lest I attract some early birds.
Playing with Windows now for some cross-platform work, trying to port my Unix domain socket service and client over... What a mess. Win32 is an awful mess of an API. The named pipes are a sad shadow of domain sockets. Anonymous pipes are completely useless (can't run async), thus we have to name the copious number of pipes. What hell is this...
Funny that the author calls out Windows as being separate somehow. Hasn't the last 20 years of Windows been synonymous with adding more Posix-compatibility?
I have been unpleasantly surprised when I expect Posix semantics from Windows.
Think about parent PIDs. You can't assume a process's parent PID is actually still the parent.
Think about file open semantics. Try to open to a file when any other process has the file open with no sharing allowed.
Think about file delete. (Really, try to delete a file reliably in Windows.)
Think about string encoding and code pages. (Well, this is out of bounds for Posix, but closely related to expected filename semantics.)
Has it? From COM to C++/WinRT, I think Windows is moving further and further away from simple POSIX calls. Of course these platforms have mainly been designed to be used by other languages (VBScript to PowerShell) but if you want to use newer APIs in your C++ application, you're going to need to go through the new API.
Bolting Linux in a thin VM onto the OS has been done to facilitate web/POSIX devs that otherwise would buy a Mac or install Linux. It's a more modern take on bolting Windows Services for UNIX onto NT4 back in '99. The API of the Windows platform itself has only moved further and further away from POSIX as time went on.
Windows has ostensibly been, or at least has been claimed to be, “POSIX compliant” for much longer than that. I think however technically true that might be, a good heuristic for why Windows is “other” in this categorization is how frequently things are reported as broken in Windows environments when they don’t have any other platform-specific issues across many *nixes. Sure that divergence might not be specific to POSIX, but a more general “every other system from every other vendor agrees broadly on ___ but if you use Windows your environment needs special accommodations.” Anecdotally, the surface area of this divergence is basically equivalent to any project which constructs file paths, and/or executes subcommands, unless special care was taken prior to address Windows specifically.
And Windows is certainly entitled to its own particularities and idiosyncrasies! But it’s not entitled to have those and be regarded as not having them at the same time. That would be absurd.
> We should be able to assume that the data we want exists in main memory without having to keep telling the system to load more of it. APIs for managing threads and access to shared memory should be re-thought with defaults created for many-core systems, and new schedulers have to be built to handle the fact that memory is not all one thing. Modern systems have fast caches near CPU, main memory, and flash memory. Soon we'll have even more memory, disaggregated, which is faster than disk but slower than main RAM.
Isn't this what Intel was trying to do with Optane? What happened with that? It seems like a great idea.
Lots of issues but mostly it didn't meet the sky high initial expectations Intel set for the product. It might've met expectations with the current generation of the product combined with CXL, but now it's cancelled and we'll never know!
Rumor has it that Optane was barely producible at all, much less profitably. They bet on a shrink that never came. That could be because R&D screwed up (Intel's process teams have not exactly impressed in recent years), because management wouldn't bear the necessary costs, or because it was just too difficult to scale (rumors also say there's an element of truth to that, the Optane fab process was weird).
People love parroting the meme of "Unix is old therefore it must be bad, we need something newer and modern, redesigned from the ground up" but it's always hard to get a concrete proposal from these folks. This article does a lot of hand waving about what's wrong with Unix and what would be better.
For example, the part about how it would be nice to assume data exists in main memory, instead of accessing it through the filesystem? That's existed for nearly 40 years, it's called mmap, and every modern operating system implements it. You can trivially write a wrapper that automatically mmaps opened files if you want, but creating a MAP_FILE memory mapping is one of the easiest things you can do in C.
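The wrapper really is small - a sketch (error paths simplified; `slurp` is just an illustrative name):

    #include <fcntl.h>
    #include <stddef.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* map a whole file read-only and return it as plain memory */
    void *slurp(const char *path, size_t *len) {
        int fd = open(path, O_RDONLY);
        if (fd < 0) return NULL;
        struct stat st;
        if (fstat(fd, &st) < 0) { close(fd); return NULL; }
        void *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        close(fd);                     /* the mapping outlives the fd */
        if (p == MAP_FAILED) return NULL;
        *len = st.st_size;
        return p;                      /* munmap(p, *len) when done */
    }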
Likewise the stuff about how IPC, sockets, and shared memory assume that a resource is only going to be shared between two programs. The whole statement is weird because all of those things can be used by more than two programs, and it's not clear what the author means by saying they're designed for a single receiver/sender.

The Berkeley sockets API is admittedly a bit intimidating at first, but that's because it covers a lot of functionality that people actually need. It can handle UDP sockets and TCP sockets, but also other protocols and more exotic things like netlink sockets and SCTP sockets, and it can be extended to things like John Ousterhout's proposal for replacing TCP, as evidenced by the fact that John Ousterhout actually implemented Homa on Linux.

Using shared memory on Unix isn't exactly simple, but that's because it's doing something complicated. When multiple processes want to share memory there needs to be some coordination to establish what needs to be mapped, to grant privileges, etc. The SysV shared memory APIs suck, but on Linux things are much better now with the memfd_create system call.

And the vast majority of people who need things like sockets or shared memory are going to be writing code in a higher level language anyway, using libraries that abstract the low level details. The article implies that it's a problem that people need to use these abstractions in the first place, but that seems asinine to me when there isn't an actual proposal of how things would be better, and the hand waving completely ignores the fact that realistically, low-level C APIs are needed to have any hope of writing interfaces that can be accessed by multiple languages.
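On the memfd point, a Linux-only sketch (error handling omitted): the fd can be sized, mapped, and handed to any number of peers over a unix socket (SCM_RIGHTS), so it isn't limited to two participants:

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        int fd = memfd_create("region", 0);  /* anonymous, nameless file */
        ftruncate(fd, 4096);
        char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
        strcpy(p, "visible to every process that maps this fd");
        puts(p);
        return 0;
    }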
Furthermore the design of Unix allows new system calls to be added. If you have an idea for something that is way better than sockets you're free to implement it and add new system calls for it. This is exactly why there are many ways to do the same thing on Linux (e.g. establish shared memory regions), because people come up with new and better ideas and the ones that are actually performant, useful, and secure are the ones that get merged and added to Linux. Maybe there really are some brilliant ideas for new ways of doing things that need a ground-up redesign and can't be bolted onto Unix, but if that's the case people should be able to clearly elucidate what those ideas are and WHY they won't work with Unix.
One final thing I'll say about Unix is that for all its warts, it's very fast. This is a point that's been made a million times before, e.g. in "Worse Is Better" and "The UNIX Hater's Handbook". There have been lots of interesting alternatives to Unix developed, but the reason none of them have gained traction is that at the end of the day no one wants to use something that is newer, less well known, less well tested, and significantly slower. Plan9 and Fuchsia are cool, but they are much slower than Linux and it's not clear if it's really possible to fix them. Users want their computer to run quickly, they want their phone to run quickly, and they want good battery life. Big companies want their applications to run quickly and spend as little on hardware as possible. All of these things mean that pretty much any Unix alternative is a non-starter unless it is at least close to the performance of existing Unix implementations, including Linux and macOS I suppose. A big explanation of the success of Unix has been that it has co-evolved alongside modern hardware. It's hard to do something that is radically different and takes advantage of modern hardware when you're writing a general purpose operating system (there are absolutely exceptions though for operating systems serving niche use cases).
I'm in the "we need something newer and modern, designed from the ground up" camp and also don't have a concrete proposal (just half-baked ideas). I'm just disappointed that with all the creative people out there in the industry, we keep just reimplementing Unix after all this time.
Have we truly reached a global optimum in system design with Unix? We're done? Forever? We're just going to keep writing new layers and APIs on top of it without even considering a redesign?
I'd like to see something different just for the sake of being different. Maybe it would be better, maybe worse, but we'll never know if we don't try.
I think nobody tries because everyone knows there’s a gigantic chicken or egg problem.
Everything is written for Unix-like systems (or Windows which is not that different) and everyone is not going to port all that software to a significantly different paradigm. So even if a new OS was great, the suspicion is that nobody would use it because no apps.
In reality I think good compatibility libraries could bridge the gap, so it would not totally be futile to try. But you’d need those compatibility layers out of the box.
Another big ugly hairy problem is hardware support. Existing OSes have vast libraries of drivers all built around POSIX-like paradigms. Getting good hardware support on a new OS is going to be a ton of work, maybe more than the OS itself.
Yes, also because the stuff has free implementations. For something to beat free it will have to be significantly better to put up with the inevitable annoyances until it matures.
I think the 10X rule applies here. It would have to be 10X better in a mixture of areas like performance, programmer productivity, ease of system administration, security, UI/UX, etc.
Weird how Windows is mentioned as one of the two programming models and then discarded. Obviously people have long-standing objections to commercial operating systems and Microsoft in particular, but it would be interesting to hear whether the many non-POSIX-y parts of Windows provide a better or worse model.
WaitForMultipleObjects() was a mini-masterpiece of API design in Windows that has taken decades to surface in POSIX-y systems, and even then not quite as nicely as Microsoft's original.
Erm, shouldn't this whole article be shortened to "I'm ignorant of Apache Arrow"?
It's one thing to complain when there are no choices. It's quite another when there seem to be genuinely good alternatives in the space (Apache Arrow is not even the only option).
Compliance with standards makes zero sense if those standards are not optimal, or if they are outright garbage. Evolution of almost everything in technology is faster than the evolution of standards. And many standards were written by people dealing with really shoddy old ways of doing things.
What’s most upsetting is the number of POSIX things that have been codified when there was likely little thought given to the original implementation. We have paved all the cow paths. Like, why can you still configure tar options without prefixing a hyphen?
Probably because a lot of scripts and other software has written that calls tar without a hyphen in the options, and it seems strange to break them for no other reason than it seems better to you. If you want to change behavior and make a better tar with saner option behavior, you can do so, just don’t call it “tar”.
I fully understand how we got here, but it is still frustrating. POSIX was not a gift from the gods, but a descriptive this-is-how-thing-are. There should not be any sacred cows, and there needs to be a way to eventually break backwards compatibility and remove the cruft.
> there needs to be a way to eventually break backwards compatibility and remove the cruft.
Actually, no. There's always a cost/benefit ratio. You can easily create a beautiful, elegant system by repeatedly forcing backwards incompatibility. In fact, that has been done - see the history of "Plan 9". Even the developers of Unix couldn't just throw away backwards compatibility in the name of elegance at any cost.
I think you do sometimes want to break backwards compatibility, but there'd better be a good reason for it & the pain needs to be minimized for users. If users repeatedly undergo pain from a system, they will stop being users.
One of the most important rules for the Linux kernel is "don't break userspace".
I think "zero-copy" is the actual gem which might actually move us closer to making more of our systems have "less visible" plumbing. There's always going to be plumbing somewhere and it's not so actually directly actionable to observe that we have "too much of it". There needs to be a signal to guide a process by which we remove unnecessary necessary plumbing to obtain benefit.
I think there are significant advantages that will accrue by moving toward system designs that encourage "more zero-copy plumbing" - I'm not totally clear on all the ways the POSIX legacy might prevent that, but the parts of POSIX that do prevent it will prove to be the most important pieces to shed, I think.
Consider mmap .. your perfect candidate, it would seem, to be the basis for zero copy everything. No more read/write, your files are just chunks of memory and if you are reasonably smart about it, you can end up with either zero or just single copy semantics.
Wheee!
Then you go and look at performance, and you start to notice how some programs seem to benefit from this design, and some suffer greatly for it. You start to realize that the OS's model of likely program I/O (because let's face it, your data is not actually in volatile RAM) is sometimes right and sometimes wrong. And so you start moving either towards a model that defines the data you want to read, so that the OS can do a better job of making it all work the way you imagine it should, or you start trying to bypass the OS entirely to avoid it interfering (O_DIRECT, dear friend). But then you start realizing that your lovely zero-or-low copy design doesn't play well with others, because the OS is still working for them, and you and the OS are now stepping on each other toes.
And so it goes.
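The "model that defines the data you want to read" usually starts with hint calls like these (a sketch; how much the kernel honors them is famously variable, and hint_sequential is just an illustrative name):

    #include <fcntl.h>
    #include <stddef.h>
    #include <sys/mman.h>
    #include <sys/types.h>

    /* tell the OS we'll stream through this mapping front to back */
    void hint_sequential(void *map, size_t maplen, int fd, off_t filelen) {
        madvise(map, maplen, MADV_SEQUENTIAL);            /* read ahead */
        posix_fadvise(fd, 0, filelen, POSIX_FADV_SEQUENTIAL);
    }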
Then you try to use mmap on your audio interface, precisely to avoid unnecessary data copies, and you realize you didn't even understand what memory or mmap actually meant at all.
One of the things which most fascinated me about the Mill architecture (don't bother checking for updates, it's as vaporware as ever) was the ability to pass arbitrary memory ranges across "process"/security boundaries, rather than being stuck with page granularity and requiring complex OS assistance. This would make zero-copy much easier to accomplish. Page-based memory protection is holding us back: pages are a clever solution to, well, paging to disk, but a remarkably terrible solution for everything else they're used for.
This reads like someone who's spent a lot of time thinking about the problem and a lot less time working on the problem.
The main gist is: Why is memory shared between two processes?
The answer is: with exactly one producer and one consumer, you can unblock a deadlock by just turning one of them on and off. In all other cases you can't. All the 'solutions' to that problem make assumptions about the state the processes share that are somewhere between difficult and impossible to keep valid in the real world.
In short: 2 is a magic number and it's not the fault of posix.
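To make the "2 is magic" point concrete, here is a hedged sketch of the one case that does compose cleanly: a single producer and a single consumer joined by a pipe, where the kernel's blocking semantics are the flow control, so pausing either side is always safe:

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        int fds[2];
        if (pipe(fds) < 0) return 1;

        if (fork() == 0) {              /* child: the producer */
            close(fds[0]);
            for (int i = 0; i < 1000; i++)
                /* write() blocks when the pipe is full: built-in
                 * backpressure, which only works with one reader. */
                if (write(fds[1], "x", 1) != 1)
                    break;
            close(fds[1]);
            _exit(0);
        }

        close(fds[1]);                  /* parent: the consumer */
        char c;
        long n = 0;
        while (read(fds[0], &c, 1) == 1)   /* blocks until data arrives */
            n++;
        printf("consumed %ld bytes\n", n);
        return 0;
    }

Add a third participant and you have to invent locking, ownership, and liveness rules yourself - exactly the assumptions the parent says are hard to keep valid.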
Alternatives? While I somewhat agree with some points presented in the article, the POSIX model is beautiful in its simplicity. As a good demonstration, just take a look at Zeal 8-bit OS. It has a simplistic, Unix-inspired API [1], which is very elegant, especially considering the fact that the OS targets Z80 CPU with just 64 kB of RAM.
POSIX provides a decent cross-OS API, and while some of it is less than stellar, this article seems pathologically opposed to POSIX on principle rather than for actually valid reasons.
That said, modern OSes do provide additional APIs that are higher level, but in general they are all just opaque object wrappers around the same core principles, just less tethered to the limitations of 80s-era C.
I suggest that you check out WASI - WebAssembly System Interface.
It has a subset of the POSIX file API that has been modified to not have some of its downsides. (TOCTOU problems, in particular)
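The flavor of the fix will look familiar to POSIX programmers: it is essentially the openat() family, where operations resolve relative to an already-open directory handle instead of re-walking a path string, so the target can't be swapped out between check and use. A minimal sketch (the directory and file names are made up):

    #include <fcntl.h>
    #include <unistd.h>

    int open_config(void) {
        /* Open the directory once; the fd now denotes a fixed object,
         * not a re-resolvable path string. */
        int dirfd = open("/etc/myapp", O_RDONLY | O_DIRECTORY);
        if (dirfd < 0) return -1;

        /* Resolve "config" relative to that handle; a symlink race on
         * /etc/myapp itself can no longer redirect the lookup. */
        int fd = openat(dirfd, "config", O_RDONLY | O_NOFOLLOW);
        close(dirfd);
        return fd;
    }

WASI goes further by making such pre-opened directory handles the only way to reach the filesystem at all.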
If I'm reading this right, the author is blaming POSIX (and by implication Unix) for the size of their code?
Given that POSIX is a system programming interface, aimed at no specific use case whatsoever, maybe the issue is that they are using it directly rather than going via some higher-level abstraction - a library or framework that would provide the capabilities they need?
"What if we replace POSIX" is an interesting question. I keep coming back to SerenityOS [0] when I think about this topic, a C++ based (hobby) OS that has a POSIX compatible API but also more modern wrappers around common functionality.
Exposing system APIs through ErrorOr (like Rust's Result or the Maybe monad) resolves an entire class of bugs (take fork/kill/-1 as an example [1]) as well as ambiguity like "do I check the return value for the reason, or do I check the magical errno variable?". Dropping general OS compatibility also allows for other changes, like object-oriented interaction with the GUI system and other system libraries instead of relying on namespacing in the name of a function call.
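SerenityOS's ErrorOr is C++, but the shape of the idea survives even in C as a tagged return that carries either the value or the error, so there is no separate errno to consult - a hypothetical sketch:

    #include <errno.h>
    #include <fcntl.h>

    /* One object carries either the result or the error; the caller
     * cannot read a valid-looking value while ignoring the failure. */
    struct fd_or_error {
        int ok;   /* 1: fd is valid; 0: err is valid */
        int fd;
        int err;
    };

    struct fd_or_error open_checked(const char *path) {
        int fd = open(path, O_RDONLY);
        if (fd < 0)
            return (struct fd_or_error){ .ok = 0, .err = errno };
        return (struct fd_or_error){ .ok = 1, .fd = fd };
    }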
I know SerenityOS won't ever replace any operating system we use today, but it's a nice demonstration of the API features we could all be using.
Sometimes I wish Windows' APIs were more POSIX-like, but Win32 has some pretty useful functions built right into the OS that you'd need to introduce DLL/dependency hell/static compilation for if you're trying to stick to POSIX-only APIs. Think common concepts like "the clipboard" or "the current resolution of a display" without needing to link to a library or access a file like /dev/fb0 (is there a /dev/fb1? when do you use it?).
Mobile platforms (Android, iOS) with special lifecycles also allow for things like "the app resumes where you left off" and "the app suspends to disk when the system is under load" natively, without requiring developers to build their own save/restore state mechanisms. Imagine having to build a system like that on your average Linux desktop with mere POSIX APIs; you'd go crazy with the complexity required.
Perhaps what programs running on Unix-based operating systems really need is a wrapper around POSIX and its low-level implications. GTK and Qt provide many such APIs for free, for example, including networking and other I/O. GTK comes with its annoying particularities (CSD, and all the other GNOME decisions), and Qt licenses are either VERY free or VERY expensive, making them incompatible with other projects.
With the Linux space transitioning from SystemV+ALSA/Pulse+X11 to systemd+PipeWire+Wayland, perhaps the space is ripe for a new, more modern wrapper library for native languages.
[1]: When fork() fails, the call returns a negative number. When you try to kill() a PID of -1, you kill every process your current UID has the permission to kill. Programs failing to fork, shutting down, and killing what they thought was a fork()ed process can accidentally end up killing all the open applications for a user.
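Spelled out as code, a hedged sketch of that failure mode (not any particular program's source):

    #include <signal.h>
    #include <unistd.h>

    void buggy_shutdown(void) {
        pid_t child = fork();     /* returns -1 on failure, unchecked here */
        if (child == 0)
            _exit(0);             /* child exits immediately in this sketch */

        /* If fork() failed, child is -1 and this call is kill(-1, SIGTERM):
         * POSIX defines pid -1 as "every process the caller is permitted
         * to signal". */
        kill(child, SIGTERM);
    }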
1. TL;DR: Modern hardware is very different from a PDP-7
2. ???
3. We should design new APIs with abstractions that represent neither modern nor PDP-era architecture.
I feel like the author should spend more time on 2 to make his case.
Also, the actual answer to the question posed is: you might not think data management is an important part of your job, but it’s actually more important than the analysis, so you spend more time on it.
> We should be able to assume that the data we want exists in main memory without having to keep telling the system to load more of it.
This already exists in POSIX. It's called mmap, and approximately no one uses it. Maybe the article should have explored why, and suggested how to solve that problem, instead of insisting we must throw everything away and this will magically solve our problems.
Doesn't make sense! In short, the author wants to load all data into main memory and work on it. That is possible if the data is small enough and main memory is big enough. Often it is not, and therefore programmers - with enough time and experience - use paging and caching.
The actual problem in the described organization is their data organization, not UNIX.
"The Socket API, local IPC, and shared memory pretty much assume two programs"
There's no truth to this whatsoever - it's an utterly absurd statement because:
* The socket API as applied to networking is empirically designed for arbitrary numbers of peers -- there are unlimited real world examples. The socket API as applied to local IPC is of course similarly capable.
* Shared memory - mmap() is about as generic of a memory sharing interface as one could hope for. It certainly works just fine with arbitrary numbers of clients. There are again countless examples of using shared memory with arbitrary participants -- for example all the various local file databases: sqlite, leveldb, berkeley db, etc.
"We should be able to assume that the data we want exists in main memory without having to keep telling the system to load more of it."
Yes, this is exactly what mmap() already does. mmap() is part of POSIX. Use it!
"Kode Vicious, known to mere mortals as George V. Neville-Neil,"
Oh, ok.