This article is just confused and wrong. Some examples:
"The Socket API, local IPC, and shared memory pretty much assume two programs"
There's no truth to this whatsoever - it's an utterly absurd statement because:
* The socket API as applied to networking is empirically designed for arbitrary numbers of peers -- there are endless real-world examples (see the sketch after this list). The socket API as applied to local IPC is of course similarly capable.
* Shared memory - mmap() is about as generic of a memory sharing interface as one could hope for. It certainly works just fine with arbitrary numbers of clients. There are again countless examples of using shared memory with arbitrary participants -- for example all the various local file databases: sqlite, leveldb, berkeley db, etc.
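The sketch promised in the first bullet -- a minimal many-peer loop using nothing but POSIX calls: one listening socket, poll(), as many peers as the table holds. The port number and table size are made up, and error handling is elided:

```c
/* Sketch: one listening socket, arbitrary peers, plain POSIX poll().
 * Port 9000 and MAX_PEERS are arbitrary; error handling elided. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_PEERS 1024

int main(void) {
    struct sockaddr_in addr = {0};
    struct pollfd fds[MAX_PEERS + 1];
    int nfds = 1;
    char buf[4096];

    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(9000);
    bind(lfd, (struct sockaddr *)&addr, sizeof addr);
    listen(lfd, SOMAXCONN);

    fds[0].fd = lfd;
    fds[0].events = POLLIN;

    for (;;) {
        poll(fds, nfds, -1);
        if ((fds[0].revents & POLLIN) && nfds < MAX_PEERS + 1) {
            fds[nfds].fd = accept(lfd, NULL, NULL);  /* new peer */
            fds[nfds].events = POLLIN;
            nfds++;
        }
        for (int i = 1; i < nfds; i++) {
            if (!(fds[i].revents & POLLIN))
                continue;
            if (read(fds[i].fd, buf, sizeof buf) <= 0) {
                close(fds[i].fd);       /* peer hung up */
                fds[i] = fds[--nfds];   /* compact the table */
                i--;                    /* recheck the swapped-in slot */
            }
        }
    }
}
```

Swap AF_INET for AF_UNIX and the same loop serves local IPC unchanged.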
"We should be able to assume that the data we want exists in main memory without having to keep telling the system to load more of it."
Yes, this is exactly what mmap() already does. mmap() is part of POSIX. Use it!
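A minimal sketch of what I mean (the filename is a placeholder, error handling elided): map the file once, then treat it as ordinary memory, with no further read() calls in sight:

```c
/* Sketch: map a whole file and read it as plain memory.
 * "data.bin" is a placeholder; error handling elided. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    int fd = open("data.bin", O_RDONLY);
    struct stat st;
    fstat(fd, &st);

    const char *p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                     /* the mapping outlives the fd */

    long sum = 0;
    for (off_t i = 0; i < st.st_size; i++)
        sum += p[i];               /* no explicit read() calls */
    printf("%ld\n", sum);

    munmap((void *)p, st.st_size);
    return 0;
}
```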
"Kode Vicious, known to mere mortals as George V. Neville-Neil,"
> * The socket API as applied to networking is empirically designed for arbitrary numbers of peers -- there are unlimited real world examples. The socket API as applied to local IPC is of course similarly capable.
I think "the [BSD] socket API" was designed for a thousand or so peers (FD_SETSIZE). That's why the API evolved. Now (2023) there are a lot of APIs and different choices you can make (blocking, polling readiness, sigio, aio, iocp, iouring), none of which is best at everything, but the non-POSIX apis are much faster than the POSIX api -- especially the ones that are harder to mix with the POSIX api.
> "We should be able to assume that the data we want exists in main memory without having to keep telling the system to load more of it."
> Yes, this is exactly what mmap() already does. mmap() is part of POSIX. Use it!
This is not what mmap() does, but the opposite. mmap() sets up page-tables for demand-paging. Those little page-faults actually trigger a check (demand) to see if the page is "in memory" (this is what Linux calls the "block cache"), an update to the page-tables to point to where it is and returns (or kick off a read IO operation to the backing store). These page-faults add up. What the author is looking for in Linux is mremap() and Darwin is mach_vm_remap() and definitely not in POSIX.
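For the Linux case, a sketch of the call I mean (Linux-specific, not POSIX; assumes glibc and _GNU_SOURCE, error handling elided) -- grow a mapping in place rather than tearing it down and faulting a new one back in:

```c
/* Sketch (Linux, not POSIX): grow a mapping with mremap() instead of
 * unmapping and re-faulting. Error handling elided. */
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void) {
    size_t old_len = 1 << 20, new_len = 16 << 20;

    char *p = mmap(NULL, old_len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    memset(p, 'x', old_len);

    /* The kernel rewires the page tables; nothing is copied
     * byte-by-byte by the caller. MREMAP_MAYMOVE lets it relocate. */
    char *q = mremap(p, old_len, new_len, MREMAP_MAYMOVE);
    printf("%c\n", q[old_len - 1]);    /* old contents still there */

    munmap(q, new_len);
    return 0;
}
```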
> * Shared memory - mmap() is about as generic of a memory sharing interface as one could hope for. It certainly works just fine with arbitrary numbers of clients. There are again countless examples of using shared memory with arbitrary participants -- for example all the various local file databases: sqlite, leveldb, berkeley db, etc.
But none of those "local file databases" can handle hundreds of thousands of clients, and aren't great even for the low hundreds(!), that's why most "big" database vendors avoid "just" using mmap(), and instead have to perform complex contortions involving things like mach_vm_remap/mremap/O_DIRECT/sigaction+SIGSEGV. userfaultfd and memfd are other examples of recent evolutions in these APIs.
You are looking for truth? Evolving APIs are evidence that people are unhappy with these interfaces, and these newer APIs are better at some things than the old, demonstrating that the old APIs are not ideal.
So we have evidence our (programming) model is not ideal, are we to be like Copernicus and look for a better model (better APIs)? Or are we to emulate Tolosani and Ingoli?
> You are looking for truth? Evolving APIs are evidence that people are unhappy with these interfaces,
That tracks, but this does not necessarily follow:
> demonstrating that the old APIs are not ideal.
I've been around the tech industry long enough to see several cycles of practitioners going back and forth between technologies and approaches. E.g. strong type systems (C++, Java) -> loose type systems (Python, JS) -> strong type systems (Rust, TypeScript). And each time the tide shifts there are always plenty of arguments trying to show why the previous approach was objectively worse and the new one is better.
That's not to say that nothing is moving forward, but the fact that people are unhappy with a technology doesn't mean that technology is objectively inferior to a proposed replacement. Sometimes it's just fashion.
> That's not to say that nothing is moving forward, but the fact that people are unhappy with a technology doesn't mean that technology is objectively inferior to a proposed replacement. Sometimes it's just fashion.
I apologise for writing so much that you couldn't read all of it, but this part, whilst buried at the end of the first paragraph, was really important:
> > the non-POSIX apis are much faster than the POSIX api -- especially the ones that are harder to mix with the POSIX api.
That they are better in this objective way is why this particular kind of unhappiness demonstrates what I say, even if you're not otherwise interested in people being happy.
" think "the [BSD] socket API" was designed for a thousand or so peers "
I think a more accurate framing is that in the 80s POSIX was designed for a thousand or so peers, and later versions have expanded and scaled accordingly. Literally every interface has these types of growth patterns so this shouldn't be a surprise.
As I noted below, we have semi-standardized extensions from POSIX. We don't really need POSIX to codify a new event system because we have libev/libevent. This is another point the parent article gets wrong when it says we need to get POSIX "off our necks." Nowhere does POSIX compliance stand in the way of using libev, or even using mremap(). I don't agree with your conclusion around mremap() above but it doesn't matter because the salient point is that mremap() works absolutely fine within the paradigm offered by mmap() and POSIX. libev works just fine within the paradigm offered by the file/socket api.
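To illustrate, the canonical libev pattern (a sketch, assuming libev 4.x): the same loop runs on epoll, kqueue, or plain poll depending on what the host provides, all without POSIX ever having codified any of it:

```c
/* Sketch: the standard libev readiness loop (assumes libev 4.x).
 * libev picks the best backend (epoll/kqueue/poll) at runtime. */
#include <ev.h>
#include <stdio.h>
#include <unistd.h>

static void stdin_cb(EV_P_ ev_io *w, int revents) {
    char buf[256];
    ssize_t n = read(w->fd, buf, sizeof buf);
    (void)revents;
    if (n <= 0)
        ev_break(EV_A_ EVBREAK_ALL);   /* EOF or error: stop the loop */
    else
        fwrite(buf, 1, n, stdout);
}

int main(void) {
    struct ev_loop *loop = EV_DEFAULT;
    ev_io stdin_watcher;

    ev_io_init(&stdin_watcher, stdin_cb, /* fd */ 0, EV_READ);
    ev_io_start(loop, &stdin_watcher);
    ev_run(loop, 0);
    return 0;
}
```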
"But none of those "local file databases" can handle hundreds of thousands of clients"
This is absolutely not true. Those systems work just fine with even a million reader threads.
Maybe you mean the databases don't perform well under parallel writes, but this has nothing to do with the shared memory interface (how could it?). It's purely due to the design and intent of each system as primarily read-focused systems. mmap will work just as well as any other shared memory interface when it comes to reading and writing. Pick nearly any write-performant database - it's using mmap.
"Evolving APIs are evidence that people are unhappy with these interfaces"
Not quite. The actual truth is that POSIX plays very well with extensions to its interfaces and this is yet another dimension in which the above article is demonstrated to be very, very silly.
> This is absolutely not true. Those systems work just fine with even a million reader threads.
I suppose it depends how you define a "client". If you define it as a query (or an HTTP request or something like that), then I suppose you're right, but you're also not mmap()ing all the time, which is what we're actually talking about here. But if you define it the way databases traditionally do, it is itself a full-featured application, so 100 clients doing 1m queries a second is 100m queries per second, and no, mmap() isn't fast enough to do that, and since we're talking about mmap() I didn't think I needed to explain that.
I must be reading a different comment, because I don't understand exactly what that has to do with anything, but I think if you're not reading "every word" of the article, maybe you're not reading "every word" of my comments either. I was only under the impression you thought the sockets API and mmap() were ideal interfaces.
What position do you think I have that you think I need to change?
> Not quite. The actual truth is that POSIX plays very well with extensions to its interfaces and this is yet another dimension in which the above article is demonstrated to be very, very silly.
I don't understand. That POSIX is so big is one of the first complaints in the article, given by implementors of embedded systems, and the "extension-like" nature of POSIX works directly against that. Then, once you have this nice "extension" like iouring with its provided buffers and files -- which is basically a whole io-state-machine operating system in zero system calls -- maybe it can become part of the POSIX standardization process and make POSIX better, or maybe it just becomes popular enough that there's a kevent ring extension for the BSDs and a libiocp4 that unifies the differences, and you think that's all basically POSIX anyway, so what's the point?
Well, the point is that shouldn't stop people from imagining a system that didn't do anything else -- how much smaller the code would be, how much cheaper the hardware (less RAM, less ROM), and maybe, if you can organise the kernel with this model in mind, how much faster it would be. These sorts of ideas you can really only explore once you've mentally prepared yourself for there not being a POSIX.
The article defines client as another thread or process - the argument is that these interfaces only support two processes and not many. There isn't any functional limit on how many processes can map the same region of memory.
"but you're also not mmap()ing all the time, which is what we're actually talking about here"
No, I don't think that's at all what we're talking about. We're talking about the ability of the architecture to scale -- not the efficiency of calling one specific syscall.
"I was only under the impression you thought the sockets API and mmap() were ideal interfaces."
I haven't said that they're ideal, only that they're demonstrably capable of enabling more than just two processes to interact.
"and you think that's all basically POSIX anyway, so what's the point?"
The point is that the article's conclusion that POSIX is preventing progress is nonsense.
As you say, standards emerge.
"how much smaller the code would be, and cheaper the hardware (less ram, less rom), and maybe if you can organise the kernel with this model in mind, how much faster it would be."
These aren't points raised by the article -- in fact so far we've been discussing the opposite (extensions and features, not cutting features).
Could things be cut from POSIX? Sure. But is implementing stuff like select() really a "yoke?" It's not exactly difficult, and I don't think you or anyone has articulated how it might hamper the inclusion of more advanced interfaces. The resources needed to implement POSIX are not significant.
What specifically do you think POSIX's architecture is preventing us from implementing?
> The article defines client as another thread or process
It does no such thing. The word "client" appears once in the text, and little is said except that it isn't a thread, because thread support is fraught with peril and often poorly provided for by programming languages.
> I haven't said that they're ideal, only that they're demonstrably capable of enabling more than just two processes to interact.
So was the carrier pigeon, but sometimes we want to send messages faster than that.
> These aren't points raised by the article -- in fact so far we've been discussing the opposite (extensions and features, not cutting features).
>... about the ability of the architecture to scale -- not the efficiency of calling one specific syscall.
> What specifically do you think POSIX's architecture is preventing us from implementing?
Erm, they are the points raised by the article, that's kind-of why I think you still haven't read it yet. Here's the paragraph right after the one where it defines a client as not-a-thread (and after talking about one specific syscall):
The death of any of these systems is when a user asks, "What about Posix? How can I port my XXX program to run on your system?" Providing Posix-like semantics in these systems is their death knell, because the Posix way of thinking is so narrow and providing its varied illusions requires so many hidden changes and adherences to age-old assumptions that there is no way to have a flexible innovative system and serve the Posix elephant. This means that no matter how innovative and clever a system is, once it is tainted by Posix support, it becomes just another Posix system.
> But is implementing stuff like select() really a "yoke?" It's not exactly difficult, and I don't think you or anyone has articulated how it might hamper the inclusion of more advanced interfaces
Could you show me? I don't think it's possible to implement a select() that works with Linux's fancy new iouring that uses the provided files and buffers features easily, but I'd like to be proven wrong.
It does. It says quite clearly, quote "If you look at most inter-process communication mechanisms, you can see that they are most often used by only two programs — usually a client and server." Two processes on a local machine, using shared memory or other IPC. IPC, not network calls.
"So was carrier pigeon, but sometimes we want to send messages faster than that."
Don't be disingenuous. You haven't articulated a faster mechanism and in fact there isn't one.
"Erm, they are the points raised by the article,"
No, they are not. Based on your replies it seems you may fundamentally not understand the points raised by the article.
"Could you show me? I don't think it's possible to implement a select() that works with Linux's fancy new iouring that uses the provided files and buffers features easily, but I'd like to be proven wrong."
You've already provided an example. The Linux kernel provides both select, a posix interface, and io_uring. Nothing about providing posix interfaces precludes providing specialized interfaces as well.
> It says quite clearly, quote "If you look at most inter-process communication mechanisms, you can see that they are most often used by only two programs — usually a client and server." Two processes on a local machine, using shared memory or other IPC. IPC, not network calls.
Two processes! Not two threads!
> Don't be disingenuous.
How would you know? You still haven't read the article, so you're arguing with a straw-man!
> You haven't articulated a faster mechanism and in fact there isn't one.
"la la la my fingers are in my ears I can't hear you!" is the best you got? You seem to admit iouring exists, isn't POSIX, but what? You just don't believe it's faster now? Bull shit.
> You've already provided an example. The Linux kernel provides both select, a posix interface, and io_uring. Nothing about providing posix interfaces precludes providing specialized interfaces as well.
If you think a reimplementation of Linux is "easy" you need to have your head examined: Microsoft has near-infinite money and eventually gave up, deciding POSIX was too hard, and just implemented another hypervisor.
And I wrote "thread or process" because, architecturally regarding IPC, the distinction is unimportant.
"How would you know?"
Due to the inaccuracy of your analogy, with regard to actual systems interfaces.
"You seem to admit iouring exists, isn't POSIX, but what? You just don't believe it's faster now? Bull shit."
io_uring is not faster than shared memory via mmap.
"If you think a reimplementation of Linux is "easy" you need to have your head examined"
I think now we're touching on the limits of your experience and expertise. Implementing select(2) on Linux is indeed quite simple. Have you ever read through select.c?
Have you ever even implemented a posix subsystem? I have.
"Microsoft has near-infinite money and eventually gave up, deciding POSIX was too hard, and just implemented another hypervisor. "
This isn't how things work on Windows.
As an aside, I've noticed that your ignorance has shifted into rudeness. Don't bother to respond again. This conversation is over.
So POSIX as a technology may be lacking, but honestly what's impressive about it to me is that it's also a standard -- not perfect there either, of course.
I think any conversation about replacing POSIX must include an effort to update or replace the standard. Deprecate the parts that no longer work well and add things that other systems have de-facto agreed upon, at least.
Kids today: want to come along, throw away everything they don't understand, and rebuild what was already there but worse. Accomplishing negative impact.
I like how the Chesterton's Fence example is actually a strong argument for not unquestioningly leaving the fence in the middle of the road.
If there is no documentation on it, and no one around to proactively explain why it's there, then the most economical way to see if it had any positive value is to stash it and see what happens.
That doesn't mean it's always the most rational thing to do, of course. Throwing POSIX away doesn't qualify as a situation where documentation is lacking or an obvious purpose is missing.
The fallacy is well confined in these two sentences:
>However, before they decide to remove it, they must figure out why it exists in the first place. If they do not do this, they are likely to do more harm than good with its removal.
Actually, assuming you don't know the consequences of removing the fence, you also have to assume that you don't know the consequences of leaving the fence in place, which might just as well turn into a dramatic situation. What is for sure, if nothing around can explain its purpose, is that it was placed there by people too negligent to leave the right clues about it. So, who knows what level their laxity might have reached?
Remove the fence unless someone knows why it was put up in the first place, and the rationale still holds.
Story time: at my previous job we had an old server called "the black box" which hosted some old cronjobs and various services from about a decade earlier. There was no documentation and everyone who had worked on it had left years earlier. The ops team dutifully kept it online but otherwise didn't touch it, and no other team owned it. In short, it was the ideal candidate for your "reverse Chesterton's fence" strategy. At some point we had to delve into what was actually on it, and it turned out that both company payroll and a service responsible for about 50% of company revenue were on that box. Simply removing it to see what happened would have easily cost a few million in revenue, even if it could be reinstated from backup a day later (and I'm doubtful we could have restored it at all without serious data loss). Now obviously having such a server is a bad practice in the first place, but if you find yourself in such a position "just kill it and see what happens" is definitely not the most economical option in all cases.
In the case of POSIX it might hold, since probably the worst case is that some volunteers throw away a few years of their life hacking away on yet another OS that will never gain traction. Humanity spends a lot more time on a lot less worthy endeavors.
Around here, on HN, Chesterton's fence pops up so many times as some type of ageless nugget of wisdom.
But another way to look at it is that it's just the lazy guy's excuse to allow bit rot. Because, let's be honest, we only care about Chesterton's fence in the context of software development.
Who among us has not worked with software with dark corners, with files with no commits in the last 10 years, with playbooks that people just go through mechanically, because there's no one from the "old guard" who remembers why things are the way they are?
Who would not love to live in an alternate reality, where each Chesterton's fence was removed at the first opportunity, rather than left there to burden future generations of unlucky toilers? Where code is clean, documentation is up to date, tests complete, colleagues who know what's going on? Well, that alternate reality does not exist, of course.
But at least let's stop thinking we are wise whenever we postpone deprecating something we don't understand, just because some semi-famous essayist came up with a nice figure of speech one hundred years ago.
It is completely crazy to expect people to "pro-actively explain why it's there" -- how do you envision this? A plaque at every fence with an explanation? A person standing by the fence every day, just in case someone wants to tear it down? Do you expect each shared object to be labeled and documented?
Whoever wants to tear the fence down has to do basic research -- visit city hall for the plans, talk to old-timers, check the old newspapers, etc... If they do that, it qualifies as "understanding". And yes, maybe the fence was put up by Crazy Old Bill who was paranoid about the neighbour's goats... then yes, tear it down. But make sure it really was him first, and not some better reason.
> assuming you don't know the consequences of removing the fence, you also have to assume that you don't know the consequences of leaving the fence in place,
Note that at some moment there was no fence, and yet someone expended non-trivial effort to put the fence up. This means someone had a reason to do so, so whoever that past person was, they knew more than you do right now. Therefore, unless you believe that you are smarter than most other people, you should defer to their past decision.
(And if you believe that you are smarter than most other people, and yet want to tear down the fences you don't understand... I have bad news for you :) )
Well, for every fence in the middle of a path that has, for example, commemorative value, a plaque seems a good start. And "warning: do not remove" is a rather common thing in places where danger is otherwise expected.
>Whoever wants to tear the fence down has to do basic research -- visit city hall for the plans, talk to old-timers, check the old newspapers, etc...
I think that qualifies as "someone knows why it was put up in the first place, and the rationale still holds".
>Therefore, unless you believe that you are smarter than most other people, you should defer to their past decision.
I don't think smartness matters that much here. It's all about whether you blindly rely on everything, invariably, in a cargo-cult fashion, or keep critical thought alive when it seems possibly relevant. Someone can be extremely smart and still rigorously follow the cult of the day.
-----------
Some thought experiment now. :)
Say you won a new house, congratulations!
In the middle of the garden, crossed by a path, there is a fence with no door, easy to circumvent, serving no apparent purpose. The plans you were given with the house, duplicates of those from the city hall, don't mention it. Now, there might be many explanations for this situation. One possible rationale would be that the fence used to be larger, possibly even forming an enclosure some time in the past. But no one knows. Will you leave the fence disrupting the path in the middle of the garden? Will you call in some expert to diagnose the soil, just in case something might collapse should someone remove the fence?
Also, in the middle of the living room there is a thick wall up to the ceiling, with large spaces to pass on both sides. It's not mentioned in any plan. Having a unified big living room would be far nicer for you. Will you destroy the wall without delay? Or first call an expert to determine whether this is a load-bearing wall, without which the house would collapse?
So this is really largely dependent on context, as usual.
> Actually, assuming you don't know the consequences of removing the fence, you also have to assume that you don't know the consequences of leaving the fence in place, which might just as well turn into a dramatic situation.
That assumption reads as false to me. There should generally be historical data around the costs and consequences of maintaining the fence. E.g. if you're shutting down a server, it's generally easy to figure out how much the hardware costs, how many man-hours you spend on maintaining it, etc. None of that requires knowing why the fence exists.
Those costs and consequences tend to be small, because large costs have to be continually validated as a good use of resources so they have a living memory of sorts.
> What is for sure, if nothing around can explain its purpose, is that it was placed there by people too negligent to leave the right clues about it.
Again, I don't think this is sure. I've seen lots of cases where docs probably did exist at some point, but the writers left and successors lost the knowledge of where the docs are or failed to migrate them to a new system or etc. They're lost to attrition, not to negligence.
> Remove the fence unless someone knows why it was put up in the first place, and the rationale still holds.
It doesn't, because it's trading known risks for unknown and unbounded risks. That's usually a bad trade, especially when the known costs tend to be small. Nobody wins a promotion for causing a million dollar outage while trying to get rid of a $50/month maintenance cost.
Ok, these are all good points, thank you for sharing them. Just like the ones by WJW.
More broadly, my underlying point was more like "don't blindly follow the herd".
If there is a small black box in a multi-million-dollar generator stream process, of course no one wants to be the guy unplugging it just to see what happens.
On the other hand, if really no one knows the real impact of such a black box, then given the resources at play, risk management should prompt an audit and possibly a managed replacement. To my mind that sounds very different from "no one knows why this thing is there, and those who dare to ask will be chastised with a burden of proof to carry alone".
Why are you putting the onus on someone else to provide clues that you understand?
You still don't escape the requirement to understand the problem before fixing the problem. Even if the explanation for the current solution is lacking, you still need to understand enough to articulate why it's lacking.
Part of the process of erecting your own identity, both individually and as a generation, involves flatly refusing and denying the practices of your immediate predecessors (eg: your parents' and their generation of men, source code written by programmers before your time, etc.).
It's stupid and almost always leads to unintended (and usually negative) consequences, but it's probably something strongly ingrained in our instincts because it rears its ugly head every time there's a generational turnover.
If you do not understand why this happens, do not call it stupid. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to call it stupid.
It's stupid in the sense that it's a waste of time and resources.
Sooner or later after you've had your fun refusing and denying your predecessors, you realize why they do things the way they do, become wiser, and go back and undo and redo all the fuckery you caused with your refusing and denying.
It would be nice if we could get that realization without first having to fuck everything up, y'know?
Even in this description you use the words "you realize" and "become wiser" and then describe it as "a waste". It's not a waste, it's just the cost of education.
As the saying goes: "good decisions come from experience, and experience comes from bad decisions". It would be great if we could all come out of the womb with enough education and experience to function as responsible adults in modern society, but we don't. Society (rightfully) allows people some leeway in doing dumb shit so that they can learn firsthand why it's dumb.
The experience is absolutely not a waste, don't get me wrong. What I'm calling a waste are the time and resources spent undoing and redoing everything.
It's all cute if this just happens during our childhood years, where most actions hold no real consequence. It's when this extends into the real world with very real consequences that it gets really wasteful.
Well we do this for many situations where it does actually matter. Drivers license requirements force people to "do it right" the first time, you can't (legally) fuck around driving a car on public roads until you eventually figure it out. Airline pilots and nuclear power plant operators have even stricter requirements.
Licensing and training has a very real cost as well of course, and for many topics the cost of forcing everyone to get a license would outstrip the benefits. So you either have to spend on training, or you have to spend to fix the mistakes of untrained people, or a mix of both. There is no way to bring the waste to zero.
Surely if you don't understand why it's stupid, you need to go away and think about why it's stupid, once you do see why it's stupid, I may allow you to challenge people calling it stupid.
The problem with making a shell more like a "real" interpreter REPL is the purpose of a shell: Running arbitrary programs not known to the author of the shell, such that any unknown word in a shell script is probably the name of a program as opposed to a misspelled reserved word or variable name. Those programs having arbitrary command-line arguments and having to operate on files with arbitrary names is another complexity.
It's interesting to see how so many things grow into each other. A shell is what you say, but there are many efforts to change that. Bash has its magic autocomplete which knows about options of many, many programs. Powershell has an object-model with knowledge of many of its commands.
But why? Storage is much cheaper today than it was in the 80s. While I myself don’t like huge Electron apps, trying to save every byte of storage is simply not worth it.
Many embedded systems hardly have any shell, and even when they do, it isn't bash (which isn't POSIX anyway), rather a tiny subset with the basic tooling for maintenance.
I think it depends on your experience and what you have gone through. In my case, many of the decisions I made that looked excellent to me back then turned out to be ok-ish at most.
It's probably worth noting that the author is in his 50s and is fairly familiar with POSIX (he has been involved in the FreeBSD developer community for decades).
(smirk) You've noticed that there are lots of kiddies, getting all hot under the collar, who haven't noticed that a counter-example does not refute the global mindset that is Posix ;)
It's fun to build stuff. It's not nearly as fun to use and maintain what other people have built. Why should all the fun be for the people who just so happen to have been born decades earlier? The point of life is not to maximize efficiency in all respects, and people make emotional choices to achieve personal fulfillment.
There's a difference between (re-)building stuff for fun, for learning purposes, or even to use in production in your own projects, and Dunning-Kruger-style demanding (to exaggerate) that existing solutions providing stable APIs across multiple OSes be thrown out, causing a massive amount of forced work (i.e., not fun) for others while getting a worse end result.
Also, it can be fun to explore existing ideas and build upon those too - nobody forces you to maintain anything existing "for fun".
My problem with this is that I feel like we (people coming along later) have three options when it comes to designing systems given the existing state of posix / linux / the web / etc:
1. Build inside the ecosystem as it was designed. Eg, make a static webpage using HTML. Make a linux binary using the standard linux tooling + apt (or whatever). This is almost always a good idea when you can, but sometimes the ecosystem is a bit shonky, or misaligned with what you're building. Eg, I want my program to have a fixed execution environment and instead I have debian and macos. I want a react-like API but the browser doesn't provide that. Etc.
2. Build a new thing on top of the ecosystem as it exists. Eg, build docker on top of linux. Build web frameworks. Invent the web browser, or couchdb views, or electron and build your software on top of that. The problem with this approach is that if each generation layers their own rubbish on top of the underlying layers then software will keep getting slower, more complex and more buggy over time.
3. Change the underlying system to work how you want it to work. Instead of using docker on top of ubuntu, replace ubuntu with nixos. Instead of using dpdk for higher storage performance, add io-uring to the linux kernel.
I think I would prefer it if more people did option 3 rather than option 2. I think there's lots of ways modern computing systems could be better. I'd rather if linux gained the ability to run binaries from the web (for example) than we invent a new kind of binary in userland (docker containers) and run those. I don't want people in 100 years to still be using linux (with all the bad decisions it contains), but with 10 more layers of different generations' ideas stacked on top. (Oh, its linux - but with docker, and then most people run this specific container to run languageX, and that lets us run electo2050 apps, which in turn you can use to run IRC. It only uses 50 gigs of ram. So small and light!)
Option 3 may require us to throw out or deprecate parts of the POSIX API. That may be overdue.
You present 3 independent choices, but surely these are stages.
1. Can you do X with existing tooling?
If not
2. Can you add a tool to your set to do X?
If not
3. Get a new tool set.
2 should also serve as testing, to show the utility and iron out the kinks of things that make it to 3. Jumping straight to 3 likely means you end up with an ill-thought-out solution that soon needs another 3.
Sorry; that’s not quite the point I’m making. The difference I want to emphasise between 2 and 3 is whether the new tool is layered on top of the existing tool, or made by modifying the existing tool. If you want a new approach to storing data, do you make it by writing a new kernel module, or by doing some new thing in userland on top of the filesystem?
All of docker’s features could be built into the Linux kernel. And it would probably be better as a result, and have ancillary benefits to existing Linux programs. But docker was built as a separate userland tool instead.
Building a new thing is usually easier, since you don’t need to dive into the politics of an existing community. But if each new generation of software engineers builds new software by layering on top of the old stuff, then we end up with layers on layers of cruft and our computers run slower and are more buggy as a result.
The POSIX apis aren’t very good in lots of ways. With a lot of complex effort in userland I can make a database with atomic write semantics. But maybe we should just add atomic file operations to Linux instead. In the long run I think it would be safer and result in faster programs in most cases.
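For example, the userland contortion for a single atomic file replace looks roughly like this today (a sketch; names are placeholders, error paths abbreviated):

```c
/* Sketch: "atomic" file update under POSIX -- write a temp file,
 * fsync it, then rename() over the target. Names are placeholders. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int atomic_replace(const char *path, const char *data, size_t len) {
    char tmp[4096];
    snprintf(tmp, sizeof tmp, "%s.tmp", path);

    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;
    if (write(fd, data, len) != (ssize_t)len) { close(fd); return -1; }
    if (fsync(fd) != 0) { close(fd); return -1; }  /* data is durable */
    close(fd);

    /* rename() is atomic with respect to other observers of `path`.
     * A fully careful version also fsyncs the containing directory. */
    return rename(tmp, path);
}
```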
Yes. Docker has proved there's utility in what it does. Assuming kinks in the implementation have been ironed out, now is the time to start thinking about making it a kernel module.
If you just put every half thought out, poorly implemented idea straight in the kernel, you'd have a 5 petabyte kernel.
This opinion piece is the epitome of a negative impact work. It's easy to accuse random things of being all wrong, but it's far harder to actually present something you feel is right.
Not every criticism of something has to be accompanied by a better solution. Pointing out that something has shortcomings is the first step. It’s perfectly fine to say, “That person’s singing is awful,” without also adding, “They need to work on their breath control.” You don’t need to be a great singer to know when someone’s doing a bad job at it. (And many people who are great at what they do are terrible at teaching others to do it, anyway.)
"This food is terrible" merely needs to be true, and anyone can say it, even people who aren't chefs. This is a ridiculous barr that only sorta kinda sounds reasonable if you just want an excuse not to like the message or the messenger.
> "This food is terrible" merely needs to be true, and anyone can say it, even people who aren't chefs.
If you bitch about how "this food is terrible" but you do nothing to make it more palatable or order something else and instead just stand there eating while complaining and reordering and asking for seconds... The food is not terrible, and you just enjoy bitching.
I would have warned you about the pot hole that destroyed my car instead of letting you get nailed by it too, but that would have just been bitching about something without fixing it, and I know you don't have any use for that.
Then why do companies constantly ask me to "score this phone call", "score this driver", "was this person helpful?", "what did you think of this recommendation?", "could you spend some time to answer this questionnaire about what you thought of that widget you bought"?
So to me it seems like criticism without suggestions is very valuable, it is called data. Suggestions from amateurs are usually bad and misleading, just highlighting a problem so that the expert can come up with a solution is the most efficient way to work in most cases.
I suspect a lot of that data provides very little value.
If you don’t know why the customer was unhappy, what do you change? You can fire the customer service rep, but then what do you do when the next one gets reviews just as bad because they are following the same script?
> Pointing out that something has shortcomings is the first step.
Talk is cheap. If even that same critic feels that his solution to his problem is not worth pursuing, the criticism adds nothing and creates no value. At best it's nitpicking about stuff that falls in Eisenhower's "not important"/"not urgent" box.
and the old folk that created all these things don't bother to explain them, then get uppity when someone new comes along and doesn't think their creations are perfect.
It's not that hard; it's actually easy. Very easy to get into a devolved state where the shortest path back to sanity is to clone the whole repo from scratch.
It's easy if you always do exactly the right thing in the right order and never make mistakes, but if you for example want to undo a git merge[1], god have mercy.
When in doubt, I create a new branch from the main one and test how it goes when merging the one holding the changes. If something goes wrong, at least the main branch is unaffected.
Of course, you must already have some intuition that something could go wrong. :D
Yeah, KV is GNN's vehicle for excessively controversial (i.e., what I might call "bad") takes.
I don't think he's wrong that the POSIX synchronous IO model isn't a great fit for modern hardware, though. He doesn't really go into it much but (Windows) NT's async-everything IOCP model really seems to be the winner, as far as generic abstractions that have stood the test of time.
Concurrency is hard. If you don't have anything better to do but wait on a single file operation (which, in a CLI tool, you might not), then a synchronous call is just fine. If you do have multiple I/O operations to issue at once, that can still be synchronous with writev().
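For example, a writev() sketch (error handling elided) -- several buffers gathered into one synchronous syscall rather than N separate write() calls:

```c
/* Sketch: one synchronous gather-write with writev(2). */
#include <stdio.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

int main(void) {
    const char *hdr  = "HTTP/1.1 200 OK\r\n\r\n";
    const char *body = "hello\n";

    struct iovec iov[2] = {
        { .iov_base = (void *)hdr,  .iov_len = strlen(hdr)  },
        { .iov_base = (void *)body, .iov_len = strlen(body) },
    };

    ssize_t n = writev(STDOUT_FILENO, iov, 2);  /* one call, two buffers */
    fprintf(stderr, "wrote %zd bytes\n", n);
    return 0;
}
```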
Most forms of async programming are also unstructured, which is bad for correctness, but also for performance since it can lead to priority inversions.
Yeah, I agree that's an area that needs work. I admit I didn't bother to read every word, but I did search for "async" to see if he mentions aio(7) to complain about it and I saw that he didn't.
Don't get me wrong, there are lots of room for improvements in POSIX IO interfaces. For example, POSIX doesn't define any modern event systems (epoll, kqueue, etc). But what's the result? We use libevent/libev.
This sort of seems like the result he's asking for in diverging from POSIX, which we ... already have. There are a lot of pretty great non-posix standard interfaces available too!
> I don't think he's wrong that the POSIX synchronous IO model isn't a great fit for modern hardware, though.
Opinions don't really matter. Solutions to real-world problems do.
To me, the fact that no one up to this day felt that the issues in this area were significant enough to warrant a fix or an alternative tells me that it's noise about nothing.
> To me, the fact that no one up to this day felt that the issues in this area were significant enough to warrant a fix or an alternative tells me that it's noise about nothing.
Do you not view io_uring or Windows IOCP as alternatives? Or the plethora of other non-POSIX IO extensions in Linux, such as splice and sendfile? SPDK and DPDK? There's also Google's Fuchsia operating system, which is not Posix. I think it's pretty clear that people are investing in a variety of alternative approaches.
I mean if what they want is a better standard, so there can be a dozen compliant operating systems we can all compile our stuff for, then maybe.
To look at things from another angle, what if having had POSIX all these years actually made things more uniform, in the sense that we didn't have variance between systems as large as e.g. between a BSD Unix and a Lisp machine? What if that were our choices today. As it is the largest mainstream departure is maybe Windows.
> To me, the fact that no one up to this day felt that the issues in this area were significant enough to warrant a fix or an alternative tells me that it's noise about nothing.
yes, let's just ignore the mountains of people who end up literally reimplementing complete network stacks in userspace to handle events in a modern way
People/projects can and do implement userspace networking stacks. For example, gVisor implements its own userspace networking stack [1] isolated from the kernel networking stack.
Because usually they are implementing a subset of the network stack that meets their needs for communication within their internal network. Porting the entire kernel stack that actually handles all of the complexities and quirks of the open internet is a larger task than they needed to tackle.
> NT's async-everything IOCP model really seems to be the winner
The author talks about the plumbing-to-data-processing ratio. Let's take a hundred average programmers, let them plumb IOCP, and see how many of them can even get to the "data" part.
I really disagree with the sockets bit. Sockets are designed for networking, with a particular focus on IP. Not IPC.
I for one think that the 'absurd thing' is that IPC is not built into the OS as a core feature. That, and process isolation. Both sockets and shared memory are quite problematic, and the challenge of 'true IPC' that works nicely with threads etc. is real.
POSIX and its derivatives build IPC into the OS as a core feature. In particular, POSIX is built around memory mappings and file descriptor inheritance, which means it is extraordinarily easy to make processes communicate.
I honestly have no idea what you mean by this statement.
Unix domain sockets are (1) fast, (2) primitive (just a file), (3) widely available, and (4) can be used between multiple processes extremely easily (use SOCK_DGRAM).
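A sketch of point (4) -- the receiving side of a datagram Unix socket that any number of local processes can talk to (the socket path is a placeholder, error handling elided):

```c
/* Sketch: an AF_UNIX SOCK_DGRAM receiver; any number of peers can
 * sendto() the same path. "/tmp/demo.sock" is a placeholder. */
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

int main(void) {
    struct sockaddr_un addr = {0};
    char buf[4096];

    int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, "/tmp/demo.sock", sizeof addr.sun_path - 1);
    unlink(addr.sun_path);                 /* clear any stale socket */
    bind(fd, (struct sockaddr *)&addr, sizeof addr);

    for (;;) {
        ssize_t n = recv(fd, buf, sizeof buf, 0);  /* one whole message */
        if (n > 0)
            fwrite(buf, 1, n, stdout);
    }
}
```

A sender is just socket(AF_UNIX, SOCK_DGRAM, 0) plus sendto() with the same path -- no per-peer connection bookkeeping.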
I honestly have no idea why someone would think that sockets, which are designed for IP networking, are an ideal solution for IPC, or why someone would think that 'shared memory' is sufficient for sharing information when it's really only part of a solution.
If it were easy then everyone would be doing it the same way and there wouldn't be a discussion about it.
With shared memory in particular, the issue gets tricky around signalling and locking aka indicating to other processes when new info is available, and when another process is accessing the data.
If anyone took just a moment to design an API that was user/developer centric, and worked back from there, it would look materially different from what is provided today.
Sockets aren't designed for IP networking. They date back to a time when it wasn't even clear TCP/IP would win vs. IPX/SPX. Or DECnet. AF_UNIX sockets have things that IP sockets can't even dream of doing (the very poorly named SCM_RIGHTS). And sockets don't really cover IP networking all that well, e.g. MPTCP and SCTP shenanigans to get multi-address & multi-stream connections.
That said, both sockets and mmap are low-level APIs. A higher level "user/developer centric" API indeed looks different.
> With shared memory in particular, the issue gets tricky around signalling and locking aka indicating to other processes when new info is available, and when another process is accessing the data.
POSIX has mutex and condition variable, is this not enough to do those things?
As far as I know, a mutex is not a signal that data has been received, which would require some kind of threading / interrupt.
Moreover, it's a bit of a secondary thing.
The API would look something like a call function with a receive function/lambda on the receiving side, with memory possibly being 'handed over' or whatever. And so the level of abstraction would be slightly higher.
And it would be a no-brainer, well-established thing. Search for how to do that today and it's always complicated.
I’m afraid I don’t understand. If you want one process to signal to another that data is available, you have the receiver block on a condition variable. When the sender writes its data it signals the condition variable. This is all handled by the kernel, no signal/interrupt is required. Or if you don’t want blocking, you can have a shared ring buffer, with the two processes interacting with it atomically.
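A sketch of that first option, using the (optional) POSIX PTHREAD_PROCESS_SHARED feature over a shared anonymous mapping inherited across fork(); strict POSIX would use shm_open() where I use MAP_ANONYMOUS. Link with -pthread, error handling elided:

```c
/* Sketch: a condition variable shared between two processes via one
 * shared anonymous mapping inherited across fork(). */
#include <pthread.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

struct shared {
    pthread_mutex_t mu;
    pthread_cond_t  cv;
    int             ready;
};

int main(void) {
    struct shared *s = mmap(NULL, sizeof *s, PROT_READ | PROT_WRITE,
                            MAP_SHARED | MAP_ANONYMOUS, -1, 0);

    pthread_mutexattr_t ma;
    pthread_condattr_t ca;
    pthread_mutexattr_init(&ma);
    pthread_mutexattr_setpshared(&ma, PTHREAD_PROCESS_SHARED);
    pthread_condattr_init(&ca);
    pthread_condattr_setpshared(&ca, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(&s->mu, &ma);
    pthread_cond_init(&s->cv, &ca);
    s->ready = 0;

    if (fork() == 0) {                    /* child: the receiver */
        pthread_mutex_lock(&s->mu);
        while (!s->ready)
            pthread_cond_wait(&s->cv, &s->mu);
        pthread_mutex_unlock(&s->mu);
        printf("child: data is ready\n");
        _exit(0);
    }

    pthread_mutex_lock(&s->mu);           /* parent: the sender */
    s->ready = 1;
    pthread_cond_signal(&s->cv);
    pthread_mutex_unlock(&s->mu);
    wait(NULL);
    return 0;
}
```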
Completely agree. In particular, POSIX is built around the inheritance of file descriptors by their children, which means that it is extraordinarily easy to have sockets going between multiple processes. Moreover, it's entirely possible to send file descriptors over other sockets (SCM_RIGHTS). POSIX has robust IPC. I'm currently messing around with IPC on Windows... and wow, at the end of the day, even in 2023, UNIX et al are simply more advanced than Windows. It's unfortunate there's been absolutely zero groundbreaking discoveries or inventions in this field (OS dev), but the idea that we should throw away the state of the art simply because is just silly.
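For the curious, the SCM_RIGHTS send side looks roughly like this (a sketch; `sock` is assumed to be an already-connected AF_UNIX socket, error handling trimmed). The kernel duplicates the descriptor into the receiving process:

```c
/* Sketch: pass an open fd to another process over an AF_UNIX socket.
 * `sock` is an assumed connected Unix-domain socket. */
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

int send_fd(int sock, int fd_to_send) {
    char byte = 0;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    char cbuf[CMSG_SPACE(sizeof(int))];
    memset(cbuf, 0, sizeof cbuf);

    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = cbuf, .msg_controllen = sizeof cbuf,
    };

    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type  = SCM_RIGHTS;               /* "pass these fds" */
    cm->cmsg_len   = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &fd_to_send, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}
```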
POSIX IPC has withstood the test of time. There is no other system that offers as rich a set of primitives.
Since windowing is built into the NT kernel, it also has support for powerful clipboard operations as one crucial type of IPC in a graphical environment (see 1st link).
I’ve been developing professionally on Linux for >15 years, and while I do like its simple aesthetic, the consistency and power of NT kernel APIs are something I miss.
There are some legitimate issues here, and some ranting.
First, memory models. The author seems to be arguing for some way to talk about data independent of where it's stored. The general idea is that data is addressed not with some big integer address, but with something that looks more like a pathname. That's been tried, from Burroughs systems to LISP machines to the IBM System 38, but never really caught on.
All of those systems date from the era when disks were orders of magnitude slower than main memory, and loading, or page faulting, took milliseconds. Now that there are non-volatile devices maybe 10x slower than main memory, architectures like that may be worth looking at again. Intel tried with their Optane products, which were discontinued last year. It can certainly be done, but it does not currently sell.
The elephant in the room on this is not POSIX. It's C. C assumes that all data is represented by a unique integer, a "pointer". Trying to use C with a machine that does not support a flat memory model is all uphill.
Second, interprocess communication. Now, this is a Unix/Linux/Posix problem. Unix started out with almost no interprocess communication other than pipes, and has improved only slightly since. System V type IPC came and went. QNX type interprocess calls came and went. Mach type interprocess calls came and went. Now we have Android-type shared memory support.
Multiple threads on multiple CPUS in the same address space work fine, if you work in a language that supports it well. Go and Rust do; most other popular languages are terrible at it. They're either unsafe, or slow at locking, or both.
Partially shared memory multiprocessors are quite buildable but tough to program. The PS3's Cell worked that way. That was so hard to program that their games were a year late. The PS4 went back to a vanilla architecture. Some supercomputers use partially shared memory, but I'm not familiar with that space.
So those are the two big problems. So far, nobody has come up with a solution to them good enough to displace vanilla flat shared memory. Both require drastically different software, so there has to be a big improvement. We might see that from the machine learning community, which runs relatively simple code on huge amounts of data. But what they need looks more like a GPU-type engine with huge numbers of specialized compute units.
Related to this is the DLL problem. Originally, DLLs were just a way of storing shared code. But they turned into a kind of big object, with an API and state of their own. DLLs often ought to be in a different protection domain than the caller, but they rarely are. 32-bit x86 machines had hardware support, "call gates", for that sort of thing, but it was rarely used. Call gates and rings of protection have mostly died out.
That's sort of where we are in architecture. The only mainstream thing that's come along since big flat memory machines is the GPU.
Few if any programs actually used segmented memory in MS-DOS as segmented memory. C compilers of the era combined both fields into 20-bit pointers. Pointer arithmetic was kind of messy but had compiler support.
> Some supercomputers use partially shared memory, but I'm not familiar with that space.
Some supercomputers do have some shared memory architecture, but a whole lot more use Message Passing Interface (MPI) for distributed memory architecture. Shared memory starts to make less sense when your data can be terabytes or larger in size. It is a lot more scalable to just avoid a shared memory architecture and assume a distributed memory one. It becomes easier to program assuming that each thread just does not have access to the entire data set and has to send data back and forth between threads (pass messages).
>> Multiple threads on multiple CPUS in the same address space work fine, if you work in a language that supports it well. Go and Rust do; most other popular languages are terrible at it.
I find OpenMP for C and C++ to be simple and effective. You do have to write functions that are safe to run in parallel, and Rust will help enforce that. But you can write pure function in C++ too and dropping a #pragma to use all your cores is trivial after that.
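Something like this, a sketch of the "drop a #pragma" step (compile with -fopenmp; the loop body is pure, so the split across cores is safe):

```c
/* Sketch: parallelizing a pure loop with one OpenMP pragma.
 * Build with: cc -fopenmp sum.c */
#include <omp.h>
#include <stdio.h>

int main(void) {
    enum { N = 1 << 20 };
    static double a[N];
    double sum = 0.0;

    for (int i = 0; i < N; i++)
        a[i] = i * 0.5;

    /* Iterations are independent; the reduction clause handles the
     * one shared accumulator safely across all cores. */
    #pragma omp parallel for reduction(+ : sum)
    for (int i = 0; i < N; i++)
        sum += a[i];

    printf("%f (max threads: %d)\n", sum, omp_get_max_threads());
    return 0;
}
```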
So this author is (rightfully) getting a lot of hate. But, "What if we replaced POSIX?" is an interesting question to me. Most people interact with POSIX through the "good parts." But, once you need to write C, which happens when you need to make low-level calls to the OS, it starts to get a little annoying. The biggest annoyance is related to memory management. A lot of the OS APIs have manual free functions, e.g. you call `create_some_os_struct` and then `free_some_os_struct`. Similarly, you end up writing a lot of your own call/free functions. This is because you need to write C to talk to the OS, but your other code is probably not in C and may not have access to the same libc your C uses. So, you need to provide an escape hatch back into your C code to free any allocated memory.
Another annoyance is that passing data between C and another language is hard. For instance, if you want to pass a Swift string in to C, you need to be careful that Swift doesn't free the string while C is using it. The "solution" to this is to have explicit methods in Swift that take a closure which guarantee the data stays alive for the duration of that closure. On the C side, you need to copy the data so that Swift can free the string if you need to keep it longer than that one block. Going from C to Swift is also a pain.
A cool thought is: what if the OS provided better memory management? What if it had a type of higher level primitive so that memory could be retained across languages? For instance, if I pass a string from C to Go, why do I need to copy it on the Go side? Why can I not ask the OS to retain the memory for me? Perhaps we need retain / release instead of malloc and free. Anyway, just a random thought.
The problem with this thought is that malloc/free are not OS primitives, they are strictly concepts that make sense to your own program. Languages like Swift and Go never use these calls at all, for example. When Swift "frees" a string that was still being referenced from C, it's very likely not the OS that will mess with it, but other parts of the Swift program.
The way programs actually interact with the OS for memory allocation is using sbrk() or mmap() (or VirtualAlloc() in the case of Windows) to get a larger piece of memory, and then managing themselves at the process level.
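Roughly like this (a sketch, error handling elided): the kernel hands out one big anonymous mapping, and everything the program calls "malloc" happens inside it without the kernel's involvement:

```c
/* Sketch: how a runtime gets memory from the kernel -- one anonymous
 * mapping, carved up by the program's own allocator. */
#include <stdio.h>
#include <sys/mman.h>

int main(void) {
    size_t arena_len = 64 << 20;    /* a 64 MiB program-managed arena */

    char *arena = mmap(NULL, arena_len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (arena == MAP_FAILED)
        return 1;

    /* A trivial bump "malloc": the kernel never sees these calls. */
    size_t used = 0;
    char *p = arena + used; used += 4096;
    char *q = arena + used; used += 4096;
    p[0] = 'a'; q[0] = 'b';
    printf("%c %c, used %zu bytes of the arena\n", p[0], q[0], used);

    munmap(arena, arena_len);
    return 0;
}
```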
And having the OS expose a more advanced memory management subsystem is a no-go in practice because each language has its own notions of what capabilites are needed.
In my opinion, given that the C-ABI is pretty much the only cross-language interface for transferring in-memory data between components written in different languages, it is a pretty big blind spot not to include in a new language a C-safe way of transferring memory.
Yes, you can open a socket or file and transfer your data that way -- so someone already had to implement the memory-data-protection mechanisms for the language. Just expose them!
I don't think I suggested otherwise. I don't personally know of any non-esoteric language that doesn't support C interop - from Haskell to Common Lisp to Go to Java.
What I'm pointing out is that the kernel can't help with C interop in any way, since process memory is by-and-large not managed by the kernel - certainly not at the level of individual objects. The problem of sending a Swift string to C and making sure it doesn't get "freed" by Swift is that the Swift runtime might overwrite that string if it is "freed", even though it's still being used by the C code, not that the kernel might mess with it.
Even more, the risk when sharing memory with another language runtime is not even limited to freeing. Many common languages have GCs which move in-use objects around in memory. This obviously can't work if that object is actually shared with a different runtime. In those languages, the object must not only be protected from being freed, but also from being moved around -- hence the common concept of pinning it in memory.
I think this makes it even more obvious that the OS can't help with this task - unless we want the OS to standardize on a GC that it imposes on all programs, and essentially no one wants that.
So POSIX is both the libraries as well as the general system design. If you want to eschew all the POSIX libraries on most *NIX systems today (at least the open source ones), you can simply ... do that. In particular, the Linux kernel (and the BSDs are similar) make no assumption as to how you're managing user memory. You can call mmap to map pages and allocate memory as you like.
In fact, languages such as Go widely disregard libc (IIUC) and just roll their own thing. They still benefit from the POSIX semantics built in to the kernels that go programs run on.
At the end of the day, the main interface between POSIX kernels and userspace is a 32-bit integer (the file descriptor).
The letter isn't coherent, and I think people are reacting to that. The thesis of the letter is that "POSIX is the reason your code has annoying low-level plumbing and knobs needing attention," but doesn't explain how removing POSIX will help with that stuff. E.g. "new schedulers have to be built to handle the fact that memory is not all one thing." What does that even mean? Maybe he has a vision in his head, but it's not well articulated.
Also, the letter ends with this:
> If we are to write programs for such machines, it is imperative to get the Posix elephant off our necks and create systems that express in software the richness of modern hardware.
But such a system could still have annoying low-level knobs that need turning.
> The letter isn't coherent, and I think people are reacting to that.
I don't agree about the quality of the article, but it's the "hate" (your word) I was reacting to when I made my comment. Some of the reactions here are genuinely over the top. Add in the fact that the author of the ACM piece knows the POSIX subject matter a lot better than the people commenting, and it's just... classic late stage hn, I guess.
I’m sure he does know POSIX better than most, but that doesn’t mean we should laud any drivel he produces.
I think people are reacting to the provocative idea combined with lack of substance in the response. But the letter itself was probably an off-the-cuff remark and wasn’t intended to be consumed and debated analytically.
To those bashing the author as uninformed -- this is George V. Neville-Neil. Member of FreeBSD Core Team who wrote the book on FreeBSD. He might know a thing or two about POSIX! [1]
It's a bad article because it's too vague and doesn't clearly relate to the questioner's problem, not because the author doesn't have the proper pedigree.
One could certainly write good articles about why the POSIX API is too limiting. For example: the filesystem API is awful in many ways. I'll try to be a bit more specific (despite having only a few minutes to write this):
* AFAICT, it has very few documented guarantees. It doesn't say sector writes are atomic, which would be very useful [1]. (Or even that they are linear as described in that SQLite page, but the SQLite people assume it anyway, and they're cautious folks, so that's saying a lot.) And even the guarantees I think its language does make, like fsync ensuring that all previously written data for that file has reached permanent storage, systems such as Linux [2] and macOS have failed to provide [3].
* It doesn't provide a good async API. io_uring is my first real hope for this, but it isn't in POSIX. (See the sketch after this list.)
* IO operations are typically uninterruptible (NFS mounted with a particular mount option being a rare exception). Among other problems, it means that a process that accesses a bad sector can get stuck until reboot!
* It doesn't have a way to plumb through properties you'd want for a distributed filesystem, such as deadlines and trace ids.
* It provides just numeric error codes, when I'd like to get much richer stuff back. Lots of context in distributed filesystem cases. Even in local cases, something like where exactly path traversal failed. I actually saw once (but can't find in a very quick search attempt) a library that attempted to explain POSIX errors by doing a bunch of additional operations after the fact to narrow the cause down. Besides being inherently racy, it just shouldn't be necessary. We should get good error messages by default.
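Back to the async point: a rough liburing sketch of a single asynchronous read (Linux-only, not POSIX; error handling omitted, and the path is just an example):

    #include <fcntl.h>
    #include <liburing.h>
    #include <stdio.h>

    int main(void) {
        struct io_uring ring;
        io_uring_queue_init(8, &ring, 0);

        int fd = open("/etc/hostname", O_RDONLY);
        char buf[256];

        /* queue the read and submit it; no thread blocks on the IO */
        struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
        io_uring_prep_read(sqe, fd, buf, sizeof buf, 0);
        io_uring_submit(&ring);

        /* reap the completion whenever we like */
        struct io_uring_cqe *cqe;
        io_uring_wait_cqe(&ring, &cqe);
        printf("read %d bytes\n", cqe->res);
        io_uring_cqe_seen(&ring, cqe);

        io_uring_queue_exit(&ring);
        return 0;
    }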
It's a great article, and it raises many major issues with our current model of computing. But it's obviously triggering, and lots of people are rushing to defend their comfort zone.
Think outside the box people ... "files", what a charming but antiquated concept; "processes" and thus "IPC", how quaint!
- a windows-first programmer who sees interoperability and composition as an encumbrance that satisfies “nerds who like to do weird shit that i don’t understand in bash”
reading from stdin isn’t challenging, nor writing to stdout. if someone can’t imagine why that might be useful, then i’d argue their journey as a software engineer is either at its end, or right at its beginning.
at the cost of potentially sounding inflammatory, “get good”.
> reading from stdin isn’t challenging, nor writing to stdout
Every time I dabble in C, I need to look up what method I need to use these days. getline? scanf? Do I need to allocate a buffer? What about freeing it, is it safe to do so from another thread? What about Unicode support, can I just use a char array or do I need a string library for proper support? What's a wchar_t again and why is it listed in this example I found online? How do I use strtok to parse a string again?
Sure, these things become trivial with experience, but they're not easy. Other languages make them easier so we know it can be done, yet the POSIX APIs insist on using the more difficult version of everything for the sake of compatibility and programmer choice.
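For what it's worth, the least-bad answer to the stdin questions above is probably getline() (POSIX.1-2008), which manages the buffer for you - a sketch:

    #define _POSIX_C_SOURCE 200809L
    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        char *line = NULL;   /* getline() allocates and grows this */
        size_t cap = 0;
        ssize_t n;

        while ((n = getline(&line, &cap, stdin)) != -1)
            printf("got %zd bytes: %s", n, line);

        /* the buffer comes from malloc, so any thread may free it */
        free(line);
        return 0;
    }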
(Modern) C++ makes the entire process easier, but there are still archaic leftovers you need to deal with on *nix if you want to interact with APIs outside what the C++ standard provides. At that point, you're back to POSIX APIs, *nix magic file paths, and *nix ioctls. Gone are your exceptions, your unique_ptrs, and your std::string; back are errno and raw pointers.
Obviously an exaggeration, but when's the last time you checked the return value of printf? I know I don't. And that's not even a memory safety bug, just basic logic. I hope nobody trusts those guys around malloc and free :)
All of which is perfectly compatible with his being incompetent, or wrong about this in particular. (For what it's worth, I don't think he is incompetent.) But what he easily demonstrably isn't is "Windows-first", and I suggest that any mental process that led you to that conclusion needs reexamining.
I'm confused as to what exactly is wrong with the notion of _jobs_ and _files_? Scheduling is hard, but modern operating systems are definitely set up to do it. I think we could probably do a better job of using realtime features and maintaining update-latency benchmarks, but so many of the cycles on my PCs/mobiles are wasted doing god damned animations and updating the screen without interaction that I don't think this is really the main issue.
Programming is basically always just a matter of loading data, transforming data, and then putting it somewhere. The simplest record keeping systems do that, and the fanciest search algorithms do that. Decode/encode/repeat.
EDIT: The beauty of UNIX to me is the interoperability of the text stream. Small components working together. Darwinesque survival of the fittest command.
The beauty of UNIX to me is the interoperability of the text stream
What interoperability? Look at the man page for any simple Unix utility (such as `ls`), and count up how many of the listed command line flags are there only to structure the text stream for some other program. "Plain text" is just as interoperable as plain binary. "Just use plain text" is the Original Sin of Unix.
The Unix Philosophy, as stated by Peter Salus [1] is
1. Write programs that do one thing and do it well
2. Write programs that work together
3. Write programs to handle text streams because text is a universal interface.
The problem is that, in practice, you can only pick two of those. If you want to write programs that work together, and do so using plain text, then, in addition to doing its ostensible task, each program is going to have to provide a facility to format its text for other programs, and have a parser to read the input that other programs provide, contradicting the dictum to "do one thing and do it well".
If you want programs that do one thing and do it well, and programs that work together, then you have to abandon "plain text", and enforce some kind of common data format that programs are required to read and output. It might be JSON. Or it might be some kind of binary format (like what PowerShell uses). But there has to be some kind of structure that allows programs to interchange data without each program having to deal with the M x N problem of having to deal with every other program's idiosyncratic "plain text" output format.
Programmers tend to have a disproportionate affinity towards plain text, me included. But this is an intriguing argument so now I'm reconsidering. Maybe plain text is just someone else's unparsed junk.
This was a really insightful take on the Unix philosophy that I hadn't heard before but I intuitively agree with because of all the parsing code I've had to write.
There have been numerous projects which invent bespoke protocols for interoperability -- I give the examples of PowerShell, Elvish, and nushell.
(PowerShell doesn't use any kind of binary format AFAIK. I believe you are literally moving around .NET objects inside the CLR VM, and that representation is meaningless outside the CLR VM. This is crucial because it means that PowerShell must serialize its data structures for interoperability.)
As well as various Lisps.
The argument is how you interoperate between them. (Honest question -- please let me know.)
So ironically, trying to solve the interoperability problem in a smaller context CREATES it again in a bigger context (e.g. between different machines).
Bytes and text are fundamental because they reflect how disks, networks, and operating systems fundamentally work.
That does not mean we shouldn't have higher level layers on top of bytes and text, like JSON, HTML/XML, and TSV/CSV.
Those are structured data formats. You generally use parsing libraries for them, instead of writing the parser yourself.
Again, all of those formats ARE text, and that's a feature, not a bug!
The M x N interoperability problem is SOLVED BY building on top of bytes / plain text, not solved by moving AWAY from it!
I never argued that we should move away from plain text. Indeed, one of the examples I cited of an interoperability format, JSON, does exactly that: it takes plain text and adds structure to it to make it easily parseable by machines. Overall, I'm not sure what part of my post you're arguing against. I'm suggesting that instead of having many different ad-hoc formats, Unix utilities should agree on a few accepted serialization formats and pass information around using those. This would make our shell pipelines less complex and more robust because we wouldn't have to worry about e.g. random spaces or newlines breaking the ad-hoc parsers we write with `grep` and `cut`.
It seems like you agree with that, with the caveat that that serialization format should be built on top of plain text. That's fine. We can agree on JSON as a serialization format. It's not ideal, but it's better than the myriad of ad-hoc formats that we have now.
Sorry if I came off as "excited" -- the main thing I'm not clear on is the framing of M x N with respect to parsing.
From my post:
> I also claim that parsing is an O(M + N) problem, while types can create O(M × N) problems — and often do.
So the key point is that writing M + N amounts of code is tractable, but M x N is not. This was captured in the linked "Unix vs. Google" video as "coding the perimeter" vs. "coding the area".
This is not to say that SOME people won't be annoyed by writing M + N parsers! :) Or that writing parsers is easy. It's only claiming that it's tractable for global interoperability.
-----
Of course you could be referring to some different M x N issue, with respect to a specific application, but I don't really see it, and I'd claim it isn't the dominant issue with interoperability and POSIX.
To bring it back to the original subject, POSIX addresses the M applications x N hardware platforms issue (and again, the claim is that it's a compromise, not that it's optimal).
And of course POSIX APIs are text- / byte-oriented, unlike past and future operating systems.
It tames a code explosion and makes systems feasible. But yes, I definitely agree that we need more structure on top of text. My view is that we need a language with first-class support for serialized tables (TSV/CSV), records/objects (JSON), and documents (HTML/XML).
CSV in particular is very sloppy, and we should take at least as good care of our "languages for data" as we do languages for code (Go, Rust, Python, etc.)
The only thing I really take away from the UNIX philosophy nowadays (I used to be a dyed in the wool fan of UNIX/Linux) is #1) do one thing and do it well. I see #2 as an ideal goal to reach but not always required. And #3 is nowadays untenable for me. If we can agree on an object exchange format (something PowerShell seem to have solved in part), then we can do much much more than relying on text streams.
> The only thing I really take away from the UNIX philosophy nowadays (I used to be a dyed in the wool fan of UNIX/Linux) is #1) do one thing and do it well. I see #2 as an ideal goal to reach but not always required.
If you have a number of programs, each of which does one thing and does it well, those programs will need to exchange data between themselves in order for the overall system to be useful to the user. To go back to the `ls` example I used in my post above: `ls` should just list files. Why should `ls` have anything to do with sorting, when `sort` exists? The reason, as it stands right now, is that `ls`'s plain text output is too much of a pain to parse, and so it's more convenient to build sorting into `ls` itself. If you start with "do one thing and do it well", but ignore interoperability, then your program will inevitably grow additional options and subcommands until it does one thing well, and quite a lot of things mediocrely. Instead of a collection of small sharp specialized tools, you'll end up with, e.g. `find`.
Maybe it was written as a parody and we just don't get the joke? (Not POSIX, the article. POSIX survives fine outside of UNIX in embedded systems and somewhat on Mac OS and Windows.)
macOS at least is definitely POSIX compliant - not just somewhat. It was a big part of what was needed for Xserves to be competitive (alas, being shiny and aluminium was not :) ). While Xserves have died, macOS remains entirely POSIX compliant - complete with the horrific %n format specifier (although, IIRC, an environment variable is needed for it to be respected in non-read-only format strings).
It's arguable that macOS is actually POSIX compliant. I was frustrated that poll() simply refused to work on terminal devices in Mac OS X 10.1 and I had to use select() instead. And it's still not supported in macOS 13!
Can you claim POSIX compliance while actually not complying, by saying "BUG: currently doesn't comply" for more than 20 years?
The idea that your computer would be faster or easier to use if it didn't have any animations seems untrue. Providing physicality is good! Helps you understand how things are changing between two different states.
Plan 9 enhances all of these good Unix traits. Even the universal text stream: it adds support for arrays / lists, that is, streams with elements larger than one byte.
GNN once told me about how he had to work on Plan 9. It's an interesting topic. It's good to see how other people think about this CS topic and see if you can borrow some ideas.
It is hard to believe that this bunch of drivel was actually available from acm.org. If a 16 year old programmer came to me with this nonsense, I might take the time to gently point them in a few directions. Being on the acm.org site ... unforgivable.
And not even the obvious faults, pointed out by others here. There's the question of an apparent complete ignorance of OS research, for example systems that rely on h/w memory protection and so can put all tasks (and threads) into a single address space. But what's the actual take home from such research? The take home is that these ideas have, generally speaking, not succeeded, and that to whatever extent they do get adopted, it is incremental and often very partial. If you don't understand why most computing devices today do not run kernels or applications that look anything like the design dreams of 1990-2010 (to pick an arbitrary period, but a useful one), then you really don't understand enough about computers to even write a useless article like this one.
That's a pretty bizarre answer to what was a pretty reasonable question. I don't even see how the question and answer are related honestly. Surely the question is more about finding a better storage format for the initial ingestion or data storage and has little to nothing to do with POSIX.
Most languages have a way to slurp a file into memory in a single function call, after all. The fact that files exist shouldn't be a barrier here.
They’re using Python and C/C++ to speed up the slow bits. Can’t imagine anything more portable than that, without even having to know what POSIX is doing under the library abstractions.
When I was programming with Java EE, I was surprised how much effort had to be put in to produce the production artifacts: .war and .ear archives and so on. I had to use a build tool called Ant. I also had to write Java, JavaScript, CSS, and HTML.
On the C-side it is "make" or something similar. It means you must master multiple languages, some of which may be statically typed while others are not.
I assume integrating C with Python is similarly lots of overhead which in principle you shouldn't have to do. Why can't I just write everything in a single language and be done with it? Why do I have to write a program that transforms my source-code modules into an executable?
Software is so huge that it would take you a lifetime of programming from different perspectives to get a grip on what it really is. So we are all doomed to experience POSIX through whatever programming experience we end up getting deep in.
I feel the underlying problem with most software is that it's just too damn complex, so you can't fit enough of it in your head to design it how you think it should go. An average person can't go "Oh, I think the kernel should be able to do this" and then go whip it up and have an experiment running in a little bit. That's an esoteric corner full of tons of specialized and arcane knowledge that, truth be told, is completely invented. And half of that invention is workarounds for other bad inventions.
I dunno enough to just pronounce doom on POSIX, but I do feel like the rickety C way of doing things (everything centered around intricately-compiled machine code executables, incredibly dainty fragile eggshells that shatter and spill their entire guts of complexity on the world) underpins a ton of the problem.
The number of years you would need to just read, let alone grok all the hundreds of millions of lines of code that run on our system is just beyond human lifetimes now.
Well that was far less controversial than the comments here suggested.
I read a call for innovation, the general thrust of which is based on the (obvious?) argument that if you build every new system to be compatible with the old one you limit the capabilities of the new.
Of course, the customer requesting that compatibility gets what they want -- an easier time building/porting software.
Alternate reading: ~POSIX (or just having a standard) is so useful that everyone wants it everywhere all of the time and I want something better.
Yay - I too want something better. It will be initially difficult, completely incompatible. But I hope that, someday, only computer historians will discuss "files" and "ports".
> there could be better primitives for communication than a byte stream. Why can't I just send another process a chunk of memory?
shm APIs have made this possible for decades.
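As a minimal POSIX sketch (error handling omitted; older glibc needs -lrt), any number of processes can map the same named region:

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        /* any process that shm_open()s "/demo" sees the same pages */
        int fd = shm_open("/demo", O_CREAT | O_RDWR, 0600);
        ftruncate(fd, 4096);
        char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
        strcpy(p, "a chunk of memory shared by name");
        puts(p);
        shm_unlink("/demo");   /* drop the name; mappings stay valid */
        return 0;
    }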
> the posix byte stream API is bizarrely based on a flat buffer instead of a ring buffer, which is just the obviously wrong data structure IMO.
so obviously wrong that the terabytes of existing POSIX-compatible code is broken, or limited, or something?
> Too much of the threading API is preempted threads, when cooperative scheduling (aka yielding) is so much easier to implement and understand.
if you want/need co-routines use them, and leave threads for the domains of programming where preemption is a critical part of the model.
ps. this sounds a bit more personally critical than I intend. I'm only trying to point out flaws that I see with these 3 points, not trying to suggest anything about you as the person who made them.
> so obviously wrong that the terabytes of existing POSIX-compatible code is broken, or limited, or something?
You could say the same thing about the terabytes of existing Javascript code, and most people will agree that Javascript has more than its share of "obviously wrong".
> so obviously wrong that the terabytes of existing POSIX-compatible code is broken, or limited, or something?
it is just a really annoying API for a byte stream, when what you typically want to do with a byte stream is:
* read some number of bytes (lines, a parser, etc), incrementing the ring tail as you go.
* have the buffer asynchronously be filled, incrementing the (volatile) head as it goes
You might be able to just execute a single syscall "fill till eof" and the kernel would just keep your buffer full (through interrupts) without any other calls at all (although you would have to wait if the buffer became empty).
Similar mechanisms could be done for interprocess communication.
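A toy sketch of the data structure being described (single producer, single consumer; names are illustrative, and a real version needs atomics or futexes rather than volatile):

    #include <stdint.h>

    #define RING_SIZE 4096                 /* power of two */

    struct ring {
        uint8_t buf[RING_SIZE];
        volatile uint64_t head;            /* bytes ever filled in */
        volatile uint64_t tail;            /* bytes ever consumed */
    };

    /* consumer side: pop one byte, or -1 if the filler hasn't caught up */
    static int ring_pop(struct ring *r) {
        if (r->head == r->tail)
            return -1;
        uint8_t b = r->buf[r->tail % RING_SIZE];
        r->tail++;                         /* frees space for the filler */
        return b;
    }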
> - there could be better primitives for communication than a byte stream. Why can't I just send another process a chunk of memory?
Isn't this just
write(fd, buffer, size_of_buffer)?
I'm not sure why everyone talks about byte streams being the basis here. If fd is a socket in SOCK_DGRAM mode, then buffer is received wholesale, never split. Bytestreams are not the fundamental abstraction. Files and sockets are.
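A quick sketch of that boundary-preserving behavior with a datagram socketpair (error handling omitted):

    #include <stdio.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void) {
        int pair[2];
        socketpair(AF_UNIX, SOCK_DGRAM, 0, pair);

        const char msg[] = "one whole chunk";
        write(pair[0], msg, sizeof msg);   /* one datagram, never split */

        char buf[64];
        ssize_t n = read(pair[1], buf, sizeof buf);
        printf("received %zd bytes in a single read\n", n);
        return 0;
    }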
> - the posix byte stream API is bizarrely based on a flat buffer instead of a ring buffer, which is just the obviously wrong data structure IMO.
Once again, this depends on the mode of the socket. you can put a Unix socket into a mode where it starts dropping packets when the queue is full.
> - Too much of the threading API is preempted threads, when cooperative scheduling (aka yielding) is so much easier to implement and understand.
Cooperative scheduling was part of POSIX: setcontext, makecontext, getcontext, and swapcontext (marked obsolescent and dropped from POSIX.1-2008, but still shipped almost everywhere).
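A minimal sketch of those calls: main and a coroutine hand control back and forth explicitly, with no preemption involved (ucontext is deprecated on some platforms, e.g. macOS, so treat this as illustrative):

    #include <stdio.h>
    #include <ucontext.h>

    static ucontext_t main_ctx, coro_ctx;
    static char stack[64 * 1024];

    static void coro(void) {
        puts("coro: first slice");
        swapcontext(&coro_ctx, &main_ctx);   /* explicit yield */
        puts("coro: second slice");          /* returning follows uc_link */
    }

    int main(void) {
        getcontext(&coro_ctx);
        coro_ctx.uc_stack.ss_sp = stack;
        coro_ctx.uc_stack.ss_size = sizeof stack;
        coro_ctx.uc_link = &main_ctx;        /* where coro lands on return */
        makecontext(&coro_ctx, coro, 0);

        swapcontext(&main_ctx, &coro_ctx);   /* run coro until it yields */
        puts("main: between slices");
        swapcontext(&main_ctx, &coro_ctx);   /* resume coro to completion */
        puts("main: done");
        return 0;
    }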
It's time for these people to get off their high elephant and write or fund a substitute, instead of doing nothing and whinging constantly just for the sake of gathering imaginary internet points.
The author of this piece is a long time FreeBSD contributor[1]. Just maybe that person has relevant experience guiding their thoughts? He certainly is not "doing nothing and whinging constantly". Posix is not a religion - it's ok to look at other ideas, to listen to critiques. I mean Linux does it all the time - io_uring, all the ebpf stuff, a dozen forms of kernel bypass suggest to me that the posix interfaces aren't sufficient, or right for modern hardware, at least not in all cases.
and yet somehow, he barely manages to mention a single non-POSIX OS API in the entire piece, either existing or posited. He writes as if we're living under the spell of POSIX ("the elephant") when in fact it is perfectly clear that we are slowly, cautiously, prudently, incrementally adopting new models, some of which complement the existing POSIX API and some of which do not.
All the things I listed are ways one OS has gone away from POSIX for high-end server use cases. For some systems I work on, those have actually simplified a lot of stuff - I mean they are complex APIs solving hard problems, but it doesn't feel like fighting the OS to use them.
The author is writing in the context of people who use code very differently than I do - they think about it differently too. Perhaps there are different types of API for their use cases that would mean less "fighting with the OS" for them - or, more likely, for the tool makers who build the intermediate code/frameworks.
> For some systems I work on, those have actually simplified a lot of stuff - I mean they are complex apis solving hard problems, but it doesn't feel like fighting OS to use them.
I am entirely in favor of domain-specific APIs that "make hard problems easier to solve". But it is important to recognize two things about this:
1. the "complex APIs" that are right for solving one set of hard problems are frequently not the right ones for a different set of hard problems.
2. POSIX never made any claim to universality in the domain of being the right API for a given problem class. As someone else noted here in the comments, it was a bunch of people & organizations simply agreeing on the low level stuff they could agree on.
I'd make one additional point: unless the plan is to reimplement the kernel for every problem domain, which seems pretty crazy, the actual situation faced by many/most real world programmers even in a better world is going to be "APIs we want to use layered on top of APIs we (mostly) don't want to use". This is necessarily so, because the APIs that make it possible to implement the APIs that whiners like the author of TFA want to use, along with the APIs that domain specialists want, are generally APIs like POSIX that people don't seem to warm to very strongly.
1. Yes, that's something i recognized explicitly above.
2. Sure, but that provided enough foundation for other ideas like epoll/kqueue to spread between OSes. It happened because everyone saw the strange non-POSIX thing (I think at Sun?) and made a similar system. Even though they aren't directly API compatible, they were conceptually close enough that people ended up building nice layers on them (libuv, mio, etc). Maybe that would be enough.
Perhaps even, the right answer to several of the points raised is in fact io_uring like apis spreading to other OSes too, so that a reasonable cross platform abstraction layer can be built. Perhaps it's something different.
Perhaps it's creating a new standard, or a new POSIX version or addendum or whatever.
All of this aside, I'm not here to champion the specifics. I'm pointing out that it's a reasonable discussion to have, there's no need to be upset at someone for pointing it out.
On the other hand, I see quite a few projects challenging The Way Things Are Done (Rust, NeoVim, Fish, OilShell, etc) to which there is a lot of kicking and screaming that things are fine. No need to change.
> I see quite a few projects challenging The Way Things Are Done (Rust, NeoVim, Fish, OilShell, etc) to which there is a lot of kicking and screaming that things are fine.
What you often see are half-baked, trivial re-implementations of some parts of existing utilities and then the proclamation "Rust is here and it's so much better!".
A program that amounts to some undergrad's semester project isn't going to be taken seriously - and they often are not. The existing utilities are decades old, are very good at what they do, and rarely are the cause of the supposed bugs we're trying to avoid.
Shell replacements, such as OilShell, can gain traction on their own, because a shell is mostly system agnostic. The struggle here is reaching critical mass of users, while also supporting all of the millions of existing shell scripts out there in the wild.
No one is going to make your new wiz-bang shell the system default on a popular distro without it being completely compatible with existing shell scripts, documentation, examples, etc. It's just a non-starter...
> No one is going to make your new wiz-bang shell the system default on a popular distro without it being completely compatible with existing shell scripts, documentation, examples, etc. It's just a non-starter...
False. zsh is not POSIX nor fully compatible with bash, while being default login shell in popular distros like Kali and Deepin (and macOS).
Not sure if Garuda Linux passes the "popular" threshold for you but their default is fish.
Shell scripts typically lead with a shebang, making your login shell inconsequential as long as you have the target shell available.
> False. zsh is not POSIX nor fully compatible with bash, while being default login shell in popular distros like Kali and Deepin (and macOS).
Kali is a purpose-built distro that is not a daily driver for nearly anyone.
Deepin is a Chinese distro, and while it may be popular to some, it is not a popular distro at large.
My point remains standing.
> Shell scripts typically lead with a shebang, making your login shell inconsequential as long as you have the target shell available.
Of course - but having multiple shells installed is not common. What's the point of using another shell if all of your scripts have to use bash anyway? You are still effectively using Bash, and have unintentionally exposed the problem with all non-POSIX compliant shells - they don't allow people to get work done.
> Excuse me, how is that in any way relevant here? Fuck off with that.
What on earth are you talking about?
Deepin Technology is a wholly owned subsidiary of UnionTech, a Chinese company based in Wuhan China.
It is a Linux distro developed in China for a domestic audience, i.e. used predominantly in China by Chinese citizens. It is a Chinese Linux distro...
It may be popular in China, but it is not globally popular. Very few people would willingly choose to run an operating system developed in China for domestic consumption, for very obvious reasons.
Pretend to be offended by something else...
> Run 'chsh' and tell me how many options you see.
Ya... only bash available on my system... a default installation of Rocky Linux.
Usually people try to ensure they're correct before saying someone is wrong.
> Ya... only bash available on my system... a default installation of Rocky Linux.
Rocky Linux, as you know, is a RHEL clone. RHEL is a very conservative distro; their original market was refugees from Solaris and other commercial UNIX distros. And I say that as a diehard fan of a Red Hat/Fedora minimal installation as both a server and development platform.
Other less commercial distros (Debian and derivatives, Arch, Gentoo, etc) tend to offer more shell options out of the box, as do most of the BSDs. And other options are not hard to install from the Red Hat/Fedora repos, as you would know.
> What you often see are half-baked, trivial re-implementations of some parts of existing utilities and then the proclamation "Rust is here and it's so much better!".
Rust itself is an example. The concepts it's based on predate it by decades. When it comes to safety it's not as good as other languages, and yet...
One would argue that it's time to build something new. Just look at the massive popularity of VS Code - people clearly want the modularity and power of Vim but none of the cruft or baggage.
You don't have to do that. If you're writing against Linux specifically, and your use case could make great use of Linux-specific features, use them! I've certainly done that in the past.
But I've also written a whole lot of software which really couldn't care less about the OS. It needs some way to list directories, some way to read and write files, maybe some way to communicate over the network, but everything interesting happens in code that's not related to interacting with the OS. In those cases, I love that I can just write against one API and know that, if it doesn't already just work on all posixy systems, it's at least going to be easy to port.
Part of working on something is thinking it through. Part of working on something… as eminently social as interoperable systems with thousands of disparate components and many more thousands of contributors to some parts of that milieu… is socializing the thought process you’re taking to approach whatever part(s) of the Problems you’re trying to address.
Discussing these things earnestly and with considered and well reasoned inspection of what is, and why it is, and what could be… all of that is doing something. Maybe quite a lot in fact. Certainly it’s doing more than efforts like this, to quash or interrupt that public sharing of thought process.
“Doing” is an incredibly broad category of action. But it very seldom includes categorically dismissing ideas on the basis they haven’t already produced some other more concrete deliverable.
Don't disagree. I'm not dismissing the sentiment in general. But these kinds of articles have been posted here hundreds of times over the years, these thoughts are not new, and the fact that they fail to gain enough effective traction in the real world and are just posted here for brownie points is where it starts getting grating.
For every one person who posts one of these, there are 100 people cheering them on, and 10,000 developers in the real world who just learnt to deal with POSIX and its warts years ago, just want a paycheck, and will wait until a better API gains traction in the real world before they go asking their employer to bet the farm on it.
The thing I’m objecting to is framing the discussion as brownie points. For something or some things this complex, you either can’t or more likely should not proceed without a lot of talk. Here is a great place for productive conversation to start to form consensus. It can’t do that if planning and philosophizing and design and critique of ideas is precluded.
You're getting downvoted but you're not wrong. This article also ignores the fact that we had rich data-structured storage systems and transparent resource sharing, both built into the VMS system. POSIX-style systems absolutely destroyed the market for them, because at the end of the day they're relatively inflexible and "open, read, write, close" is sufficient to build whatever else you need.
Speaking as a working data scientist, even well-structured data usually needs heavy manipulation before it's suitable for use, and modern datasets absolutely do not fit into the system memory of a typical workstation. Secondary storage is now just an nvme cache for what we might as well call "tertiary storage," which is in the cloud, and just as unreliable as the old pallet full of tapes (not because the vendor is screwing up, but because there's a lot of network between here and there).
The problems haven't meaningfully changed in practice, they've just sort of shifted around so that different ones loom large. Unless you win the lottery and find a consistent, reliable source of clean data, POSIX isn't ever going to be the bottleneck.
EDIT: HN won't let me reply, so I'm editing this post in response to haunter's question below:
For the filesystem stuff, https://en.wikipedia.org/wiki/Files-11 and be sure to read the section about "Record Management Services" which sort of blurs the line between database and filesystem.
For resource sharing: https://en.wikipedia.org/wiki/VMScluster and remember that thanks to record-based file storage, locking can happen on the record level -- imagine instead of having to lock a whole JSON document for writing, you could lock just the key/value pair you were updating, and other programs could read/write the rest of the file safely!
POSIX was largely an attempt to get everyone to agree on the bits that everyone agreed on, which turned out to be not a lot beyond the basics.
There was no political will to standardize the layers above POSIX and the standards that attempted to do that are dead and buried (in some cases for good reasons).
20 years later there still isn't the political will towards standardization.
The original article is correct in that our programming models do need to be updated, but I think it's a stretch to blame it on POSIX.
Don't know about VMS, but I've used "data-structured systems": old Macs with resource forks, and Symbian with its databases.
Thank god they are gone.
I don't want to have to learn a special archival tool just for one system. I want to be able to take any data file and back it up, make a copy, send it to a different system, encrypt, archive, checksum, rename, delete, compare...
If you want a database instead of a file, there are plenty! An SQLite library running over a POSIX kernel is superior to an SQLite-like database in the kernel, especially in terms of compatibility and features. Your "record level locking" is already possible; it's called "a database", and it is fine with POSIX.
> For resource sharing: https://en.wikipedia.org/wiki/VMScluster and remember that thanks to record-based file storage, locking can happen on the record level -- imagine instead of having to lock a whole JSON document for writing, you could lock just the key/value pair you were updating, and other programs could read/write the rest of the file safely!
That could exist right now.
But either you want to think about the "plumbing" under the hood that makes that possible, or you don't. One way or another it has to be there. Even as hard as Apple has tried to pretend that their OS(es) don't have a BSD/POSIX/Mach kernel, most developers (certainly the ones who have to pay attention to cross-platform development) end up bumping into it somewhere along the way.
> Secondary storage is now just an nvme cache for what we might as well call "tertiary storage," which is in the cloud
This might be true for some sort of computing, but it is absolutely not the general rule, and I would wager that it never will be. There's nothing you can do on the cloud that you cannot do on local storage except provide remote access more simply.
Every single smartphone on the market uses internet-based storage as tertiary storage. You can absolutely do data manipulation on these data stores independently of client device operation, and most do (photo resolution downgrades being the most common I am aware of).
And I agree about the plumbing, but this is a discussion around the environment of primitives commonly available, and I'm merely pointing out that richer environments than posix have already failed out of the market once.
> we had rich data-structured storage systems and transparent resource sharing, both built into the VMS system. POSIX-style systems absolutely destroyed the market for them
Can you expand on that part? Sorry I’m not an expert but curious about this part of the OS history. Or is there an article (Wikipedia) about it?
More accurately, it was cheap commodity hardware (PCs), and cheap software (open Unix source code, leading to Linux) that rapidly became powerful and "good enough" over time that eventually destroyed both the expensive workstation market as well as most very expensive high-end computing architectures.
Take VMS, for example: there is an OpenVMS finally, but decades too late, and it's still kept behind a gate, ensuring its demise.
Only mainframes survived from the previous era, probably due to their vastly different requirements and customers. Banks/govts for example, have no patience for "move fast and break things" and plenty of money to spare.
The "Worse is Better" piece describes the process from a design/implementation standpoint while the book the "Innovator's Dilemma" explains the process from an economics one.
> POSIX won because it was first. Simple inertia keeps it there.
UNIX only started wide distribution in 1975. The first POSIX standards were not proposed until 1988. To pretend otherwise is to attempt to rewrite history. There were complaints about UNIX API's long before everyone standardised on POSIX. Not before, not during, not after, have the elephant riders really gotten down and gotten to work. And where they have, as others have pointed out, it's things like io_uring which have emanated from the Linux community itself.
Embedded APIs may be better in many respects, but that does not mean that any particular vendor's implementation should be the new basis for a standard - especially if they cannot or will not support the existing deployments out there. Software developers love a clean slate; solution providers and integrators need to be able to deploy something to their customers and be able to guarantee interoperability etc. It's a pipe dream neutered by the practical reality of the real world.
Nobody is arguing that they are completely wrong - it's just that it's been giving off a "young man shouts at cloud" kind of smell for a long time, and it's not really that useful other than to give other people who have been hurt by POSIX a place to vent before they go off for another scrum, kanban, retrospective, or long lunch and forget about it until the next article.
> POSIX won because it was first. Simple inertia keeps it there.
Not true. OS/360 was released years before the first Unix (OS/360 in 1966; Unix development started in 1969). Multics' development also preceded Unix (indeed, the developers of Unix started by working on Multics).
It's easy to whine, but Unix's interfaces also work for a wide variety of situations.
If you think you have a better idea, implement it & try to convince others to use it. That's how things get better.
Yeah, io_uring is an interesting step in the right direction. There's also stuff like SPDK, which doesn't really satisfy the original submitter's desire for "less plumbing," but is certainly better at some workloads than traditional POSIX IO. And of course, Windows NT got this right (or at least, a lot closer to right) back in the 90s.
It's the right attitude, but you can't just start writing an arbitrary replacement and succeed. You have to understand real needs, prioritize things to make the scope realistic, and have some vision, besides technical chops. Then you have to battle-test it.
I suppose that a real force of change would gather around some real projects with specific needs, which eventually would produce a common, shareable "better alternative". Examples: io_uring, ZFS, containers, HTTP/2, every successful programming language, etc.
But that would require effort, organising and building!
I come here for the snarky comments and the “walking away feeling good” feeling, not actual action! /s
This post feels like it falls into the XKCD trap of “soon: there are 15 standards”
If you remove POSIX, what will replace it? If the answer is "nothing", then how on earth do you compete with a "build once, run 99% everywhere else" mantra?
You’d end up with IE6 and ActiveX all over again, or HFS resource forks
It never works like that. First there were implementations, often several, then they were standardized: POSIX, Common Lisp, HTML, JavaScript, etc.
Whatever will supplant POSIX is going to be something that will have been proven to work already, ready to be standardized. If it's not there yet, POSIX can reign supreme, despite all its (real) flaws.
You don't have to use just the POSIX-based APIs. There are io_uring, XDP, BPF, and more provided by Linux, for instance, exposing many more capabilities to userspace beyond POSIX-compliant APIs. You don't have to use sockets to do networking, and you can get asynchronous local disk IO. What is the author waiting for, if the POSIX standard being too restrictive is the root of all of their problems?
Kind of meta, but wow, "get the elephant off our neck" might be the first example of mixing three metaphors together. The only mention of "neck" that I can find is in the title, and the article only mentions getting it off our "backs", which is mixed with the metaphor of "elephant in the room". The only explanation I can think of for why "neck" would be used is the "albatross around your neck" metaphor.
Based on the other responses here, it sounds like the author should keep in mind that you can't get your elephant off your neck and eat it too, but maybe I shouldn't open that can of worms lest I attract some early birds.
Playing with Windows now for some cross-platform work, trying to port my Unix domain socket service and client over... What a mess. Win32 is an awful mess of an API. The named pipes are a sad shadow of domain sockets. Anonymous pipes are completely useless (can't run async), thus we have to name the copious number of pipes. What hell is this...
Funny that the author calls out Windows as being separate somehow. Hasn't the last 20 years of Windows been synonymous with adding more Posix-compatibility?
I have been unpleasantly surprised when I expect Posix semantics from Windows.
Think about parent PIDs. You can't assume a process's parent PID is actually still the parent.
Think about file open semantics. Try to open to a file when any other process has the file open with no sharing allowed.
Think about file delete. (Really, try to delete a file reliably in Windows.)
Think about string encoding and code pages. (Well, this is out of bounds for Posix, but closely related to expected filename semantics.)
Has it? From COM to C++/WinRT, I think Windows is moving further and further away from simple POSIX calls. Of course these platforms have mainly been designed to be used by other languages (VBScript to PowerShell) but if you want to use newer APIs in your C++ application, you're going to need to go through the new API.
Bolting Linux in a thin VM onto the OS has been done to facilitate web/POSIX devs that otherwise would buy a Mac or install Linux. It's a more modern take on bolting Windows Services for UNIX onto NT4 back in '99. The API of the Windows platform itself has only moved further and further away from POSIX as time went on.
Windows has ostensibly been, or at least has been claimed to be, “POSIX compliant” for much longer than that. I think however technically true that might be, a good heuristic for why Windows is “other” in this categorization is how frequently things are reported as broken in Windows environments when they don’t have any other platform-specific issues across many *nixes. Sure that divergence might not be specific to POSIX, but a more general “every other system from every other vendor agrees broadly on ___ but if you use Windows your environment needs special accommodations.” Anecdotally, the surface area of this divergence is basically equivalent to any project which constructs file paths, and/or executes subcommands, unless special care was taken prior to address Windows specifically.
And Windows is certainly entitled to its own particularities and idiosyncrasies! But it’s not entitled to have those and be regarded as not having them at the same time. That would be absurd.
> We should be able to assume that the data we want exists in main memory without having to keep telling the system to load more of it. APIs for managing threads and access to shared memory should be re-thought with defaults created for many-core systems, and new schedulers have to be built to handle the fact that memory is not all one thing. Modern systems have fast caches near CPU, main memory, and flash memory. Soon we'll have even more memory, disaggregated, which is faster than disk but slower than main RAM.
Isn't this what Intel was trying to do with Optane? What happened with that? It seems like a great idea.
Lots of issues but mostly it didn't meet the sky high initial expectations Intel set for the product. It might've met expectations with the current generation of the product combined with CXL, but now it's cancelled and we'll never know!
Rumor has it that Optane was barely producible at all, much less profitably. They bet on a shrink that never came. That could be because R&D screwed up (Intel's process teams have not exactly impressed in recent years), because management wouldn't bear the necessary costs, or because it was just too difficult to scale (rumors also say there's an element of truth to that, the Optane fab process was weird).
People love parroting the meme of "Unix is old therefore it must be bad, we need something newer and modern, redesigned from the ground up" but it's always hard to get a concrete proposal from these folks. This article does a lot of hand waving about what's wrong with Unix and what would be better.
For example, the part about how it would be nice to assume data exists in main memory, instead of accessing it through the filesystem? That's existed for nearly 40 years, it's called mmap, and every modern operating system implements it. You can trivially write a wrapper that automatically mmaps opened files if you want, but creating a MAP_FILE memory mapping is one of the easiest things you can do in C.
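The wrapper really is small - a sketch (error paths simplified; `slurp` is just an illustrative name):

    #include <fcntl.h>
    #include <stddef.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* map a whole file read-only and return it as plain memory */
    void *slurp(const char *path, size_t *len) {
        int fd = open(path, O_RDONLY);
        if (fd < 0) return NULL;
        struct stat st;
        if (fstat(fd, &st) < 0) { close(fd); return NULL; }
        void *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        close(fd);                     /* the mapping outlives the fd */
        if (p == MAP_FAILED) return NULL;
        *len = st.st_size;
        return p;                      /* munmap(p, *len) when done */
    }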
Likewise the stuff about how IPC, sockets, and shared memory assume that a resource is only going to be shared between two programs. The whole statement is weird because all of those things can be used by more than two programs, and it's not clear what the author means by saying they're designed for a single receiver/sender.

The Berkeley sockets API is admittedly a bit intimidating at first, but that's because it covers a lot of functionality that people actually need. It can handle UDP sockets and TCP sockets, but also other protocols and more exotic things like netlink sockets and SCTP sockets, and it can be extended to things like John Ousterhout's proposal for replacing TCP, as evidenced by the fact that John Ousterhout actually implemented Homa on Linux.

Using shared memory on Unix isn't exactly simple, but that's because it's doing something complicated. When multiple processes want to share memory there needs to be some coordination to establish what needs to be mapped, to grant privileges, etc. The SysV shared memory APIs suck, but on Linux things are much better now with the memfd_create system call.

And the vast majority of people who need things like sockets or shared memory are going to be writing code in a higher level language anyway, using libraries that abstract the low level details. The article implies that it's a problem that people need to use these abstractions in the first place, but that seems asinine to me when there isn't an actual proposal of how things would be better, and the hand waving completely ignores the fact that realistically, low-level C APIs are needed to have any hope of writing interfaces that can be accessed by multiple languages.
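On the memfd point, a Linux-only sketch (error handling omitted): the fd can be sized, mapped, and handed to any number of peers over a unix socket (SCM_RIGHTS), so it isn't limited to two participants:

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        int fd = memfd_create("region", 0);  /* anonymous, nameless file */
        ftruncate(fd, 4096);
        char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
        strcpy(p, "visible to every process that maps this fd");
        puts(p);
        return 0;
    }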
Furthermore the design of Unix allows new system calls to be added. If you have an idea for something that is way better than sockets you're free to implement it and add new system calls for it. This is exactly why there are many ways to do the same thing on Linux (e.g. establish shared memory regions), because people come up with new and better ideas and the ones that are actually performant, useful, and secure are the ones that get merged and added to Linux. Maybe there really are some brilliant ideas for new ways of doing things that need a ground-up redesign and can't be bolted onto Unix, but if that's the case people should be able to clearly elucidate what those ideas are and WHY they won't work with Unix.
One final thing I'll say about Unix is that for all its warts, it's very fast. This is a point that's been made a million times before, e.g. in "Worse Is Better" and "The UNIX Hater's Handbook". There have been lots of interesting alternatives to Unix developed, but the reason none of them have gained traction is that at the end of the day no one wants to use something that is newer, less well known, less well tested, and significantly slower. Plan9 and Fuchsia are cool, but they are much slower than Linux and it's not clear if it's really possible to fix them. Users want their computer to run quickly, they want their phone to run quickly, and they want good battery life. Big companies want their applications to run quickly and spend as little on hardware as possible. All of these things mean that pretty much any Unix alternative is a non-starter unless it is at least close to the performance of existing Unix implementations, including Linux and macOS I suppose. A big explanation of the success of Unix has been that it has co-evolved alongside modern hardware. It's hard to do something that is radically different and takes advantage of modern hardware when you're writing a general purpose operating system (there are absolutely exceptions though for operating systems serving niche use cases).
I'm in the "we need something newer and modern, designed from the ground up" camp and also don't have a concrete proposal (just half-baked ideas). I'm just disappointed that with all the creative people out there in the industry, we keep just reimplementing Unix after all this time.
Have we truly reached a global optimum in system design with Unix? We're done? Forever? We're just going to keep writing new layers and APIs on top of it without even considering a redesign?
I'd like to see something different just for the sake of being different. Maybe it would be better, maybe worse, but we'll never know if we don't try.
I think nobody tries because everyone knows there’s a gigantic chicken or egg problem.
Everything is written for Unix-like systems (or Windows which is not that different) and everyone is not going to port all that software to a significantly different paradigm. So even if a new OS was great, the suspicion is that nobody would use it because no apps.
In reality I think good compatibility libraries could bridge the gap, so it would not totally be futile to try. But you’d need those compatibility layers out of the box.
Another big ugly hairy problem is hardware support. Existing OSes have vast libraries of drivers all built around POSIX-like paradigms. Getting good hardware support on a new OS is going to be a ton of work, maybe more than the OS itself.
Yes, also because the stuff has free implementations. For something to beat free it will have to be significantly better to put up with the inevitable annoyances until it matures.
I think the 10X rule applies here. It would have to be 10X better in a mixture of areas like performance, programmer productivity, ease of system administration, security, UI/UX, etc.
Weird how Windows is mentioned as one of the two programming models and then discarded. Obviously people have long-standing objections to commercial operating systems and Microsoft in particular, but it would be interesting to hear whether the many non-POSIX-y parts of Windows provide a better or worse model.
WaitForMultipleObjects() was a mini-masterpiece of API design in Windows that has taken decades to surface in POSIX-y systems, and even then not quite as nicely as Microsoft's original.
Erm, shouldn't this whole article be shortened to "I'm ignorant of Apache Arrow"?
It's one thing to complain when there are no choices. It's quite another when there seem to be genuinely good alternatives in the space (Apache Arrow is not even the only option).
Compliance with standards makes zero sense if those standards are not optimal, or if they are outright garbage. Evolution of almost everything in technology is faster than the evolution of standards. And many standards were written by people dealing with really shoddy old ways of doing things.
What’s most upsetting is the number of POSIX things that have been codified when there was likely little thought given to the original implementation. We have paved all the cow paths. Like, why can you still configure tar options without prefixing a hyphen?
Probably because a lot of scripts and other software has written that calls tar without a hyphen in the options, and it seems strange to break them for no other reason than it seems better to you. If you want to change behavior and make a better tar with saner option behavior, you can do so, just don’t call it “tar”.
I fully understand how we got here, but it is still frustrating. POSIX was not a gift from the gods, but a descriptive this-is-how-thing-are. There should not be any sacred cows, and there needs to be a way to eventually break backwards compatibility and remove the cruft.
> there needs to be a way to eventually break backwards compatibility and remove the cruft.
Actually, no. There's always a cost/benefit ratio. You can easily create a beautiful, elegant system by repeatedly forcing backwards incompatibility. In fact, that has been done - see the history of "Plan 9". Even the developers of Unix couldn't just throw away backwards compatibility in the name of elegance at any cost.
I think you do sometimes want to break backwards compatibility, but there'd better be a good reason for it & the pain needs to be minimized for users. If users repeatedly undergo pain from a system, they will stop being users.
One of the most important rules for the Linux kernel is "don't break userspace".
I think "zero-copy" is the actual gem which might actually move us closer to making more of our systems have "less visible" plumbing. There's always going to be plumbing somewhere and it's not so actually directly actionable to observe that we have "too much of it". There needs to be a signal to guide a process by which we remove unnecessary necessary plumbing to obtain benefit.
I think there are significant advantages that will accrue by moving toward system designs that encourage "more zero-copy plumbing" - I'm not totally clear on all the ways the POSIX legacy might prevent that, but the parts of POSIX that do prevent it will prove to be the most important pieces to shed, I think.
Consider mmap .. your perfect candidate, it would seem, to be the basis for zero copy everything. No more read/write, your files are just chunks of memory and if you are reasonably smart about it, you can end up with either zero or just single copy semantics.
Wheee!
Then you go and look at performance, and you start to notice how some programs seem to benefit from this design, and some suffer greatly for it. You start to realize that the OS's model of likely program I/O (because let's face it, your data is not actually in volatile RAM) is sometimes right and sometimes wrong. And so you start moving either towards a model that defines the data you want to read, so that the OS can do a better job of making it all work the way you imagine it should, or you start trying to bypass the OS entirely to avoid it interfering (O_DIRECT, dear friend). But then you start realizing that your lovely zero-or-low copy design doesn't play well with others, because the OS is still working for them, and you and the OS are now stepping on each other toes.
And so it goes.
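The "model that defines the data you want to read" usually starts with hint calls like these (a sketch; how much the kernel honors them is famously variable, and hint_sequential is just an illustrative name):

    #include <fcntl.h>
    #include <stddef.h>
    #include <sys/mman.h>
    #include <sys/types.h>

    /* tell the OS we'll stream through this mapping front to back */
    void hint_sequential(void *map, size_t maplen, int fd, off_t filelen) {
        madvise(map, maplen, MADV_SEQUENTIAL);            /* read ahead */
        posix_fadvise(fd, 0, filelen, POSIX_FADV_SEQUENTIAL);
    }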
Then you try to use mmap on your audio interface, precisely to avoid unnecessary data copies, and you realize you didn't even understand what memory or mmap actually meant at all.
One of the things which most fascinated me about the Mill architecture (don't bother checking for updates, it's as vaporware as ever) was the ability to pass arbitrary memory ranges across "process"/security boundaries, rather than being stuck with page granularity and requiring complex OS assistance. This would make zero-copy much easier to accomplish. Page-based memory protection is holding us back: pages are a clever solution to, well, paging to disk, but a remarkably terrible solution for everything else they're used for.
This reads like someone who's spent a lot of time thinking about the problem and a lot less time working on the problem.
The main gist is: Why is memory shared between two processes?
The answer is: with exactly one producer and one consumer, you can unblock a deadlock by just turning one of them on and off. In all other cases you can't. All the 'solutions' to that problem make assumptions about the state the processes share that are somewhere between difficult and impossible to keep valid in the real world.
In short: 2 is a magic number and it's not the fault of posix.
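To make the "2 is magic" point concrete, here is a hedged sketch of the one case that does compose cleanly: a single producer and a single consumer joined by a pipe, where the kernel's blocking semantics are the flow control, so pausing either side is always safe:

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        int fds[2];
        if (pipe(fds) < 0) return 1;

        if (fork() == 0) {              /* child: the producer */
            close(fds[0]);
            for (int i = 0; i < 1000; i++)
                /* write() blocks when the pipe is full: built-in
                 * backpressure, which only works with one reader. */
                if (write(fds[1], "x", 1) != 1)
                    break;
            close(fds[1]);
            _exit(0);
        }

        close(fds[1]);                  /* parent: the consumer */
        char c;
        long n = 0;
        while (read(fds[0], &c, 1) == 1)   /* blocks until data arrives */
            n++;
        printf("consumed %ld bytes\n", n);
        return 0;
    }

Add a third participant and you have to invent locking, ownership, and liveness rules yourself - exactly the assumptions the parent says are hard to keep valid.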
Alternatives? While I somewhat agree with some points presented in the article, the POSIX model is beautiful in its simplicity. As a good demonstration, just take a look at Zeal 8-bit OS. It has a simplistic, Unix-inspired API [1], which is very elegant, especially considering the fact that the OS targets Z80 CPU with just 64 kB of RAM.
POSIX provides a decent cross-OS API, and while some of it is less than stellar, this article seems pathologically opposed to POSIX on principle rather than for actually valid reasons.
That said, modern OSes do provide additional APIs that are higher level, but in general they are all just opaque object wrappers around the same core principles, just less tethered to the limitations of 80s-era C.
I suggest that you check out WASI - WebAssembly System Interface.
It has a subset of the POSIX file API that has been modified to not have some of its downsides. (TOCTOU problems, in particular)
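The flavor of the fix will look familiar to POSIX programmers: it is essentially the openat() family, where operations resolve relative to an already-open directory handle instead of re-walking a path string, so the target can't be swapped out between check and use. A minimal sketch (the directory and file names are made up):

    #include <fcntl.h>
    #include <unistd.h>

    int open_config(void) {
        /* Open the directory once; the fd now denotes a fixed object,
         * not a re-resolvable path string. */
        int dirfd = open("/etc/myapp", O_RDONLY | O_DIRECTORY);
        if (dirfd < 0) return -1;

        /* Resolve "config" relative to that handle; a symlink race on
         * /etc/myapp itself can no longer redirect the lookup. */
        int fd = openat(dirfd, "config", O_RDONLY | O_NOFOLLOW);
        close(dirfd);
        return fd;
    }

WASI goes further by making such pre-opened directory handles the only way to reach the filesystem at all.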
If I'm reading this right, the author is blaming POSIX (and by implication Unix) for the size of their code?
Given that POSIX is a system programming interface, aimed at no specific use case whatsoever, maybe the issue is that they are using it directly rather than going via some higher-level abstraction - a library or framework that would provide the capabilities they need?
"What if we replace POSIX" is an interesting question. I keep coming back to SerenityOS [0] when I think about this topic, a C++ based (hobby) OS that has a POSIX compatible API but also more modern wrappers around common functionality.
Exposing system APIs through ErrorOr (like Rust's Result or the Maybe monad) resolves an entire class of bugs (take fork/kill/-1 as an example [1]) as well as ambiguity like "do I check the return value for the reason, or do I check the magical errno variable?". Dropping general OS compatibility also allows for other changes, like object-oriented interaction with the GUI system and other system libraries instead of relying on namespacing in the name of a function call.
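SerenityOS's ErrorOr is C++, but the shape of the idea survives even in C as a tagged return that carries either the value or the error, so there is no separate errno to consult - a hypothetical sketch:

    #include <errno.h>
    #include <fcntl.h>

    /* One object carries either the result or the error; the caller
     * cannot read a valid-looking value while ignoring the failure. */
    struct fd_or_error {
        int ok;   /* 1: fd is valid; 0: err is valid */
        int fd;
        int err;
    };

    struct fd_or_error open_checked(const char *path) {
        int fd = open(path, O_RDONLY);
        if (fd < 0)
            return (struct fd_or_error){ .ok = 0, .err = errno };
        return (struct fd_or_error){ .ok = 1, .fd = fd };
    }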
I know SerenityOS won't ever replace any operating system we use today, but it's a nice demonstration of the API features we could all be using.
Sometimes I wish Windows' APIs were more POSIX-like, but Win32 has some pretty useful functions built right into the OS that you'd need to introduce DLL/dependency hell/static compilation for if you're trying to stick to POSIX-only APIs. Think common concepts like "the clipboard" or "the current resolution of a display" without needing to link to a library or access a file like /dev/fb0 (is there a /dev/fb1? when do you use it?).
Mobile platforms (Android, iOS) with special lifecycles also allow for things like "the app resumes where you left off" and "the app suspends to disk when the system is under load" natively, without requiring developers to build their own save/restore state mechanisms. Imagine having to build a system like that on your average Linux desktop with mere POSIX APIs; you'd go crazy with the complexity required.
Perhaps what programs running on Unix-based operating systems really need is a wrapper around POSIX and its low-level implications. GTK and Qt provide many such APIs for free, for example, including networking and other I/O. GTK comes with its annoying particularities (CSD, and all the other GNOME decisions), and Qt licenses are either VERY free or VERY expensive, making them incompatible with other projects.
With the Linux space transitioning from SystemV+ALSA/Pulse+X11 to systemd+PipeWire+Wayland, perhaps the space is ripe for a new, more modern wrapper library for native languages.
[1]: When fork() fails, the call returns a negative number. When you try to kill() a PID of -1, you kill every process your current UID has the permission to kill. Programs failing to fork, shutting down, and killing what they thought was a fork()ed process can accidentally end up killing all the open applications for a user.
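Spelled out as code, a hedged sketch of that failure mode (not any particular program's source):

    #include <signal.h>
    #include <unistd.h>

    void buggy_shutdown(void) {
        pid_t child = fork();     /* returns -1 on failure, unchecked here */
        if (child == 0)
            _exit(0);             /* child exits immediately in this sketch */

        /* If fork() failed, child is -1 and this call is kill(-1, SIGTERM):
         * POSIX defines pid -1 as "every process the caller is permitted
         * to signal". */
        kill(child, SIGTERM);
    }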
1. TL;DR: Modern hardware is very different from a PDP-7
2. ???
3. We should design new APIs with abstractions that represent neither modern nor PDP-era architecture.
I feel like the author should spend more time on 2 to make his case.
Also, the actual answer to the question posed is: you might not think data management is an important part of your job, but it’s actually more important than the analysis, so you spend more time on it.
> We should be able to assume that the data we want exists in main memory without having to keep telling the system to load more of it.
This already exists in POSIX. It's called mmap, and approximately no one uses it. Maybe the article should have explored why, and suggested how to solve that problem, instead of insisting we must throw everything away and this will magically solve our problems.
Doesn't make sense! In short, the author wants to load all data into main memory and work on it. That is possible if the data is small enough and main memory is big enough. Often it is not, and therefore programmers - with enough time and experience - use paging and caching.
The actual problem in the described organization is their data organization, not UNIX.
"The Socket API, local IPC, and shared memory pretty much assume two programs"
There's no truth to this whatsoever - it's an utterly absurd statement because:
* The socket API as applied to networking is empirically designed for arbitrary numbers of peers -- there are unlimited real world examples. The socket API as applied to local IPC is of course similarly capable.
* Shared memory - mmap() is about as generic of a memory sharing interface as one could hope for. It certainly works just fine with arbitrary numbers of clients. There are again countless examples of using shared memory with arbitrary participants -- for example all the various local file databases: sqlite, leveldb, berkeley db, etc.
"We should be able to assume that the data we want exists in main memory without having to keep telling the system to load more of it."
Yes, this is exactly what mmap() already does. mmap() is part of POSIX. Use it!
"Kode Vicious, known to mere mortals as George V. Neville-Neil,"
Oh, ok.