> Calls which have no limitations, of which there are only a very few, for example zx_clock_get() and zx_nanosleep() may be called by any thread.
Having the clock be an ambient authority leaves the system open to easy timing attacks via implicit covert channels. I'm glad these kinds of timing attacks have gotten more attention with Spectre and Meltdown. Capability security folks have been pointing these out for decades.
> Calls which create new Objects but do not take a Handle, such as zx_event_create() and zx_channel_create(). Access to these (and limitations upon them) is controlled by the Job in which the calling Process is contained.
I'm hesitant to endorse any system calls with ambient authority, even if it's scoped by context like these. It's far too easy to introduce subtle vulnerabilities. For instance, these calls seem to permit a Confused Deputy attack as long as two processes are running in the same Job.
Other notes on the kernel:
* The focus on handles overall is good though. Some capability security lessons have finally seeped into common knowledge!
* I'm not sure why they went with C++. You shouldn't need dispatching or template metaprogramming in a microkernel, as code reuse is minimal since all primitives are supposed to be orthogonal to each other. That's the whole point of a microkernel. Shapiro learned this from building the early versions of EROS in C++, then switching to C. C also has modelling and formal analysis tools, like Frama-C.
* I don't see any reification of scheduling as a handle or an object. Perhaps they haven't gotten that far.
Looks like they'll also support private namespacing à la Plan 9, which is great. I hope we can get a robust OS to replace existing antiquated systems with Google's resources. This looks like a good start.
C++ has far more to offer over C than just template metaprogramming.
Basic memory management and error handling, for example, are radically easier and less error prone in C++ than in C. Less reliance on macros and gotos should be pretty obvious wins.
There's really very little reason to ever use C over C++ with modern toolchains.
> Basic memory management and error handling, for example, are radically easier and less error prone in C++ than in C.
Microkernels don't need memory management. Dynamic memory management in a kernel is a denial-of-service attack vector. Fuchsia is built on a microkernel, so I expect they will follow the property of every microkernel since the mid-90s: no dynamic memory allocation in the kernel; all memory needed is allocated at boot.
Furthermore, you don't want exceptions in kernel code. That carries huge and surprising runtime execution and space costs.
Simply put, there is no reason to choose C++ for a microkernel, and many, many reasons not to.
Of course they do. It takes memory to hold metadata about a process. It takes memory to hold resources about other services. It takes memory to pass data between them.
Just because that memory is reserved at boot doesn't mean it suddenly has no lifecycle of any kind.
> Furthermore, you don't want exceptions in kernel code.
Nobody said anything about C++ throw/catch exceptions.
> Simply put, there is no reason to choose C++ for a microkernel, and many, many reasons not to.
If you want to avoid C++ that's great, but to argue for C over it is insanity rooted in nostalgia.
> Just because that memory is reserved at boot doesn't mean it suddenly has no lifecycle of any kind.
Yes it does. The "lifecycle" is: allocate at boot, machine halts.
All of the memory you describe for other purposes is allocated at user level and booked to processes. This is how you make a kernel immune to DoS.
> Nobody said anything about C++ throw/catch exceptions.
That's the only meaningful difference in error handling between C and C++. Since you mentioned error handling as a reason to choose C++, what else could you possibly mean?
> If you want to avoid C++ that's great, but to argue for C over it is insanity rooted in nostalgia.
Sure, you keep believing that. It's clear you're not familiar with microkernel design. The advantages C++ has for application-level programming are useless in this domain.
Linear types don't "show" promise, they solve the issue, and this has been known since linear logic was popularized by Wadler[1] and Baker[2] in the early 1990s. The problem is that programming with linear logic is very inconvenient for a lot of things, and very inefficient when you actually want to share data.
I understand RAII as a resource management solution. What use does RAII have in error handling? It makes things convenient, but it does not make error handling go away.
No, but exceptions aren't the only way to handle errors in C++.
There are also library types that enforce checking for errors, something currently impossible in C.
Also, thanks to its stronger type system, it is possible to do type-driven programming, preventing many errors from happening at all, which is also not possible in plain C.
Finally, everyone is moving to C++; C for OS development is stuck with UNIX and with embedded devs who wouldn't use anything else even at gunpoint.
> There are also library types that enforce checking for errors, something currently impossible in C.
This is a weak form of checking for a kernel. L4 Hazelnut was written in C++ for this reason, but they didn't use it much, mirroring Shapiro's experience with EROS. And when they had to revise the kernel design to plug security holes and wanted to formally verify its properties, they switched to C because C++ was too complex and left too much behaviour unspecified, and thus we got verified seL4 written in C.
Only if we are speaking about enterprise CRUD apps that used to be done in MFC, OWL, VCL, Motif++.
OpenCL lost to CUDA because it did not natively support C++ until it was too late.
NVidia has designed Volta specifically to run CUDA C++ code.
There is no C left on game consoles SDKs or major middleware engines.
All major C compilers are written in C++.
Microsoft has replaced their C runtime library with one written in C++, exposing the entry points as extern "C".
Beyond the Linux kernel, all native parts on Android are written in C++.
The majority of deep learning APIs being used from Python, R and friends are written in C++.
Darwin uses a C++ subset in IOKit, and Metal shaders are C++14.
AUTOSAR has updated their guidelines to use C++14 instead of C.
All major modern research OSes are being done in C++, like Genode.
Arduino Wiring and ARM mbed are written in C++.
As for Rust, while I do like it a lot, it still cannot compete with C++ in many key areas, like amount of supported hardware, GUI frameworks and available tooling.
> All of the memory you describe for other purposes is allocated at user level and booked to processes.
No, they aren't. A microkernel is responsible for basic thread management and IPC, both of which are highly dynamic in nature.
You seem to be confusing the system that decides when to make a scheduling decision (userspace process - although still part of the microkernel project, so still included in all this anyway), with the system that actually executes that decision (the microkernel itself). And in the case of systems like QNX the kernel will even do its own decisions independent of the scheduler service, such as switching the active thread on MsgSend.
But whether or not it's in ring0 or ring3 is independent of whether or not it's part of a microkernel. A microkernel delegates responsibility to ring3 processes, but those processes are part of the microkernel system - they are in fact a very critical aspect of any microkernel project, as without them you end up building a bootloader with aspirations of something bigger than a kernel.
> A microkernel delegates responsibility to ring3 processes, but those processes are part of the microkernel system
I disagree. Certainly you won't get a usable system without some core services, but the fact that you can replace these services with your own as long as you satisfy the protocol means there's a strong isolation boundary separating them from the kernel. They are certainly essential components of the OS, just not of the kernel.
As for the alleged dynamism of thread management and IPC, I don't see how it's relevant. There exist asynchronous/non-blocking IPC microkernel designs like VSTa and Minix in which the kernel allocates and manages storage for asynchronous message sends, but it's long since proven that such designs are hopelessly insecure. At the very least, it's trivial to DoS such a system.
Only with bounded message sends and send/receive buffers provided by the processes themselves can you avoid this. If the idea with Fuchsia is to reimagine consumer operating systems, avoiding the same old mistakes seems like a good idea.
As for scheduling, that's typically part of the kernel logic, not a user-space process. Yes, message sends can donate time slices/migrate threads, but there are priority-inversion problems if you don't do this right, as L4 found out and avoided in the seL4 redesign. I honestly don't know why Google didn't just use or acquire seL4 for Fuchsia.
how about we argue the impossibility of most people ever being able to understand what's going on in C++ code (even their own code) and the cataclysmic consequences of using an over convoluted language? I mean there is a reason why the original pioneers of C don't use C++. (i mean other than the fact that dmr is dead)
On the other hand, large C code bases are a special kind of hell, lack of namespaces and user-defined types makes it difficult to understand, modify and test them.
> On the other hand, large C code bases are a special kind of hell, lack of namespaces
Can you please name a project that you have worked on where you have run into problems because everything was in a single namespace? What was the problem, how did you run into it, and how did you resolve it?
There are a lot of advantages to namespaces. I used to believe that single-namespace languages would cause problems for large software, but working with Emacs (huge single namespace with all the libraries loaded into memory at once, so much worse than C, where you only link a subset of libraries), this problem has not surfaced. I mean literally the only difference is that ".", or whatever the language-enforced namespace accessor is, goes from being special syntactically, to being a convention. When you start to think about namespaces as trees, this makes more sense. Namespaces just push naming conflicts to the parent node. There is no magic that is going to solve conflicts or structure things well or give things clear names. All that is up to the programmer.
I understand my code - the language doesn't dictate the understandability of the code that is written. Any language can be used to write indecipherable bad code. You are blaming the wrong thing. C++ seems to be very widely used to write some amazing things, despite your apparent hatred of it?
In my view C++ is a very complex language that only few people can write safely and productively.
When you say "I understand my code" I have to believe you. The problem is that understanding other people's C++ code takes ages, even if they don't abuse the language. Trusting their code is another story entirely.
C++ is a very flexible language in that it puts few restrictions on redefining the meaning of any particular syntactic expression.
That's great, but it also means that there is a lot of non-local information that you have to be aware of in order to understand what any particular piece of code actually does.
I'm not surprised that C++ is both loved and hated and perhaps even more often simply accepted as the only practical choice.
There aren't many widely used languages around that allow us to optimize our code almost without limit and at the same time provide powerful abstraction facilities.
At the same time, there aren't many widely used languages around that make reading other people's code as difficult as C++ (even well written code) and come with a comparably long tail of accumulated historical baggage.
Yes, universal references take a while to understand. I read Scott Meyers's book, and the chapter dedicated to them took some getting used to, and note taking.
The language is dealing with some tricky concepts. To hide them or try to gloss over them would lead to virtual machines and bloated memory usage, etc., in the style of C#/Java.
How else would you deal with movement of variables and when an rvalue becomes an lvalue inside a function?
Most (I hesitate to say all) programmers understand their own code. The problem is usually that nobody else understands that code you wrote.
> Any language can be used to write indecipherable bad code. You are blaming the wrong thing.
Some languages allow stupid things. Some even encourage it. So, no, languages can and should be blamed.
I have to maintain other people's code, people who have left the company and not commented it. It is horrible to do, but it is possible. It's even better if they wrote it in a logical way.
So goto spaghetti is understandable? And dropping those isn't an argument, since proper C++ usage also implies agreeing on a proper subset of the language to use. Modern C++ with sane restrictions is much easier to understand, especially w.r.t. resource ownership and lifetimes (as pointed out).
I'm not going to argue that one language is better than another, but I do honestly get sick of all this "goto" bashing that often rears its head. Like all programming constructs, goto can be ugly when it is misused. But there are times when I've simplified code and made it far more readable by stripping out multiple lines of structured code and replacing them with a single goto.
So if you're going to argue in favour of C++ with the caveat of good developer practices, then you must also make the same caveat for C (i.e. you cannot play the "goto spaghetti" card), otherwise you're just intentionally skewing your comparison to win a pointless internet argument.
No, I would never argue for C++. The reason being mostly its toolsets (constantly changing, unstable and often incoherent). I just don't think readability is an argument - and I am as sick of (pointless) arguments against C++'s readability as you are of goto arguments :) Edit: Just to be clear - there are actual arguments against C's readability. For example, figuring out where and when data gets deleted - but as others have pointed out, dynamic memory management is a whole different beast in kernel wonderland.
The rest of it could hide any number of dragons (and be written in any kind of legacy, nightmarish, and/or proprietary tools and languages), so it's not much of a proof of widespread bad C "goto" abuse.
Let's make a better criterion: how many of the top 200 C projects in GitHub suffer from "spaghetti goto" abuse? How many of all the C projects in GitHub?
Enterprise software is much more than just desktop CRUD applications.
For example, iOS applications, portable code between Android and iOS, distribution tracking, factory automation, life-science devices, big data and graphics are just a small sample of the areas where C and C++ get used a lot.
Sometimes it says C++ on the tin, but when one opens it, it is actually the flavour I call "C with C++ compiler".
Github is not representative of enterprise code quality.
Your argument about enterprise code cannot be verified since we can't have access to it. Also, the sample of enterprise code you have access to is probably limited and thus most likely biased. Doesn't seem like a very good general argument, but maybe it is a good one for your own individual situation, if we are to believe your word.
You should say the same to coldtea, the person asserting that there are only 10 enterprise projects written in the C language and that there's no goto spaghetti in C language programs.
> If you want to avoid C++ that's great, but to argue for C over it is insanity rooted in nostalgia.
Did you know that code in C++ can run outside of main()?
I used to be a C++ believer, and advocated for C++ over our company's use of Java.
One day, they decided they wanted to "optimize" the build, by compiling and linking objects in alphabetical order. The compile and link worked great, the program crashed when it ran. I was brought in to figure it out.
It turned out to be the C++ "static initialization order fiasco":
If you've ever seen it, C++ crashes before main(). Why? Because ctors for statics are run before main(), and some run before other statics they depend on have been constructed.
Changing the linking order of the binary objects fixed it. Remember nothing else failed. No compiler or linker errors/warnings at the time, no nothing. But one was a valid C++ program and one was not.
You might think that is inflammatory, but I considered that behavior insane, because main() hadn't yet even run, and the program cored leaving me with trying to figure out what went wrong.
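For anyone who hasn't hit it, here's a minimal two-file sketch of the kind of thing that bit us (file and variable names are made up; whether it works or crashes depends purely on which object's static initializers the linker runs first):

    // a.cpp -- initializes 'message' from a static defined in another file
    #include <string>
    extern std::string greeting;                      // lives in b.cpp
    std::string message = "prefix: " + greeting;      // undefined behaviour if 'greeting'
                                                      // has not been constructed yet

    // b.cpp
    #include <string>
    std::string greeting = "hello";

    // main.cpp
    #include <iostream>
    #include <string>
    extern std::string message;
    int main() { std::cout << message << "\n"; }      // may crash before main() even runs

Swap the order of a.o and b.o on the link line and the behaviour flips, with no diagnostics from the compiler or linker.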
>> Furthermore, you don't want exceptions in kernel code.
>Nobody said anything about C++ throw/catch exceptions.
I'd like to add that if you're finding yourself restricting primary language features (e.g. templates, static ctors, operator overloading, etc.) because the implementation of those features is bad, maybe that language is the wrong choice for the project you're working on.
After I read the C++ FAQ lite [1] and the C++ FQA [2], I realized the determinism that C provides is kind of a beautiful thing. And yes. For a kernel, I'd argue C over C++ for that reason.
Well, if your main argument against C++ is the undefined order of static initialization and that it caught you by surprise, then I'd counter that by saying that you do not know the language very well. This is very well-known behaviour.
I think that there are stronger arguments against C++: the continued presence of the complete C preprocessor restricting the effectiveness of automatic refactoring, the sometimes extremely cumbersome template syntax, SFINAE as a feature, no modules (yet!)...
Still, C++ hits a sweet spot between allowing nasty hardware-related programming hacks and useful abstractions in the program design.
I really do not have time to read a document like that to figure out whether or not that behavior is spelled out in the standard. So yes, I'll let you be the expert on this.
Sorry if the statement offended you. It came from the experience that I so far haven't encountered anyone who seriously uses C++ and does not know about the undefined order of static initialization. Also, I haven't yet had a situation where this was a big deal.
There are worse pitfalls than unstable order with static initializers specifically. If you dynamically load shared libraries at runtime on Linux, you risk static initializers being run multiple times for the same library. This is platform-specific behavior that is AFAIK present on other UNIX systems as well, and I'm certain that you won't find it in the standard.
> Sorry if the statement offended you. It came from the experience that I so far haven't encountered anyone who seriously uses C++ and does not know about the undefined order of static initialization.
Water under the bridge.
While I did say I was brought in to fix it, what I didn't say was that the group's management thought that Java coders could code in C++. D'oh.
Well, let me tell you that C suffers from the same issue of running code outside main().
It is funny how many issues people blame on C++ that are usually inherited from C semantics compatibility, or existing C extensions at the time C++ started to get adopted.
No no no no no. This is a C++ problem. As much as you want to blame this particular problem on C, C handles this the right way.
Let's try to write the equivalent of this error in C:
    int myfunc() {
        return 5;
    }
    static int x;
    static int y = 4;
    static int z = myfunc();
    int main()
    {};
Compiling that in gcc gives me:
    main.c:8:16: error: initializer element is not constant
     static int z = myfunc();
                    ^~~~~~
And it makes sense, C just wants to do a memcpy() to initialize the static variables. In C++, the only way the class is initialized is if the ctor is run. And that means running the ctor before main().
Edited to add:
You're correct that 5.1.2 does not specify memcpy() as a form of initialization. But see my reply below about C11 and static storage classes.
Now try that with other C compilers as well, or without using static.
Also add common C extensions into the soup like constructor function attributes.
Finally ISO/IEC 9899:2011, section 5.1.2.
"All objects with static storage duration shall be initialized (set to their
initial values) before program startup. The manner and timing of such initialization are
otherwise unspecified. Program termination returns control to the execution
environment."
Don't mix what your compiler does, with what the standard requires.
Doing so only leads to unportable programs and misunderstandings.
The C11 standard is clear here on syntax for a static initializer.
Read section 6.7.9, constraint 4:
    All the expressions in an initializer for an object that has static or thread storage duration shall be constant expressions or string literals.
It's syntax, not initialization.
And that makes sense. However, how the memory is initialized before runtime is unspecified: it could be via memcpy(), or it could be loaded as part of the executable and then mapped dynamically at runtime. That's what 5.1.2 is saying.
What 6.7.9 constraint 4 is saying is that static variables can only be initialized with constant expressions.
If you think C code doesn't run before main() you're very naive. Just try this:
    #include <stdio.h>

    static volatile int* x;
    static int y = 42;

    void __attribute__((constructor)) foo() {
        printf("[foo] x = %p\n", x);
    }

    int main() {
        x = &y;
        printf("[main] x = %p\n", x);
        return 0;
    }
And before you complain about that being a compiler extension yes, it is, but it's also not rare, either, and you're probably using C libraries that do this.
>And before you complain about that being a compiler extension yes, it is, but it's also not rare, either, and you're probably using C libraries that do this.
e.g. All Linux kernel modules use this for initialising static structs for interfacing with the kernel.
It's a compiler "hack" for shared libraries, because there is no other way to run initialization for elf objects. [1] The C standard doesn't allow it. And gcc forces you to be explicit about it.
If one isn’t careful, you end up with interdependency between static initializers. Since the order of static initialization is undefined, you get fun bugs like your program crashing because a minor change caused the objects to link in a different order.
For example, the dreaded singleton using a static variable inside a function:
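Roughly this shape (names made up for illustration; construction happens safely on first use, but the destruction order of the instance relative to statics in other translation units is still link-order dependent, which is one place this bites):

    #include <map>
    #include <string>

    // "Singleton using a static variable inside a function"
    std::map<std::string, int>& registry() {
        static std::map<std::string, int> instance;   // constructed on first call
        return instance;
    }

    // In some other translation unit, a namespace-scope static whose
    // initializer runs before main() and reaches into the singleton:
    static bool plugin_registered = (registry().emplace("some_plugin", 1), true);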
Having a couple of those referenced in static initializers is a recipe for disaster. It’s a bad practice, but people unfortunately do it all the time. Those that do this are equally unequipped to realize why their program suddenly started crashing after an innocuous change.
This made me think of segmentation faults caused by stack overflow due to allocating an array with too many elements on the stack, which is also "fun" to debug until you learn about that class of problems.
The Singleton pattern can be used to fix the order of static constructors. I think that this is the only reasonable use for the singleton pattern (which is just a global variable in disguise).
In my opinion, it's better to not rely on static constructors for anything non-trivial (singleton or not). They can be such a pain in the ass to debug.
I don't really like C++ and I haven't been forced to use it (with C and C++ you are basically forced to use them; few people use them for greenfield projects willingly - same for JavaScript; this of course doesn't apply to people who are already C/C++/JavaScript programmers), but from everything I've seen about modern C++ they are moving to a more consistent programming style.
Criticizing C++ in 2018 with arguments from back in 1993 feels dishonest.
"The problem that I have with them today is that... C++ is too
complicated. At the moment, it's impossible for me to write portable
code that I believe would work on lots of different systems, unless I
avoid all exotic features. Whenever the C++ language designers had two
competing ideas as to how they should solve some problem, they said
"OK, we'll do them both". So the language is too baroque for my taste.
But each user of C++ has a favorite subset, and that's fine."
In fact, that's even more true now than it was back then. E.g. I eagerly anticipated C++11, but virtually every codebase that's older than three or four years and not a hobby project is now a mixture of modules that use C++11 features like unique_ptr and modules that don't. Debugging code without smart pointer semantics sucked, but debugging code that has both smart pointer semantics and raw pointers sucks even harder.
There's a huge chasm between how a language is standardized and how it's used in real life, in non-trivial projects that have to be maintained for years, or even decades.
I am currently on a team maintaining a giant codebase and migrating to C++11 (and beyond) for a new compiler. We do not have issues with the deprecation of auto_ptr, the use of raw pointers, or debugging general COM problems. The code base is 20 years old and we do not complain about debugging it.
Debugging pointers seems a poor reason to criticize an entire language!
C++ may be complicated but the English language is also complicated; just because people tend to use a smaller vocabulary than others doesn't make the language irrelevant or worthless.
Looking at how English has been used to create a raft of rich and diverse poetry, plays, prose and literature in general, the same should be applied to C++ because the unique use of it in a variety of varying circumstances surely is its beauty.
> Looking at how English has been used to create a raft of rich and diverse poetry, plays, prose and literature in general, the same should be applied to C++ because the unique use of it in a variety of varying circumstances surely is its beauty.
I don't think this is a valid argument, though. Natural languages have to be rich. Programming languages should be terse and concise because we have to keep most of them in our heads at one time and our brain capacity is limited. You don't need to know all of English/French/Romanian but you kind of need to know all of C++/Python/Javascript to do your job well when developing C++/Python/Javascript.
I think the C++ designers lately kind of agree with me but the backward compatibility requirements are really stringent and they can't just deprecate a lot of the older features.
That was obviously (I hope?) just one example. C++ has a huge set of overlapping features, some of which have been introduced as a better alternative of older features. Their interaction is extremely complex. It's great that your team manages to steer a large, old codebase without trouble, but most of the ones I've seen can't, and this complexity is part of why they can't.
Looking at contrived legal texts, which are a better comparison to code than poetry is, I don't agree. I'm not even sure there is such a thing as "the" English language.
Legalese uses a ton of Latin idioms, arcane rights and philosophies. This is comparable to the cruft of the C or C++ standards. For a microkernel of some thousand LOC you shouldn't need a multi-paradigm language.
seL4 did it in Haskell, which is a step in the right direction. Then it was ported to a provably safe subset of C.
A large chunk of his argument doesn't hold at all. This:
"At the moment, it's impossible for me to write portable code that I believe would work on lots of different systems, unless I avoid all exotic features."
Is just not remotely true anymore. Modern toolchains entirely obsoleted that. Modern C++ compilers are nothing like what Knuth used in 1993.
If anything it's easier to write portable C++ than it is portable C due to C++'s STL increasingly covering much of the POSIX space these days.
“Criticizing C++ in 2018 with arguments from back in 1993 feels dishonest.”
That statement itself seems intellectually dishonest. What has changed that invalidates his arguments? After all, C++17 is still backwards compatible to the C++ of 1993.
Pardon me for finding this humorous, but stating that I can’t use a Donald Knuth quote in a computer science topic because it’s old is like saying I can’t quote Sun Tzu when talking about modern events because the Art of War is an old book.
Donald Knuth is an amazing person, but I'm not sure he's necessarily the same authority in a discussion about industrial programming languages as he is in a discussion about computer science.
So to change your analogy, it would be like quoting Sun Tzu about the disadvantages of modern main battle tanks, using the Art of War. Sure, the principles in the Art of War are solid, but are we sure that they really apply to a discussion about Leopard 2 vs M1 Abrams?
That said, I'm not a fan of C++ either. I think its problems are intractable because they'd have to break backwards compatibility to clean the language and I'm not sure they can do that, unless they want to "Perl 6"-it (aka kill the language).
Fair enough, but I still wouldn't disregard Sun Tzu or Donald Knuth as making arguments comprised of "insanity rooted in nostalgia." That was my primary point.
In any event, Knuth specifically made statements dismissive of C++ 25 years ago that I believe are still valid today. I must have missed reading Sun Tzu's missive on mechanized warfare from the 6th century BC. ;)
Indeed, we can both agree on the backwards compatibility problem. I'm waiting on a C++ build as I type this. Also, I really like the new language features like std::unique_ptr, std::function and lambdas.
I'd still rather do my true low-level programming in C bound with a garbage-collected higher-level language for less hardware-focused or performance-critical work instead of bolting those features on to C by committee over the span of decades. For example, C shared libraries with Lua bindings or LuaJIT FFI are bliss in my humble opinion.
I don't think it is a good blog post. He first criticises exception handling as undefined behavior, which it is certainly not, and then criticises exception handling in general because it decouples error raising from error handling. This is the whole point of exception handling, because exceptions should be used for non-local errors. Most of the "errors" handled in Martin's projects ZeroMQ and Nanomsg (which are both great libraries btw!) should not be handled as exceptions, as they are not unexpected values but rather states that have to be handled. Here, he uses the wrong tool for the job and criticises the tool.
He then criticises exceptions thrown in constructors and favors a init-function style. I never had any problem with this because I follow the rule that there shouldn't be much code in the constructor. The one and only task of a constructor is to establish the object's invariant. If that is not possible, then the object is not usable and the caller needs to react and shall not use the object.
In the second series, he first compares apples (intrusive containers) and oranges (non-intrusive containers), and then argues that the language forces him to design his software that way. Basically he argues that encapsulation makes it impossible in his case to write efficient code, and that you have to sacrifice it for performance.
However, with C++, you can extract the property of being an object in an intrusive list into a re-usable component, e.g. a mix-in, and then use your intrusive list with all other types. I can't do this in C in a type-safe manner, or I have to modify the structs to contain pointers, but why should they have anything to do with a container at all?
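A rough sketch of such a mix-in (illustrative names, not ZeroMQ's or anyone's actual code):

    // Mix in list membership by inheriting: struct Msg : ListNode<Msg> { ... };
    template <typename T>
    struct ListNode {
        T* next = nullptr;
        T* prev = nullptr;
    };

    // Works for any T that derives from ListNode<T>; no per-type duplication,
    // no void* casts, and no allocation: the links live inside the element.
    template <typename T>
    class IntrusiveList {
    public:
        void push_front(T* n) {
            n->next = head_;
            n->prev = nullptr;
            if (head_) head_->prev = n;
            head_ = n;
        }
        void remove(T* n) {
            if (n->prev) n->prev->next = n->next; else head_ = n->next;
            if (n->next) n->next->prev = n->prev;
            n->next = n->prev = nullptr;
        }
        T* front() const { return head_; }
    private:
        T* head_ = nullptr;
    };

    // Usage:
    //   struct Message : ListNode<Message> { int payload; };
    //   IntrusiveList<Message> queue;

The containing type only opts in by inheriting from ListNode<T>; the list itself is written once and type-checked for every element type.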
Besides that, I think that Martin is a great programmer who did an amazing job with ZeroMQ. But I have the impression that he is wrong in this case.
No it's not, they confuse "undefined behavior" with "hard to analyse behavior" for starters. Exceptions are not UB, but the control flow is totally not obvious.
If I were to start a project today, I'd rely heavily on optional and result types and use exceptions only for serious errors, when it makes sense to unwind and start from a clean slate.
What are you talking about... Why would you write a kernel in C++ instead of C? You want fine grained control over what the machine is doing. Imagine dealing with all that bullshit C++ comes with when trying to write a kernel. And then you’re trying to figure out if this C++ compiler you need for this architecture supports dark matter backwards recursive template types but it only supports up to C++ 76 and you’re just like fuck my life
"Orthodox C++" looks more like writing C-code and keeping the C programming style. Most of the points are really questionable.
- "In general case code should be readable to anyone who is familiar with C language". Why should it? C++ is a different language than C. I speak german, but I cannot read e.g. French unless I learned it even though they are related.
- "Don't use exceptions". Why? There is no performance penalty in the non-except case, and the exception case should be rare because it is exceptional. I can see arguments for real-time systems, and for embedded systems were code size matters. The alternative is C-style return codes and output parameters. Exceptions are better in that case because you cannot just go on after an error condition, and functions with output parameters are harder to reason about because they loose referential transparancy. Of course, in modern C++ one could use optional or expected types.
- "Don't use RTTI". I never needed RTTI in my professional life.
- "Don't use C++ runtime wrapper for C runtime includes". C++ wrappers have some benefits over the C headers. They put everything in namespace std, so you don't need to use stupid prefixes to prevent name clases, and they define overloads for some of the C functions, e.g. std::abs(int) and std::abs(long) instead of abs(int) and labs(long).
- "Don't use stream, use printf style functions instead". If this means to use a type-safe printf variant I could agree to some point, although custom operator(<<|>>) for custom types are sometimes nice. If it means to use C printf etc I would strongly object.
- "Don't use anything from STL that allocates memory, unless you don't care about memory management". You can use allocators to use e.g. pre-allocated storage. The STL also contains more than containers, why would you not use e.g. the algorithms and implement them yourself?
Well that's a shame! Unless we're missing something in the control path leading to these points, this kernel has trivially exploitable DoS problems. Creating threads doesn't require a capability, so at least the thread allocation is a clear DoS.
Let’s say I take your assertions at face value. Supposedly qualified, intelligent engineers who were no doubt aware of these points still decided C++ was an appropriate language to implement this microkernel.
> Supposedly qualified, intelligent engineers who were no doubt aware of these points
Are they aware of them? That's not so clear. There's a lot of literature in microkernel design. Lots of things have been tried which sound good but haven't worked out, and some things didn't work out well before would work well now. As usual, security also rarely gets the attention it deserves and it's doubly important at the kernel level.
That would be the principle of charity in action. I would hope those who disagree would approach disagreements the same way, and one or both of us will learn the truth. Should I instead assume that people who disagree with me are idiots or evil?
The principle of charity would have you assume there do exist good reasons to write a microkernel in C++. Perhaps not ones that you would agree outweigh the downsides, but at the very least I suspect there’s some argument to be made in favor that isn’t simply staggering ignorance.
Wouldn't it be easy enough to assume that c++ has the necessary features, performance to do the job and the authors were presumably very familiar with c++ and decided existing expertise outweighed perceived advantages of other options?
> The principle of charity would have you assume there do exist good reasons to write a microkernel in C++
Except this was tried. More than once for different kernels (EROS and L4 at least), and they regretted it and then switched back to C. Both projects made many excellent arguments against C++ in a microkernel that haven't suddenly disappeared in modern C++.
So I think I am being charitable in this case because the weight of evidence suggests otherwise. It's charitable to assume that the developers are well meaning but aren't familiar with the history. This isn't staggering ignorance, just ordinary run of the mill ignorance.
Can you post a reference to the arguments? Most arguments against C++ are very dated, and sometimes come from people whose experience is mostly as a C programmer.
That might be tough for Shapiro's argument. The EROS site is no longer available, the mailing lists are no longer available either, and citeseer's Google results aren't working at the moment. Shapiro mentions a few issues in this paper which mirror the L4 arguments below [1].
For L4, there's brief mention here of the VFiasco project which attempted a verified L4 using C++, which failed despite considerable effort [2].
[3] is perhaps a better review of what worked and what didn't work in L4 research, and they explicitly discuss the issues, such as the fact that C++ conveyed no real advantages over C, the extra complexity of C++ made verification intractable (even for a subset), and practically, the availability of good C++ compilers for embedded systems was limited.
part of making a new operating system could be getting to muck with the language features ...
... i suppose this project is locked into G-standard C++ and with lots of good reasons (e.g. toolchain). although i'm an anti-C++ person, i suppose the sorts of tools available internal to Alphabet make it much more manageable.
anyway, for an example of the original premise: i'm slowly learning some things about plan 9 C via 9front.. there are some departures from ANSI C, including additional features like (completely undocumented, except for the relatively compact source) operator overloading. equally important for the type of person that finds C++ too busy, some features (e.g. certain preprocessor features) are removed.
I can see points against exceptions, but generally RAII has nothing to do with exceptions, so what is the point against this?
It makes managing any resource with acquire/release semantics very easy. It also prevents errors when the code is modified later because you cannot forget to call cleanup code as it is done automatically. I have no experience with kernel programming, but acquire/release seems to be something that is done in the kernel.
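A minimal sketch of what I mean, wrapping a hypothetical C-style lock API (the spinlock_* names are assumptions for illustration, not any real kernel's API):

    struct spinlock;                               // hypothetical C-style lock
    extern "C" void spinlock_acquire(spinlock*);   // assumed API, for illustration only
    extern "C" void spinlock_release(spinlock*);

    class SpinGuard {
    public:
        explicit SpinGuard(spinlock* l) : lock_(l) { spinlock_acquire(lock_); }
        ~SpinGuard() { spinlock_release(lock_); }  // runs on every exit path
        SpinGuard(const SpinGuard&) = delete;
        SpinGuard& operator=(const SpinGuard&) = delete;
    private:
        spinlock* lock_;
    };

    // void update(spinlock* l, int& counter) {
    //     SpinGuard guard(l);                     // you cannot forget the release:
    //     ++counter;                              // it happens when 'guard' goes out of scope
    // }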
> It makes managing any resource with acquire/release semantics very easy.
Agreed. But microkernels generally don't acquire or release resources. Most people arguing against this point seem to have a monolithic kernel mindset, but microkernels are a whole different beast.
If a kernel owns any kind of resources, that leaves the whole system vulnerable to denial of service attacks. Therefore, microkernels have long since adopted designs where the kernel does not own or allocate anything, and all resources belong to processes (which incidentally makes identifying misbehaving processes easy, something not easy on UNIX kernels).
Any data the kernel requires for its operation is allocated at boot time and lives until the system halts.
> There's really very little reason to ever use C over C++ with modern toolchains.
- most C code compiles orders of magnitude faster than most C++ code (unless the C++ code avoids templates and the C++ stdlib)
- you need C headers anyway for the public API, because the C++ ABI is not standardized (and a mess); trying to do shared libraries with C++ results in abominations like COM
- about half of features that C++ adds on top of C only make sense if you do OOP and encourage bad practices
- C is a much simpler and more "complete" language; there's less room for debate about coding style, what C++ subset to use, etc. (that's one thing that Go got right).
Eric S. Raymond's article is indeed interesting, but it doesn't contain a lot of real arguments. I find most of them to be anecdotes and they are not very convincing. The most convincing one is that people who are not proficient in C++ write code with horrible errors, and that is because the language contains so many subtle (and obvious) ways to shoot yourself in the head.
Most of the problems C++ has come from being backward-compatible with outdated language features, namely the C subset. Even the problems with the toolchain are more or less inherited from the C compile-link model and its use of the preprocessor as a "module" system.
If you use the language as intended, e.g. as in the C++ Core Guidelines, you will see a very nice language emerge, one which enables you to write very efficient and elegant code, sometimes doing things that C cannot do, such as expression templates.
"If you use the language as intended, e.g. in the C++ core guidelines, you will se very nice language emerging which enables to write very efficient and elegant code, sometimes doing things that C cannot do, such as expression templates."
JavaScript is also an ongoing effort to extract and evolve a good working language out of a mass of features. It's obviously doable, but not easy, and there are a lot of problems in practice.
> > Calls which have no limitations, of which there are only a very few, for example zx_clock_get() and zx_nanosleep() may be called by any thread.
>
> Having the clock be an ambient authority leaves the system open to easy timing attacks via implicit covert channels. I'm glad these kinds of timing attacks have gotten more attention with Spectre and Meltdown. Capability security folks have been pointing these out for decades.
Is there a way that these could be mediated by a capability without having to incur syscall overhead? One of the reasons that these are bare functions is likely that they are in the vDSO, and are just simple function calls which can access some shared memory which contains the clock time. I suppose you could simply not give some processes access to that memory, and have the functions in the vDSO just return an error in that case.
I know that there was a time when the Linux kernel changed how their vDSO handling worked, so older glibcs would have to fall back to making an actual syscall for gettimeofday, and that seriously affected performance on some servers that updated the kernel without updating glibc. These functions are called quite often on servers for logging purposes, so adding overhead to make them go through a syscall can be a big performance hit.
> I'm hesitant to endorse any system calls with ambient authority, even if it's scoped by context like these. It's far too easy to introduce subtle vulnerabilities. For instance, these calls seem to permit a Confused Deputy attack as long as two processes are running in the same Job.
Timing in particular is tricky. On Linux, you could use the vDSO for high-resolution timing but, from an attack perspective, it’s a red herring. Any serious attacker would use RDTSC, RDPMC, threads and shared memory, or some other hardware mechanism. On x86, RDTSC and RDPMC are controllable by the scheduler (there are bits to turn them off), but it doesn’t really fit in a capability model.
> you could use the vDSO for high-resolution timing but, from an attack perspective, it’s a red herring. Any serious attacker would use RDTSC,
Good point. I was going to say that in general the vDSO is just rdtsc + an offset applied. However, they usually insert a barrier before it. That may or may not be helpful, so I would probably still use rdtsc by itself.
> Is there a way that these could be mediated by a capability without having to incur syscall overhead?
Is that even warranted? What applications can you imagine would make so many clock calls so as to incur noticeable overhead?
Your example of logging costs might work, but I'm very skeptical that user/kernel transition costs for a clock call would drown out the costs of writing the log entry to disk.
But to answer your question directly, the ability to access the clock in a shared memory segment can itself be reified as a handle that's granted to a process. The process would then issue a map operation and provide an address at which to map the clock (or you could just always map it at the same address too if that's preferable for some reason).
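To make that concrete, here's a purely hypothetical sketch (none of these names are Zircon's actual API): a process only sees the kernel-updated clock page after mapping it through a handle it was explicitly granted.

    #include <cstdint>

    struct handle_t { uint32_t value; };           // opaque capability, assumed

    // Assumed syscall, named for illustration only: maps the read-only,
    // kernel-updated clock page and returns a pointer to the tick counter.
    extern "C" int clock_map(handle_t clock_handle, const volatile uint64_t** out_ticks);

    uint64_t read_time(handle_t clock_handle) {
        static const volatile uint64_t* ticks = nullptr;
        if (!ticks && clock_map(clock_handle, &ticks) != 0)
            return 0;                              // no capability, no clock
        return *ticks;                             // afterwards, just a load from shared memory
    }

A proxy process could implement the same protocol and hand out a page it updates itself, which is the virtualization pattern I mention elsewhere in the thread.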
> What applications can you imagine would make so many clock calls so as to incur noticeable overhead?
Low-latency processing or realtime-ish applications. If you have a budget of only a few milliseconds, lots of gettimeofday calls can start to add up. True, nowadays they mostly go through the vDSO so it doesn't matter as much, but I remember that before they did, we saw a decent speedup when we upgraded to a vDSO-enabled kernel version.
> But to answer your question directly, the ability to access the clock in a shared memory segment can itself be reified as a handle that's granted to a process.
rdtsc is not shared memory but more like a register read. Though it is a virtualized instruction so on proper VMs (not a container) it is possible to control what the guest sees as the value.
Indeed, but then access to a realtime clock factory should be reified as a handle. You would have to be given this handle and explicitly invoke it to install access to the clock if that's needed.
The point being, a handle should be involved at some point in order to make the access control explicit and not implicit and ambient.
I've got a customer that wants to get access to the PTP[1] registers of a NIC from user space. The customer suggests that context switches introduce too much latency and indeterminism (pre systems integration phase - lots of software from different teams ends up running on the final platform) to introduce a reference monitor (capability-based microkernel or otherwise) given known techniques on modern hardware. I'm afraid I can't share the specifics because I don't have them, but I trust the source.
I will grant that this case is an outlier, but this customer use-case is real enough to drive development dollars. It may not be the norm, but precision timing access appears to be very useful in some hard real-time contexts given current hw/sw realities.
Each log entry isn't necessarily written to disk one at a time. They will generally be buffered until enough have happened to need to flush. And they can be written compressed, and frequently have a lot of redundancy, so it can take a while before you accumulate enough data to need to flush to disk.
And besides logging, there are things like nanosleep or spinlocks which may briefly spin while querying the time before yielding.
I know of at least one CDN which was measurably impacted by the gettimeofday issue.
And yes, you could have the access to the address itself be something you are given access to via a capability/handle, but then the system call itself (which is actually a vDSO call) wouldn't have to actually take a handle as a parameter.
> And besides logging, there are things like nanosleep or spinlocks which may briefly spin while querying the time before yielding.
Another poster said something similar, but using the clock for this seems strange to me. Yielding a time slice for a certain number of ticks doesn't need access to the current time. I have less objection to an ambient yield since that's really an operation on your own schedule capability.
If you're after some kind of exponential backoff for a spinlock, that again doesn't seem to need the current time so much as a growing counter of the number of ticks to sleep.
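Something like this is what I have in mind: the backoff grows with a purely local retry counter and never reads a clock (an illustrative sketch, not from any real kernel):

    #include <atomic>
    #include <thread>

    void spin_lock(std::atomic_flag& flag) {
        unsigned backoff = 1;
        while (flag.test_and_set(std::memory_order_acquire)) {
            for (unsigned i = 0; i < backoff; ++i)
                std::this_thread::yield();         // or a CPU pause/relax instruction
            if (backoff < 1024)
                backoff *= 2;                      // exponential backoff, no gettimeofday needed
        }
    }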
> And yes, you could have the access to the address itself be something you are given access to via a capability/handle, but then the system call itself (which is actually a vDSO call) wouldn't have to actually take a handle as a parameter.
Correct, you'd use the handle to install an ambient clock in your environment. This leaves open the possibility that you can easily virtualize the clock by proxying the clock handle, ie. instead of the kernel updating your shared memory segment, it's another process.
The point being that reifying everything as a handle makes arbitrary virtualization patterns possible, but having ambient authorities all the way down to root makes it much more difficult.
>Another poster said something similar, but using the clock for this seems strange to me.
Does it matter if it seems strange to you? The GP gave real world examples of where this is an impact. I'm not sure why you are arguing against the design.
clock_gettime is extremely heavily used by nearly everything. Any form of work queue that supports delayed work, for example, is sitting on clock_gettime. Any form of media uses it heavily, as does self-monitoring to look for performance regressions in the wild.
You might be shielded from this depending on what level you're working at, but there's a reason that vDSO exists basically solely for clock_gettime, too.
Yes it allows for timing attacks, but you can't really avoid that, either, not without utterly crippling your platform.
Right, but your original statement was about clock_gettime, hence my confusion.
Something like sleep() is an operation on your own schedule, not an operation on a global system clock. That's not nearly as problematic.
Consider if you wanted to virtualize a process, say to deterministically replay it to trigger a fault or something. sleep(100 ticks) doesn't require additional kernel support for virtualization, but an ambient clock requires a lot of extra kernel support.
If the clock were only accessible via a handle, then you could proxy invocations on the handle without any extra support in the kernel. See my other comment for more details: https://news.ycombinator.com/item?id=16817462
You could use sleep, sure, but then you require a separate thread for every delayed message, which is a whole different ball of not-fun.
Otherwise, what actually happens when you do a postDelayed(func, delay) is that it is immediately translated into a postAt(func, clock_gettime() + delay). Then, as messages are consumed, the work queue consistently knows how long it should wait to wake back up, no matter how many delayed tasks are in flight.
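A rough sketch of that translation with a single consumer (postDelayed/postAt are just the pattern, not any particular framework's API; the consumer is simplified and ignores tasks posted with an earlier deadline while it sleeps):

    #include <chrono>
    #include <condition_variable>
    #include <functional>
    #include <mutex>
    #include <queue>
    #include <vector>

    using Clock = std::chrono::steady_clock;

    struct Task {
        Clock::time_point when;
        std::function<void()> fn;
        bool operator>(const Task& o) const { return when > o.when; }
    };

    class DelayedQueue {
    public:
        void postDelayed(std::function<void()> fn, std::chrono::milliseconds delay) {
            postAt(std::move(fn), Clock::now() + delay);   // the clock read happens here
        }
        void postAt(std::function<void()> fn, Clock::time_point when) {
            std::lock_guard<std::mutex> g(m_);
            q_.push(Task{when, std::move(fn)});
            cv_.notify_one();
        }
        // Simplified consumer step: wait until the earliest deadline, then run it.
        void runOne() {
            std::unique_lock<std::mutex> g(m_);
            cv_.wait(g, [this] { return !q_.empty(); });
            cv_.wait_until(g, q_.top().when);
            Task t = q_.top();
            q_.pop();
            g.unlock();
            t.fn();
        }
    private:
        std::mutex m_;
        std::condition_variable cv_;
        std::priority_queue<Task, std::vector<Task>, std::greater<Task>> q_;
    };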
I feel like I'm missing some context. I thought you were talking about work stealing in threaded systems, in which case sleeping seems reasonable.
Are you instead talking about some kind of event loop? If so, then clock_gettime seems like it's merely convenient, not essential. You could just as easily keep a ticket incremented on successful operations or successive loops (or some other metric), and exponential backoff is a wait operation on a ticket number.
Unless you're suggesting the delay must be based on real time for some reason?
You're right that it's not essential to have, but if you can re-implement it then you've also just re-introduced the timing attack that the removal of clock_gettime was trying to prevent.
The trivial-ness with which a clock suitable for timing attacks is able to be created is literally why SharedArrayBuffer was panic-removed from all major browsers. Because that's all you need, shared memory & a worker. Congrats, you have a high-precision timer with zero OS support.
So as soon as you allow any form of threading or shared memory to occur you've given apps a high precision timer, so you might as well just give them an actual timer, too.
> You're right that it's not essential to have, but if you can re-implement it then you've also just re-introduced the timing attack that the removal of clock_gettime was trying to prevent.
I'm not convinced. The clock is global, shared among all processes, the ticket I suggested is local to a process, and so can't be used to signal between processes. Unless I'm again missing some context for what you mean.
You might want to go lookup the meltdown/spectre SharedArrayBuffer proof of concept attack. A timing attack just needs local timing to work, it doesn't care whatsoever about any time in any other process. That's just not how it works. All it needs is a stopwatch of any kind no matter how local.
> You might want to go lookup the meltdown/spectre SharedArrayBuffer proof of concept attack.
That still requires a piece of shared state, like I said. That vulnerability results from running untrusted code in the same process as trusted code. The whole point of a process is to establish a protection boundary around potentially unsafe code, hence why I keep mentioning processes, and this is the whole point of microkernels and IPC. Within a process, all bets are off.
My earlier point was that a shared clock between processes amplifies this problem so that timing attacks cross even the process protection boundary. So when you started talking about job scheduling, I assumed the following:
1. we're in a microkernel context, where we partition trusted and untrusted code using processes.
2. the job scheduling system you mentioned either
a) is a process running its own code that it trusts and so in-process timing attacks don't matter, but timing attacks with another process might matter and so you don't want to grant a clock capability if it isn't needed, or
b) is a process scheduling system ala cron, where the job scheduler is trusted but the jobs being run are untrusted, and so they run in separate processes.
In case (a), the ticket system seems sufficient if you don't want/need to grant access to the clock consistent with least privilege, and for (b) the job scheduler may or may not have the clock installed; it doesn't really matter, since job scheduling happens via IPC so there's no shared read, i.e. delay(100 ticks, self) sends the relative delay which the cron-like scheduler adds to its own clock.
Hopefully that clarifies my context, and you can then describe what assumptions your scenario violates.
> My earlier point was that a shared clock between processes amplifies this problem so that timing attacks cross even the process protection boundary. So when you started talking about job scheduling, I assumed the following
It doesn't, though. Your point fundamentally misunderstands the nature of a timing attack for exfiltration. It doesn't cross process boundaries. It doesn't use any process-crossing state of any kind for timing. It simply times how long it takes to access cache lines, which is perfectly local and isolated. It does not involve time correlation or association across processes of any kind. It doesn't care about real time at all. All it needs is a stopwatch that can measure the difference between 0.5-2ns and 80-120ns.
The meltdown attack via SharedArrayBuffer did not use any untrusted code in the same process. It read kernel memory at will using 2 threads and an atomic int.
There are a few more related articles on his blog if you search, and I've read of it being implicated elsewhere. That's not to say the software calling it that much was doing the right thing, just that it had an impact.
NUMA systems might take 1 microsecond just to retrieve time internally from a shared hardware resource. The operation is not concurrent, so there's a chance for it to block for much longer.
This can happen because RDTSC is not synchronized between physical CPU sockets (NUMA regions).
> You shouldn't need dispatching or template metaprogramming in a microkernel, as code reuse is minimal since all primitives are supposed to be orthogonal to each other.
Orthogonality doesn’t obviate the need for metaprogramming though. For example, there are lots of places you might need a linked list structure, where the nodes store different types. You might use templates over intrusive structures to gain some type safety without duplicating code for each contained type.
That doesn't really seem super relevant, given that the parent comment seems to have been using template metaprogramming in the same way -- to just mean any template code.
> For example, there are lots of places you might need a linked list structure, where the nodes store different types.
Sure, but linked lists have solutions already using simple macros [1]. There aren't many sophisticated kernel data structures, mainly tables for indexing and linked lists, all of which have simple expressions as macros. Templates perhaps make this a little easier, but are overkill.
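For reference, the macro version looks roughly like this (modeled loosely on BSD's <sys/queue.h>; the names here are abbreviated for illustration):

    #define SLIST_ENTRY(type)       struct { struct type *sle_next; }
    #define SLIST_HEAD(name, type)  struct name { struct type *slh_first; }
    #define SLIST_INIT(head)        ((head)->slh_first = 0)
    #define SLIST_INSERT_HEAD(head, elm, field)             \
        do {                                                \
            (elm)->field.sle_next = (head)->slh_first;      \
            (head)->slh_first = (elm);                      \
        } while (0)

    /* Each element embeds its own link field, so the same macros
       serve any element type without allocation: */
    struct thread {
        int id;
        SLIST_ENTRY(thread) link;
    };
    SLIST_HEAD(thread_list, thread);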
I'm not sure why that's problematic. It's a simple 1-liner that can be audited manually and reused arbitrarily. If you add it up, the TCB is actually less than relying on the complex template elaborator.
I'm a big static typing fan. That's not particularly relevant to this security consideration, though; the TCB is the biggest factor. The C++ compiler and toolchain are a far larger TCB than the C toolchain plus this one-liner. As long as this reuse problem isn't a persistent pattern in a kernel, the unsafety is less of a problem than introducing the larger TCB.
People make mistakes, even with simple one liners. Type checkers don't. Running the compiler is also a lot cheaper than having someone inspect every line of code carefully.
Sure they do, unless your type checker is formally verified. Are you suggesting there exists a formally verified C++ compiler?
> Running the compiler is also a lot cheaper than having someone inspect every line of code carefully.
Except that's not what I suggested. You only need to audit lines that perform potentially undefined behaviour. The C type checker ensures the rest are fine. I think it's clear this problem is worse in C++ given all of the additional abstractions that interact in surprising ways.
Furthermore, C++ has no formal verification tools, so if you wanted real guarantees, you'd use something like Frama-C or a theorem prover like they did with seL4. Microkernels in C++ have been tried, and in every case they switched back to C for very good reasons.
Are you suggesting that the error rate of type checkers, verified or not, is even in the same ballpark as the error rate of human reviewers?
If I care so much about system correctness that I do the insane amount of work that is formal verification (just look at how many man-years seL4 consumed!), I don't write C or C++ at all; I use SPARK. There is a reason why you can count the number of formally verified kernels on the fingers of one hand, and it's not because everybody tries to use C++ instead of C.
> Are you suggesting that the error rate of type checkers, verified or not, is even in the same ballpark as the error rate of human reviewers?
Nope, but your unqualified statement was simply false, and in any case, your point isn't really relevant. This single line of code we're discussing is easily verifiable by manual inspection and automated testing. We're not talking about huge swaths of code here, we're talking about a few unsafe primitives, because this kind of reuse isn't typical of a microkernel.
The type safety of a single operation expressed as a template is relatively insignificant compared to the added complexity of C++, particularly considering the "reuse" benefits are non-existent for microkernels, and if all you're left with is some elusive "type safety" for a linked list, that's simply not enough.
This is obvious when you know the history of multiple C++ kernel verification efforts, all of which failed, no matter how simple a C++ subset was chosen. Many C verification efforts have succeeded, and tools for lightweight formal methods, like Frama-C, make the transition to verification possible with C. C++ is a dead end for this purpose, and the various L4 groups that created L4 kernels in C++ and then rewrote them in C all agree.
More evidence IMO that when people think they need template metaprogramming, what they probably really need is basic collections primitives. Go is a good example of this. (Strings, maps, lists, slices are just done once by the language, you don't get to write your own implementation because there's no generics, but in practice nobody cares.)
“lol no generics” is the meme. But the lack of generics is a real problem that bothers a lot of people (and several have already developed their own solutions to work around it).
People actually do care a lot about the fact that you can't have type-safe and thread-safe code or abstractions without tons of synchronization code repeated ad nauseam throughout the code.
Multiply that by 10 if you have several junior team members who have been told, and have read, that concurrency and parallelism are easy in Go.
Sorting you can do -- the function takes indices. And no, I've never wanted to call .map() -- I love writing for loops over and over, it feels really productive.
> Having the clock be an ambient authority leaves the system open to easy timing attacks via implicit covert channels.
"leaves the system open" are strong words. If you allow only threads you have got high precision clocks as a bonus in practice. Still it could be useful to have environments so restricted that even restricting clock services could be useful, but I would certainly not call the lack of capability there a very big issue. Plus designing those would be hard anyway: does not having this capa would mandatorily prevent from even creating threads (or pretty much all kind of objects which can be retargeted to infer time, which means tons of them, and maybe it being the case or not will depend on the implementation)?
Maybe it is better to not pretend that a capability exists for getting the time if you will probably be able to get it without said capability?
Clock access via a handle has other uses beyond security. For instance, you could use it for deterministic replay of a process.
The point being, any abilities that aren't reified as explicit handles are ambient in the environment, and so make isolation and virtualization harder. Consider how you would virtualize the clock now that it's part of the process's implicit environment instead of an invocation made via a handle.
This is a good point. I've seen in practice systems where in hindsight it would have been nice if we had used some virtual clock rather than the machine clock.
An OS could help by making the virtualisable clock the default API. Still there will be applications that want a high-precision or low overhead clock. But those applications should buy into the trade-off explicitly.
Agreed! For instance, a realtime clock like you mention could be reified as a clock factory handle, which you must be explicitly given and must invoke to install the clock at a fixed or configurable address (or something along those lines). This is opt-in, efficient and virtualizable.
Thanks. None of the creator functions take the handles needed, which is contrary to capability security. This leaves the system vulnerable to DoS attacks at the very least.
Creation of those objects also implicitly checks against the current thread's access token and the current process's job (if present). So I think it has the same concerns that you outlined in an earlier comment about Fuchsia's creator functions.
There are also other issues I just remembered. Handles in a capability OS have no access control applied to them, ie. if you hold a handle, then you have permission to invoke operations on that handle. Attenuating authority involves deriving a new handle from an existing one with reduced rights, then passing that around.
I believe the NT kernel still checks permissions against ACLs for handles. This violates capability security, but you can build a capability OS on top of ACLs: http://www.webstart.com/jed/papers/Managing-Domains/
> Having the clock be an ambient authority leaves the system open to easy timing attacks via implicit covert channels. I'm glad these kinds of timing attacks have gotten more attention with Spectre and Meltdown. Capability security folks have been pointing these out for decades.
And the Spectre and Meltdown vulnerabilities have shown that it doesn't matter. Taking access to a clock or sleep mechanism away doesn't stop the bad guys but makes common programmers' lives much more difficult.
> Taking access to a clock or sleep mechanism away doesn't stop the bad guys but makes common programmers' lives much more difficult.
You don't take it away, you reify the clock as a capability/handle. This has two benefits:
1. Those vulnerabilities have shown that you can mount timing attacks even without a clock, but a clock amplifies the timing attacks you can mount. For instance, there remain attacks with a clock even if Spectre and Meltdown and related vulnerabilities are fixed.
2. There are software engineering benefits. For instance, in principle you can replace the clock handle with a proxy that you can use to provide your own times. This helps considerably with testing, deterministic replay, etc.
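A minimal sketch of what the handle approach looks like, in plain C (the interface and names here are made up for illustration, not Fuchsia's actual API):

    #include <stdint.h>
    #include <stdio.h>

    /* A "clock handle": whoever launches the process decides which
     * implementation it gets; there is no ambient clock to reach for. */
    typedef struct clock_handle {
        uint64_t (*now_ns)(struct clock_handle *self);
        void *state;
    } clock_handle;

    /* Real clock (stand-in constant here; in practice a syscall or vDSO read). */
    static uint64_t real_now(clock_handle *self) { (void)self; return 123456789u; }

    /* Deterministic proxy for tests/replay: returns a scripted sequence. */
    static uint64_t fake_now(clock_handle *self) {
        uint64_t *t = self->state;
        return *t += 1000;                /* advances 1us per query, reproducible */
    }

    /* Application code only ever sees the handle. */
    static void do_work(clock_handle *clk) {
        printf("t = %llu\n", (unsigned long long)clk->now_ns(clk));
    }

    int main(void) {
        clock_handle real = { real_now, NULL };
        uint64_t scripted = 0;
        clock_handle fake = { fake_now, &scripted };

        do_work(&real);   /* production */
        do_work(&fake);   /* test / deterministic replay */
        return 0;
    }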
For people who don't understand the overall decision to create another system, I'll point out at least one benefit of creating a system that is not Linux: making software simpler and more efficient. Do you really think that Linux is so great? Linux is a bloated system [1], and POSIX is not so great either (have you really read the WHOLE POSIX spec?).
Standards are important, and compatibility is sometimes (SOMETIMES) important. But not all the stuff defined in POSIX is important. POSIX sucks sometimes [2]; only GNU can be worse when it comes to bloat [3].
Only users who don't touch the code can think that Linux, POSIX and GNU are entities following principles based on simplicity. Linux following Unix guidelines? That can only be a joke of Linus's.
Creating custom software, maintaining it, and doing other work on things THAT YOU DON'T UNDERSTAND has a massive cost. And the cost of understanding complex things is even worse.
Sometimes it's simpler to re-invent the wheel than to understand why a wheel was built with a fractal design [4].
A huge portion of Linux is drivers and support for different processor architectures. Yes, development was chaotic in the nineties and the code showed. But a lot of engineering effort went into making the core really nice.
With regards to POSIX, it is amazing how well this API is holding up. There are quite a few implementations from GNU, the BSDs, Microsoft (at least partial support in MSVC) and a few others (e.g. musl). So POSIX support is a given on most systems. Why replace it with something that breaks existing code?
Not to say there is no bloat. But some bloat is the patina that all successful systems take on over time. Is the bloat small enough to be managed and/or contained? I say yes.
> So POSIX support is a given on most systems. Why replace it with something that breaks existing code?
You're not necessarily breaking existing code. Both macOS and Windows are built on non-POSIX primitives that have POSIX compatibility layers.
It seems that the conclusion most of industry has reached is that, whether or not POSIX is a useful API for your average piece of software, there are still better base-layer semantics to architect your kernel, IPC mechanisms, etc. in terms of than the POSIX ones. You can always support a POSIX "flavor" or "branded zone" or "compatibility subsystem" or whatever you want to call it, to run other people's code, after you've written all your code against the nicer set of primitives.
A potentially enlightening analogy: POSIX is like OpenGL. Why do people want Vulkan if OpenGL exists? Well, because Vulkan is a more flexible base layer with better semantics for high-efficiency use cases. And if you start with Vulkan, the OpenGL APIs can still be implemented (efficiently!) in terms of it; whereas if you start with an OpenGL-based graphics driver, you can't "get to" (efficient) Vulkan support from there.
All that aside, though, I would expect that the real argument is: Fuchsia is for ChromeOS. Google are happy to be the sole maintainers of ChromeOS's kernel and all of its system services, so why not rewrite them all to take advantage of better system-primitive semantics? And Google doesn't have to worry about what apps can run on a Fuchsia-based ChromeOS, because the two ways apps currently run on ChromeOS are "as web-apps in Chrome", or "as Linux ELF executables inside a Linux ABI (or now also Android ABI) sandbox." There is no "ChromeOS software" that needs to be ported to Fuchsia, other than Chrome itself, and the container daemon.
Total speculation: but I seriously doubt that Fuchsia is specifically for chromeOS. The whole point of decent, efficient, simple, non-bug-prone APIs is that you probably want to implement pretty much everything on it. Simplicity and low-overhead allow for generality and flexibility.
If all you wanted to do was support ChromeOS - well, typically you can add hacks even to a messy codebase to support specific use cases. And there are a bunch of Linux and *BSD distros that demonstrate that you can adapt such a system to even very small devices; small enough that there's not much niche left below. Moore's Law/Dennard scaling may be comatose on the high end; but lots of long-tail stuff is generations behind, which implies that even really low-power IoT stuff that Linux is currently ill-suited for will likely be able to run Linux without too many tradeoffs. I mean, the original Raspberry Pi was a 65nm chip @ 700MHz - that's clearly overkill; and even if chip development never has a breakthrough again, there's clearly a lot of room for those kinds of devices to catch up, and a lot of "spare silicon" even in really tiny stuff once you get to small process nodes.
But "being able to run linux" doesn't mean it'll be ideal or easy. And efficiency may not be the only issue; security; cost; reliable low latency... there are a whole bunch of things where improvements may be possible.
I'm guessing Fuchsia is going to be worse than Linux for ChromeOS - in the sense that if ChromeOS really were what Google wants it for, they could have gotten better results with Linux than they'll be able to get with Fuchsia in the next few years, and at a fraction of the cost. Linux just isn't that bad; and a whole new OS, including all the interop and user-space and re-education pain, is a huge price to pay. But the thing is: if they take that route they may end up with a well-tuned Linux, but that's it.
So my bet is that you'd only ever invest in something like Fuchsia if you're in it for the long run. They're not doing this "for" ChromeOS, even if that may be the first high-profile usage. They're doing this to enable future savings and quality increases for use cases they probably don't even know they have yet. In essence: it's a gamble that might pay off in the long run, with some applicability in the medium term - but the medium term alone just doesn't warrant the investment (and risk).
I guess I left a bit too much implicit about my prediction on what Google's going to do: I have a strong suspicion that Google sees the Linux/POSIX basis of Android as an albatross around its neck. And ChromeOS—with its near-perfect app isolation from the underlying OS—seems to be a way of getting free of that.
ChromeOS has already gained the ability to run containerized Android apps; and is expecting to begin allowing developers to publish such containerized Android apps to the Chrome Web Store as ChromeOS apps. This means that Android apps will continue to run on ChromeOS, without depending on any of the architectural details of ChromeOS. Android-apps-on-Android prevent Android from getting away from legacy decisions (like being Linux-based); Android-apps-on-ChromeOS have no such effect.
I suspect that in the near term, you'll see Google introducing a Chrome Web Store for Android, allowing these containerized, CWS-packaged Android apps to be run on Android itself; and then, soon after that, deprecating the Play Store altogether in favor of the Chrome Web Store. At that point, all Android apps will actually "be" ChromeOS apps. Just, ones that contain Android object files.
At that point, Google can take a Fuchsia-based ChromeOS and put it on the more powerful mobile devices as "the new Android", where the Android apps will run through Linux ABI translation. But in this new Android (i.e. rebranded ChromeOS), you'll now also have the rest of the Chrome Web Store of apps available.
Google will, along with the "new Android", introduce a new "Android Native SDK" that uses the semantics of Fuchsia. Google will also build a Fuchsia ABI layer for Linux—to serve as a simulator for development, yes, but more importantly to allow people to install these new Fuchsia-SDK-based apps to run on their older Android devices. They'll run... if slowly.
Then, Google will wait a phone generation or two. Let the old Android devices rot away. Let people get mad as the apps written for the new SDK make their phones seem slow.
And then, after people are fed up, they'll just deprecate the old Android ABI on the Chrome Web Store, and require that all new (native) apps published to the CWS have to use the Fuchsia-based SDK.
And, two years after that, it'll begin to make sense again to run "the new Android" on low-end mobile devices, since now all the native apps in the CWS will be optimized for Fuchsia, which will—presumably—have better performance than native Android apps had on Android.
From a branding perspective, that would be terrible. They've already invested a bunch in Google Play brand that isn't Android Apps (Play Music, Play Books, etc).
Seems more likely they'll allow HTML apps into the Play Store, eventually getting rid of the Web Store entirely. They've already done the WebAPK stuff to glue HTML apps into Android.
If, as I suspect, they'd be willing to rename ChromeOS to be "just what Android is now" (like how Mac OS9 was succeeded by NeXTStep branded as Mac OSX), then I don't see why they wouldn't also be willing to rebrand the Chrome Web Store as "what the Google Play Store is now." Of course, they'd keep the music, books, etc.; those are just associated by name, not by backend or by team.
But they wouldn't keep the current content of the Play (Software) Store. The fact that every Android store—even including Google's own—is a festering pit of malware and phishing attempts is a sore spot for Google. And, given their "automated analysis first; hiring human analysts never (or only when legally mandated)" service scaling philosophy, they can't exactly fix it with manual curation. But they would dearly love to fix it.
Resetting the Android software catalogue entirely, with a new generation of "apps" consisting of only web-apps and much-more-heavily-containerized native apps (that can no longer do nearly the number of things to the OS that old native apps can do!) allows Google to move toward a more iOS-App-Store-like level of "preventing users from hurting themselves" without much effort on their part, and without the backlash they'd receive if they did so as an end unto itself. (Contrast: the backlash when Microsoft tried that in Windows 8 with an app store containing only Metro apps.)
I expect that the user experience would be that, on Fuchsia-based devices, you'd have to either click into a "More..." link in the CWS-branded-as-Play-Store, or even turn on some setting, to get access to the "legacy" Play Store, once they deprecate it. It'd still be there—goodness knows people would still need certain abandonware things from it, and be mad if it was just gone entirely; and it'd always need to stick around to serve the devices stuck on "old Android"—but it'd be rather out-of-the-way, with the New apps (of which old Chrome Apps from the CWS would likely be considered just as "new" as newly-published Fuchsia apps upon the store's launch) made front and centre.
> Seems more likely they'll allow HTML apps into the Play Store, eventually getting rid of the Web Store entirely.
I would agree if this was Apple we were talking about (who is of a "native apps uber alles" bent) but this is Google. Google want everyone to be making web-apps rather than native apps, because Google can (with enough cleverness repurposed from Chrome's renderer) spider and analyze web-apps, in a way it can't spider and analyze native apps. Android native apps are to Google as those "home-screen HTML5 bookmark apps" are to Apple: something they wish they could take back, because it really doesn't fit their modern business model.
XNU does embed BSD (and so POSIX) semantics into the kernel—some of which are their own efficient primitives, since there's no way to efficiently implement them in terms of Mach. But whatever BSD syscalls can be implemented kernel-side in terms of Mach primitives, are.
It has both. Mach system calls are negative, BSD system calls are positive. The BSD side has system calls for stuff like fork() that would otherwise be pretty clearly in Mach's domain.
Hard to imagine it being used for ChromeOS before Android. Android runs natively on Chromebooks because ChromeOS shares a common kernel with Android, so it can run in a container.
That would be lost, which is a huge deal. The new GNU/Linux on ChromeOS would be fine, as it runs in a VM and would still work.
Now they could move Android to using a VM, but that is less efficient and, most importantly, takes more RAM, and Chromebooks do not normally have a ton of RAM.
> So POSIX support is a given on most systems. Why replace it with something that breaks existing code?
Because POSIX has horrible security properties, and does not provide enough guarantees to create truly robust software. See, for instance, the recent article on how you simply cannot implement atomic file operations in POSIX -- SQLite has to jump through 1,000 hoops to get something pretty robust, but it shouldn't be this way.
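For reference, the "hoops" are roughly the write-temp/fsync/rename/fsync-the-directory dance, and even that only covers the common case on well-behaved filesystems. A sketch (error handling and edge cases trimmed):

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* Replace `path` with `data` as atomically as POSIX lets you. */
    int replace_file(const char *dir, const char *path, const char *tmp,
                     const char *data) {
        int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) return -1;
        if (write(fd, data, strlen(data)) < 0) { close(fd); return -1; }
        if (fsync(fd) < 0) { close(fd); return -1; }   /* flush the file contents */
        close(fd);

        if (rename(tmp, path) < 0) return -1;          /* the "atomic" step */

        int dfd = open(dir, O_RDONLY | O_DIRECTORY);   /* fsync the directory so  */
        if (dfd < 0) return -1;                        /* the rename is durable   */
        int rc = fsync(dfd);
        close(dfd);
        return rc;
    }

    int main(void) {
        return replace_file(".", "config.txt", "config.txt.tmp", "hello\n");
    }

And even this says nothing about what state you're left in if fsync fails partway through, which is exactly the kind of thing POSIX leaves underspecified.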
You don't need async for a proper concurrent system. Systems were concurrent before async IO. The trick is that when a process does IO you yield it, put it on a wait list, and run literally anything else until the kernel receives the OK from the hardware controller and resumes the process.
Using async IO in Linux merely means your specific thread won't be suspended while waiting for the data (or it's in a reasonably close cache).
It would be quite silly if Linux waited for every IO synchronously; any single-core system would immediately grind to a halt.
Linux offers some async io features, but does not offer async throughout.
In a fully async platform you would be able to do general-purpose programming without ever needing to use multithreading.
Example of a situation you can't handle in Linux: have a program doing select(2) or equivalent on both keyboard input and network input, in a single thread. Since Linux does not support this, you are steered toward solutions that are more complicated than a pure async model would be:
* Spin constantly looking for activity. This heats up your computer and uses battery.
* Have short timeouts on epoll, and then glance for keyboard input. This leads to jerky IO.
* Have child processes block on these operations and use unix domain sockets to feed back to a multiplexor (fiddly, kernel contention).
* The child-process thing but with shmem (fiddly)
* Something equivalent to the child process thing, but with multiple threads in a single process. (fiddly)
You would think that x-windows might help out here. What if you had a socket to X, and then multiplexed on that, instead of looking for keyboard input from a terminal? This opens new issues: what if X has only written half of an event to your socket when select notifies you? Will your X library handle this without crashing?
Rurban's comment above is correct. Linux is not async throughout.
On OSs that offer kevent you can get a fair bit further, but (I believe) you still can't do file creation/deletion asynchronously.
This is broken. (Woke from a sleep, face-palmed. I have been in Windows IPC land and got my wires crossed, sorry.) In Linux you can select on both the stdin fd and network sockets in a single call. There is a way to get an fd for AIO also. AFAIK the sync-only file creation/deletion point stands.
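For the record, the single-threaded version looks like this -- a minimal sketch (socket setup omitted; sockfd is assumed to be an already-connected descriptor):

    #include <sys/select.h>
    #include <unistd.h>

    /* Wait on both stdin and a network socket in one thread, no polling loop. */
    void serve(int sockfd) {
        for (;;) {
            fd_set readable;
            FD_ZERO(&readable);
            FD_SET(STDIN_FILENO, &readable);
            FD_SET(sockfd, &readable);

            int maxfd = (sockfd > STDIN_FILENO ? sockfd : STDIN_FILENO) + 1;
            if (select(maxfd, &readable, NULL, NULL, NULL) < 0)
                break;                               /* interrupted or error */

            char buf[4096];
            if (FD_ISSET(STDIN_FILENO, &readable))
                read(STDIN_FILENO, buf, sizeof buf); /* keyboard input ready */
            if (FD_ISSET(sockfd, &readable))
                read(sockfd, buf, sizeof buf);       /* network input ready  */
        }
    }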
Linux is async throughout since it can do task switching on blocking.
You're not the only task running.
Or do you expect Linux to pause the system when you read a file until the DMA is answered?
Linux itself is fully async: anything blocking (even interrupts, to some extent) will be scheduled to resume when the reason it blocked is gone, or alternatively check back regularly to resume.
A program running on Linux can do a lot of things async, as mentioned, via POSIX AIO. It's not impossible; Go seems to do fine on that front too (goroutines are put to sleep when doing blocking syscalls unless you do RawSyscall).
The conclusion that a lack of 100% asynchronicity means it can't be properly concurrent is also wrong, as evidenced by the fact that sending something to disk doesn't halt the kernel.
There are some Linux-specific APIs that build on this, and I've got the most experience with Linux, so I was referring to that.
But Async IO is part of POSIX.
I was mostly responding to the comment which cited synchronous IO as a problem with POSIX. Linux is the most widely used, so on top of being my primary domain on the issue, it's going to ensure the best understanding.
Is there an example of an OS which doesn't have blocking IO? To me it seems that blocking IO will be needed at some level. You can of course put wrappers around it and present it as async. But in many applications you want to go as low as possible, and in my imagination blocking calls will be the lowest.
It's the other way around. The low level is all async everywhere; blocking is sort of just telling the kernel to wait for the async op to complete before resuming the process.
The goal is to avoid blocking, not to offer async also. That's a lost cause.
Fuchsia once had the goal of offering async only, but then some manager decided that he needed some blocking, and then it was gone.
Non-blocking-only is lower level than blocking. You can always wait indefinitely for that callback, but usually you have a default timeout.
L4, for example, offers that API.
> So POSIX support is a given on most systems. Why replace it with something that breaks existing code?
POSIX is a lowest-common-denominator API; it's underspecified and loosely interpreted. That follows from its initial goal, which was basically to specify the bits common to the various UNIX implementations. Those implementations obviously didn't want to change to match a rigid standard, so a lot of wiggle room exists in the standard.
The end result is that it is pretty much useless for anything beyond "hello world" kinds of applications, both in terms of portability and in terms of actual behavior (I could list a lot of cases, but let's leave that to Google, with the starting idea of looking at a couple of POSIX APIs, say close()'s errnos and the differing cases and what causes them on different OSes/filesystems). That is why there isn't a single OS out there that is _ONLY_ POSIX compliant. You need look no further than the 15-year-old https://personal.opengroup.org/~ajosey/tr28-07-2003.txt and consider that the gap has widened as more performance- or security-oriented core APIs have been introduced and Linux's POSIX layer is further refined upon them.
Plus, the core APIs in no way reflect the hard reality of modern hardware, leaving the standard even more underspecified in the case of threads and async IO, which have been poorly bolted on.
Then there are all the bits everyone ignores, like the bits about the POSIX shell (ksh) and how certain utilities behave, while completely ignoring important things like determining metadata about the hardware one is running on.
> So POSIX support is a given on most systems. Why replace it with something that breaks existing code?
POSIX says that usernames and uids can overlap, and if a program takes either, it should accept the other as well. And if something could be a username or uid, it should be presumed to be a username, and resolved to a UID.
Now assume you have a user "1000" with uid 2000, and a user "2000" with uid 1000... you can see where this goes.
And this is why the BSDs have broken POSIX compatibility and require uids to be prefixed with # when they’re used on the CLI.
No, it does not say that. It says, specifically for certain standard utility programs such as chown and newgrp, that particular program arguments should always be considered to be user names if such user names exist, even if they happen to be in the form of numbers.
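In code terms, the resolution order for those utilities looks roughly like this (a sketch, not the actual implementation of any particular chown; the helper name is made up):

    #include <pwd.h>
    #include <stdlib.h>
    #include <sys/types.h>

    /* Resolve a chown-style "owner" argument: try it as a user name first,
     * and only treat it as a numeric uid if no such user exists. */
    int resolve_owner(const char *arg, uid_t *out) {
        struct passwd *pw = getpwnam(arg);   /* name lookup wins, even for "1000" */
        if (pw != NULL) {
            *out = pw->pw_uid;
            return 0;
        }
        char *end;
        unsigned long n = strtoul(arg, &end, 10);
        if (*arg != '\0' && *end == '\0') {  /* purely numeric: treat as a uid */
            *out = (uid_t)n;
            return 0;
        }
        return -1;                           /* neither a known name nor a number */
    }

So with the users from the example upthread, passing "1000" resolves to uid 2000 -- which is exactly the surprise being described.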
Nor is what you claim about the BSDs true. Aside from the fact that using the shell's comment character would be markedly inconvenient, especially since the colon is a far better choice for such a marker (c.f. Gerrit Pape's chpst), the OpenBSD and FreeBSD implementations of chown and newgrp do not have such a marker either documented or implemented, and in fact operate as the SUS says.
User names and user IDs are, by contrast, explicitly called out by the standard as strings and integers, two quite different things which have no overlap.
> But a lot of engineering effort went into making the core really nice.
I quite like Linux but I would not say the current state of the core kernel is "really" nice. Somewhat nice, maybe, but "really" nice? You just have to look around to see examples of non-nice things (too many things in some headers, jumping between source files back and forth for no reason, excessive use of hand-coded vtables, internal "frameworks" that are not as simple as they should be, at other times a lack of abstractions and excessive access to the internals of structures, etc.). Granted, it is way less buggy than some other software, but I think I'll know when I read a really nice core, and for now I've never had the feeling that Linux has one.
I recognize that Linux and especially GNU are bloated. However, the Linux LOC-over-time chart is useless at demonstrating it. AFAIK the size increase of Linux is mainly caused by drivers. Device support is overall the most important thing an OS offers. And it's hard to be small and elegant while supporting everything under the sun, which Linux can't even really achieve. One could say something about epoll instead of kqueue, ALSA instead of OSS, or other similar cases. But the size itself doesn't say much.
I agree that there is much yet to be explored in OS design. However, it needs deep pockets to actually support a wide array of hardware. Or a long time.
I cheer for Toybox and Oil shell regarding the GNU userland.
When I see all the openat(2)-and-friends family of functions, or chmod/fchmod, chdir/fchdir, barbaz/fbarbaz, I think it all needs a bit of a cleanup. Some batching system call would be appreciated, as syscalls are more expensive after Meltdown.
I personally like and keep my fingers crossed for DragonFly BSD. They are doing things that are innovative while keeping a conservative core, for lack of better words. But at the same time DragonFly BSD has minuscule hardware support compared to Linux.
I don't think changing the usage of system calls in the kernel as a reaction to Meltdown is a good workaround to make. This could lead to CPU makers relying on the workaround rather than fixing the issue in their CPU architecture. The other side of the coin is that this increases the motivation to switch to a newer CPU, which generates more waste; and CPUs running slower need to run longer and use more energy, both of which are bad for the environment and cost money.
Don't worry, by the time it is complete and mature, it will be complex and full of quirks too.
I'm not convinced that even Google has the resources to build something like that successfully. Except perhaps if they target a very specific use case.
Sponsoring Go introduction workshops for university undergrads, complete with Google-swag prizes and actual Google employees flown in from another country.
So, quite a lot if that experience is anything to go by.
I will say that my professor Axel Schreiner at RIT offered one of the two first Golang classes at a collegiate level back in... 2009? and he reached out to Google and said, "send us some Android phones so that we can develop Go on ARM"
They obliged with a full crate of first generation Motorola phones, each one preloaded with a 30 day free Verizon plan. Every person who took that class got one, and surely all of them made it back into the school's hands at the end of the quarter.
(I'm not sure how many people actually ever compiled and executed any Go binaries on ARM that year; we all learned Go, and it was a great class! But as far as the class goes, the phones were completely unnecessary. I think they did make a more relevant class where the phones were able to be used again the year after that.)
> companies using the language with hopes of being acquired by them
This is beyond belief. What companies are using Go with the hopes of being acquired by Google? Does anyone honestly believe that Google's acquisitions teams know or care about programming languages? Any business that acquires companies on that basis is doomed to failure, as is any company that hopes to be acquired on that basis.
Go initially billed itself as a language for systems programming, but that claim was quickly retracted when it turned out that Go's creators had a different notion of systems programming than everyone else.
Not everyone. Just self-proclaimed authorities who decided "systems" means operating systems or some embedded code. Lots of companies I've worked for have the title Systems Engineer, or Systems Engineering departments, which have nothing to do with operating systems, just some internal applications.
> Just self-proclaimed authority who decided systems means operating system or some embedded code
Systems does mean that. There are 2 broad categories of software you can write. One is software that provides a service to the user directly. That is an application. The other kind is software that provides a service to applications. That's systems software. Do you think there's something wrong with this notion? It's pretty well accepted over the decades:
Well by that definition Docker, Kubernetes, etcd and so on are systems software. But people here somehow explicitly make it to mean Computer Operating Systems.
This was all over with way before the 1.0 release even happened. There's no point in arguing over something that was addressed several years ago before Go even reached stability. Plus, trying to say that Go was a "failure" because of this is absurd. It was an issue of terminology, not technology. Given Kubernetes, Docker, etc you would have to be totally delusional to claim that Go has been a failure.
But Google as a whole isn't working on this. A lot of the projects at Google require collaboration across multiple teams, but I could see Fuchsia being done by a fairly small and cohesive team.
I concur, macOS is a nightmare. I didn't think so before I had to write code for it, but I am currently writing code that runs on 5 OSs (Linux, Windows, macOS, Android and iOS) and macOS is by far the worst of all. In particular, everything related to filesystems was a nightmare, especially HFS+. Thankfully Apple is replacing it with the much more sane APFS.
I don't have much experience with them but I think VxWorks and QNX are good examples of simpler OSs. I do have some experience with Minix3 and it is certainly one. I guess the BSDs stand somewhere in the middle.
QNX always had practically oriented limitations on the general microkernel idea (e.g. QNX native IPC/RPC is always synchronous), which allowed it to be essentially the only reasonably performant true microkernel OS in the '90s. Unfortunately it seems that after QNX got repeatedly bought by various entities, the OS got various weird compatibility-with-who-knows-what hacks.
OS X has tons of weird quirks and compatibility mindfucks going back to the transition from OS 9 (all those .DS_Store and ._filename files for "resource forks").
Backwards compatibility is the #1 reason for increasing complexity. Apple is sometimes good at cutting away compatibility for the sake of cleaning up, but there are still weird issues poking up now and then.
Hell, the whole NSEverything is a compatibility thing with NeXTSTEP, which is long dead.
Apple doesn't really appear to care about backward compatibility though. They break lots of things with every release of the OS. I would give that excuse to Microsoft, but not Apple.
Disagree. Microsoft keeps making new APIs and deprecating old ones. With the Mac you've got Cocoa, which goes back to 1989 and which you can still use.
It might not be backwards compatible, but from a developer's perspective it is nice to be able to reuse old knowledge.
I think Apple has been much better than MS or Linux at continuously modernizing and upgrading what they have. On Windows and Linux things tend to become dead ends as new flashy APIs appear.
Sure, old Win32 and Motif apps might still run, but nobody really develops using these APIs anymore.
On Windows I first used Win32, then MFC, then WinForms. Then all of that got deprecated and we got WPF, Silverlight, and then I sort of lost track of what was going on. Meanwhile on Linux people used Tcl/Tk for GUIs early on. And there was Motif, wxWindows. KDE and GNOME went through several full rewrites.
If we look at Mac OS X as the modern version of NeXTSTEP, the core technology has been remarkably stable.
Sure they have broken compatibility plenty of times but the principles and API are at their core the same.
I'm referencing the abilities of F500 companies to sustain OS development. OS X has roots in Mach, and some BSD, but to call it either one is trivializing the amount of work that has gone on.
And NeXTSTEP itself is to a large extent one big ugly hack that stems from the experience of trying to build Unix on top of Mach. In fact it is not a microkernel, but a monolithic kernel running as one big Mach task, thus simply replacing the user/kernel split with a task/privileged-task split (which to a large extent is also true of OS X).
Correct. The Mach research project at Carnegie Mellon aimed to build a replacement kernel for BSD that supported distributed and parallel computing.
NeXT's VP of Software Engineering, Avie Tevanian, was one of the Mach project leads. Richard Rashid, who led the Mach project, ended up running Microsoft Research's worldwide operations.
Their work on a virtual memory subsystem got rolled back into BSD.
It did not. Mach is a traditional microkernel which provides an IPC mechanism, process isolation, and (somewhat controversially) memory-mapping primitives, and not much else.
In the late '80s/early '90s there were various projects that attempted to build Unix on top of that as a true microkernel architecture, with separate servers for each system service. The performance of such designs was horrible, and two things resulted from that era that are still somewhat relevant: running the whole BSD/SysV kernel as a Mach task, which today means OS X and Tru64 (at the time both systems had the same origin, as both are implementations of OSF Unix), and just ignoring the problem, which is the approach taken by GNU/Hurd.
The Fuchsia part of Android is Linux? Anyway, Google has a lot of people working for it. I don't think anyone would have expected NT to come from Microsoft at the time that it did.
You posted a video wherein the tester simply sequentially opens and closes a series of apps on a Samsung and Apple device seeing which will run through the sequence faster...and the Samsung was a bit slower.
In theory it makes me wonder if iphone's storage is slightly faster than the latest galaxy or if the process by which one loads an iphone app is slightly faster/more efficient than the one by which an android app is loaded or if the tiny selection of apps the reviewer picked are just better optimized for iphone. Nobody smoked anyone and nothing of note was learned by anyone. So much so that I wonder why you bothered to watch said link or paste it here.
I said half an OS because it's built on technologies like Linux and Java, not because it's half-assed, even though it is.
>You posted a video wherein the tester simply sequentially opens and closes a series of apps on a Samsung and Apple device seeing which will run through the sequence faster...and the Samsung was a bit slower.
Certain apps were slower to load on the Samsung device. Additionally, the Samsung device encoded the 4K video significantly faster and took round 1 by 14 seconds and round 2 by 16 seconds.
>In theory it makes me wonder if iphone's storage is slightly faster than the latest galaxy or if the process by which one loads an iphone app is slightly faster/more efficient than the one by which an android app is loaded or if the tiny selection of apps the reviewer picked are just better optimized for iphone. Nobody smoked anyone and nothing of note was learned by anyone. So much so that I wonder why you bothered to watch said link or paste it here.
The iPhone X has faster storage and a significantly faster SoC. A 30 second win by the Samsung phone is what I could call getting smoked.
>I said half an OS because it's built on technologies like Linux and Java, not because it's half-assed, even though it is.
And iOS was a descendant of MacOS, which itself is a descendant of NeXTSTEP. It also uses a language 11 years older than Java. So it sounds like iOS also meets your criteria for being "half-assed".
> Sometimes it's simpler to re-invent the wheel than to understand why a wheel was built with a fractal design
When I saw your comment I immediately was reminded of Joel on Software: "The single worst strategic mistake that any software company can make is to rewrite the code from scratch."[1]
Tips on surviving a rewrite in a mid-large sized company.
1) Get yourself placed on the "Legacy team" that is supposed to "minimally support the application while we transition to the new system".
2) Do whatever the hell you want with the code base (i.e., refactor as much as you want) because nobody cares about the boring legacy system.
3) Respond directly to user/customer needs without having to worry about what upper management wants (because they are distracted by the beautiful green-field rewrite).
4) Retain your job (and probably get a promotion) when they cancel the rewrite.
Alternatively, tips for not surviving a rewrite in a mid-large sized company.
1) Get stuck on the "Legacy team", as expensive contractors are called in to produce the rewritten version from scratch in an unrealistically small fraction of the time the original took.
2) Be told you can only fix bugs (with more layers of short term hacks), not waste time with refactors that have "no business value" for a code base that will be scrapped "soon".
3) Don't add any new features, to prevent the legacy system becoming a perpetually moving target that would delay the beautiful green-field rewrite even longer.
4) Hate your job, and watch your coworkers leave in disgust, further starving the team of resources, until you end up quitting too.
My last two IT jobs ever were both for businesses which cancelled huge rewrites after being sold to another company. Someone up top pitched the businesses as white elephants ripe for massive cost savings half-way during the rewrite.
The programmers with the political skills to get put in the "Rewrite team" will have had their jobs in the "Legacy team" protected in case of project cancellation. Or they will knife you in the back to get their old jobs back -- they will know about the cancellation and have plenty of time to maneuver before you know what's going on.
I think that actually speaks to Joel's point. None of those were rewrites.
Firefox started as a fresh UI reskin for the old browser interface, and indeed continued as "mostly a skin" for years (incidentally, so did Chrome).
(You can still get the non-Firefox skin, incidentally; it's "Mozilla SeaMonkey".)
Then the Rust rewrite. Rust was a hobby project, and they rewrote layout engine interface, and nothing more than that. Then CSS.
Now it's an HTML parser, a layout engine, a parallel CSS engine, and a GPU webpage renderer (still "missing"/non-Rust are 2D compositing and JavaScript). Each of those components replaced another in a working version, and there were at least beta versions of Firefox with all the parts.
Potential is worthless. We have Rust, we don't have what might have been produced had Rust not happened, and as far as I know, no one is working on that hypothetical product.
It isn't, but in twenty years getting paid to write software I have far more regrets where I rewrote and shouldn't have, than where I should have rewritten and didn't.
If you're Google and you have people with these abilities kicking about, it's probably not a crazy investment to see what happens. We've got an HN story elsewhere in the list on post-quantum key agreement experiments in Chrome; again, there's a fair chance this ends up going nowhere, but if I were Google I'd throw a few resources at this just in case.
But on the whole I expect Fuchsia to quietly get deprecated while Linux lives on, even if there's lots to like about Fuchsia.
or run the entire business (when it works) without which the whole company would grind to a halt.
A rewrite made no sense to me, since I'd end up maintaining version A alongside version B, with B constantly lagging A, unless I severely restricted the scope of B, in which case it'd be an incomplete (though better-written, more maintainable) A.
Instead I took the isolate (not always easy), shim, rewrite, replace, remove-shim approach.
It does feel a bit like spinning plates blindfolded sometimes, in the sense that I always expect to hear a crash.
So far I've replaced the auth system, the reports generation system, refactored a chunk of the database, implemented an audit system, changed the language version, brought in proper dependency management, replaced a good chunk of the front end (jquery soup to Vue/Typescript), rewritten the software that controls two production units and implemented an API for that software so that it isn't calling directly into the production database.. and done it without any unplanned down time (though I'm still not sure how - mostly through extensive testing and spending a lot of time on planning each stage).
It's slower because I have to balance new features against refactor time, but I have management buy-in and have kept it, mostly through being extremely clear about what I'm working on and what the benefits are, and I have built up some nice momentum in terms of deploying new stuff that fixes problems for users.
The really funny part is that even though I'm refactoring ~40% of the time, I'm deploying new features faster than the previous dev who wrote the mess...because I spent the time fixing the foundations in the places I knew I'd need for new features going forward.
In my experience the second time leads to architecture astronautics... third time is when you get it right.
Although in the OS space one might argue that the second generation of time-sharing OSes (TOPS-10, ITS, MCP...) got more things right than are right in Unix and such.
For an OS it means that you should pick one abstraction for process state and one abstraction for IO, and in fact you can have the same abstraction for both. In this view Plan 9 makes sense, while modern Unix with files, sockets, AIO, signals, various pthread and IPC primitives, and so on does not (not to mention the fact that on every practical POSIX implementation various such mechanisms are emulated in terms of other synchronisation mechanisms).
(The below is my ignorant understanding of drivers in Fuchsia and Linux.)
It's not just the kernel design, it's the driver design as well. Fuchsia has drivers as ELF binaries that hook into the Device Manager process [0]. I believe they are developing a standard API for how drivers interact with Fuchsia.
Linux doesn't really have this today, which means that a driver must be in the kernel tree to keep up with the kernel's interface. And with ARM lacking something like a BIOS for phones, each chip that Qualcomm et al. make requires a full BSP to get it working. And from what I understand, Linus and others don't want these (relatively) short-lived processors to be checked into mainline Linux (plus you have the lag time of that code becoming available). Android's Project Treble [1] aims to address this somewhat by creating a stable API for hardware makers to use to talk with Linux, but it is not an ideal solution.
There is the device tree [0] to resolve the problem with chip support, but I guess it would make "universal" drivers more complex and harder to maintain if all hardware features are to be supported (in an effective way). Seems like the Fuchsia model might handle this better.
> Linus and others don't want these (relatively) short-lived processors to be checked into mainline Linux
I don't think that's true. e.g. 4.17 is accepting code to run Samsung Galaxy S3 (released nearly 5 years ago). It is left to volunteers since most vendors dump a source release to comply with the GPL but themselves don't upstream it to Linus.
Because it can take 5 years, in which time the product is obsolete.
But, the core idea behind SBSA and SBBR is to form a common platform to which the vendors conform their machines so they don't have to keep up-streaming special little drivers for their special little SOCs. Only time will tell if its a success, but a large part of the community has effectively declared that they aren't really going to conform to the ideals of a standard platform. So, the ball keeps rolling and new DT properties keep getting added, and new pieces of core platform IP keep showing up. At this point arm64, despite just being a few years old already looks as crufty as much older architectures with regard to GIC versions, dozens of firmware interfaces, etc due to the lack of a coherent platform/firmware isolation strategy.
Ugh, tell me about it. Just dealing with a relatively simple situation where you've got signals coming in while reading a pipe is a hassle to get completely correct.
I am so there for an operating system API that is relatively simple and sane. Where writing correct programs is, if not the easy path, at least not the hard and obscure path.
I'd be down for an Erlang-like OS ABI (or, to put that another way, a Windows-GUI OS ABI, but for even non-GUI processes): just message-passing IPC all the way down. OS signals, disk IO, network packets, [capabilities on] allocated memory, [ACL'ed] file handles, etc: all just (possibly zero-copy) messages sitting in your process's inbox.
Of course, it's pretty hard to deal with a setup like that from plain C + libc + libpthread code, so OSes shy away from it. But if your OS also has OS-global sandboxed task-thread pools (like macOS's libdispatch)—and people are willing to use languages slightly higher-level than C where tasks built on those thread-pools are exposed as primitives—then it's not out of the question to write rather low-level code (i.e. code without any intermediating virtual-machine abstraction) that interacts with such a system.
The QNX IPC mechanism is essentially a cross-process function call, i.e. the message sender is always blocked while waiting for the message reply, and the server process cannot meaningfully combine waiting for a message with waiting for some other event.
Edit: in essence QNX messages work like syscalls, with the difference that there are multiple "systems" that accept "syscalls". For what it's worth, Solaris has a quite similar IPC mechanism that is mostly unused.
I don't think that's what the parent was complaining about; it was more about handling signals correctly. It's incredibly easy to write buggy signal handlers on a Linux system. The "self-pipe" trick [1], and later signalfd(2), have made signal handling much easier, but a lot of programs still do it the old way.
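For anyone who hasn't seen it, the self-pipe trick is roughly: the signal handler does nothing but write a byte to a pipe, and the real work happens back in the ordinary event loop. A sketch (details such as making the write end non-blocking are omitted):

    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/select.h>
    #include <unistd.h>

    static int sig_pipe[2];   /* [0] watched by select, [1] written by the handler */

    /* The handler only does an async-signal-safe write; no logic lives here. */
    static void on_sigint(int signo) {
        char b = (char)signo;
        write(sig_pipe[1], &b, 1);
    }

    int main(void) {
        pipe(sig_pipe);

        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = on_sigint;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGINT, &sa, NULL);

        for (;;) {
            fd_set rd;
            FD_ZERO(&rd);
            FD_SET(sig_pipe[0], &rd);
            FD_SET(STDIN_FILENO, &rd);
            int maxfd = (sig_pipe[0] > STDIN_FILENO ? sig_pipe[0] : STDIN_FILENO) + 1;

            if (select(maxfd, &rd, NULL, NULL, NULL) < 0)
                continue;                      /* EINTR: the signal interrupted select */

            if (FD_ISSET(sig_pipe[0], &rd)) {  /* signal arrived: handle it here,      */
                char b;                        /* synchronously, outside the handler   */
                read(sig_pipe[0], &b, 1);
                printf("got signal %d, shutting down cleanly\n", b);
                break;
            }
            if (FD_ISSET(STDIN_FILENO, &rd)) { /* ordinary work */
                char buf[256];
                read(STDIN_FILENO, buf, sizeof buf);
            }
        }
        return 0;
    }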
Type checking and calling help methods can be useful for debuggability! If you want to figure out what you're looking at in string format, call its .ToString method.
Adding extra complexity just means more to go wrong. Plain text can't really go wrong because anything can use it, anything can edit it, anything can show it. With "objects" I'm betting that Powershell itself never fails.
"Bloat" is a loaded term. It's meaningless to say "Look at all this code! It's bloated!" when you don't know the reason those lines are there, and comparing two "solutions" is meaningless when they don't solve all the same problems.
Mainly, I'm just getting tired of people trying to cut things out of the solution by cutting them out of the problem.
Saying that it's possible to solve a problem in a simpler fashion is fine. Saying a problem shouldn't be solved, that nobody should have that problem, so therefore we won't solve it, is not fine if you then turn around and compare your cut-down solution to the full solution.
Fuchsia is a microkernel architecture, so I think it being "more efficient" generally is not going to be the case. I do think it is valuable to see a microkernel architecture with large scale backing, as it simplifies security and isolation of subprocesses.
"More efficient" in terms of running LINPACK, maybe not. But the raw throughput of highly numeric scientific calculations isn't the goal of all architectures, even though we pretend it is.
It's possible to be more efficient at showing a bit of text and graphics, which is what mobile phones do a lot more of than raw number crunching -- except for games, of course.
LINPACK would probably run equivalently. Anything that just needs the CPU will work about the same. It's overhead like networking/disk/display where microkernels lose out. Not saying that's overall a reason not to use, as the tradeoffs in terms of isolation/simplicity/security are an area very much worth investigating.
For networking and disk IO the monolithic kernel already has to be pretty much completely bypassed if you want high performance; see the netmap/VALE architecture for example.
Not sure about display, though, but don't expect a monolithic kernel to help there somehow either.
Userspace implementations of various protocols usually suffer from various problems, most notoriously that applications can't share an interface (how would you, if both try to write Ethernet frames at the same time?) and lackluster performance in low-throughput scenarios (high throughput != low latency and high packet throughput != high bandwidth throughput).
GPUs don't have much security at all; there is lots of DMA or mapped memory. Though on most modern monolithic kernels a lot of this work is either in modules (AMDGPU on Linux is usually a module, not compiled into the kernel) or even userspace (AMDGPU-Pro in this case). Mesa probably also counts.
Microkernels aren't the ideal kernel design. Monolithic isn't either. I put most of my bets on either modular kernels, if CPUs can get more granular security (the Mill CPU looks promising), or hybrid kernels like NT, where some stuff runs in ring 0 where it's beneficial and the rest in userspace.
Of course they can share an interface; I even pointed out the VALE switch as an example of this [1]. And it is very fast.
The thing is, the isolation and granularity that microkernels happen to have force certain design and implementation choices that benefit both performance and security on modern systems. And monolithic kernels, while they can theoretically be as fast and as secure, actually discourage good designs.
It doesn't look like netmap is actual raw access to the interface like I mentioned.
I also severely doubt that microkernels encourage efficient design. I'll give you secure, but it's not inherent to microkernels either (NT is a microkernel, somewhat, and has had lots of vulns over the years; the difference between microkernels and monolithic or hybrid kernels like NT is that most microkernels don't have enough exposure to even get a sensible comparison going).
IMO microkernels encourage inefficient designs, as everything becomes IPC and all device drivers need to switch rings when they need to do something sensitive (like writing to an IO port), unless the kernel punches holes into ring 0, which definitely doesn't encourage security.
Monolithic kernels don't necessarily encourage security, but they definitely encourage efficiency/performance. A kernel like Linux doesn't have to switch privilege rings to do DMA to the hard disk, and it can perform tasks entirely in one privilege level (especially with Meltdown, switching rings is a performance-sensitive operation unless you punch holes into security).
I don't think monolithic kernels encourage bad design. I think they are what people intuitively do when they write a kernel. Most of them then converge into hybrid or modular designs which offer the advantages of microkernels without the drawbacks.
You are assuming that switching priv ring is a bottleneck, which it isn't. The cost of the switch is constant and is easily amortizable, no matter the amount of stuff you have to process.
The cost of a switch is non-zero. For IPC you need to switch out the process running on the CPU; for syscalls to drivers a microkernel will have to switch into the privileged ring, then out, wait for the driver, then back in and back out, as it switches context.
A monolithic, hybrid or modular kernel can significantly reduce this overhead while still being able to employ the same methods to amortize the cost that exists.
A microkernel is by nature incapable of being more efficient than a monolithic kernel. That is true as long as switching processes or going into priv has a non-zero cost.
The easy escape hatch is to allow a microkernel to run processes in the privileged ring and in the kernel address space, so the kernel doesn't have to switch out any page tables or switch privileges any more than necessary, while retaining the ability to somewhat control and isolate the module (with some page-table trickery you can prevent the module from corrupting memory due to bugs or malware).
The reason a microkernel wouldn't be more efficient is that the OS is irrelevant for the (rather useless) LINPACK benchmark. However, I want a microkernel system and capabilities for HPC. The microkernel-ish system I used in the '80s for physics was pretty fast.
No, it won’t. This is not the user land you’re talking about and in general the idea that multiple, isolated processes can do better on the same CPU, versus a monolithic process that does shared memory concurrency is ... a myth ;-)
For throughput, separate processes on separate cores with loose synchronisation will do better than a monolith. You don't want to share memory, you want to hand it off to different stages of work.
Consider showing a webpage. You have a network stack, a graphics driver, and the threads of the actual browser process itself. It's substantially easier to avoid bottlenecking on one or more locks (for, say, an open file table, or path lookup, etc.) when the parts of the pipeline are more separated than in a monolithic kernel.
> Lock free concurrency is typically via spinning and retrying, suboptimal when you have real contention.
Lock free concurrency is typically done by distributing the contention between multiple memory locations / actors, being wait free for the happy path at least. The simple compare-and-set schemes have limited utility.
Also actual lock implementations at the very least start by spinning and retrying, falling back to a scheme where the threads get put to sleep after a number of failed retries. More advanced schemes that do "optimistic locking" are available, for the cases in which you have no contention, but those have decreased performance in contention scenarios.
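To make the compare-and-set point concrete, here's a minimal C11 sketch of the spin-and-retry pattern being discussed (a toy counter, not a claim about how any particular kernel does it):

    #include <stdatomic.h>

    static atomic_long counter;

    /* Lock-free add: read the current value, then try to swap in the updated one.
       On failure `old` is refreshed with the latest value and we simply retry,
       which is exactly the "spinning and retrying" behaviour under contention. */
    static void counter_add(long delta) {
        long old = atomic_load(&counter);
        while (!atomic_compare_exchange_weak(&counter, &old, old + delta)) {
            /* another thread won the race; retry with the refreshed `old` */
        }
    }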
> Handing off means to stop using it and letting someone else use it. Only copy in rare cases.
You can't just let "someone else use it", because blocks of memory are usually managed by a single process. Transferring control of a block of memory to another process is a recipe for disaster.
Of course there are copy on write schemes, but note that they are managed by the kernel and they don't work in the presence of garbage collectors or more complicated memory pools, in essence the problem being that if you're not in charge of a memory location for its entire lifetime, then you can't optimize the access to it.
In other words, if you want to share data between processes, you have to stream it. And if those processes have to cooperate, then data has to be streamed via pipes.
> High performance applications get the kernel out of the way because it slows things down.
Not because the kernel itself is slow, but because system calls are. System calls are expensive because they lead to context switches, thrashing caches and introducing latency due to blocking on I/O. So the performance of the kernel has nothing to do with it.
You know what else introduces unnecessary context switches? Having multiple processes running in parallel, because in the context of a single process making use of multiple threads you can introduce scheduling schemes (aka cooperative multi-threading) that are optimal for your process.
System calls are not the reason the kernel is bypassed. The cost of the system calls is fixable. For example, it is possible to batch them together into a single system call at the end of the event loop iteration, or even share a ring buffer with the kernel and talk to the kernel the same way high-performance apps talk to the NIC. But the problem is that the kernel itself doesn't have a high-performance architecture, subsystems, drivers, IO stacks, etc., so you can't get far using it and there is no point investing time into it. And it is this way because a monolithic kernel doesn't push developers into designing architectures and subsystems that talk to each other purely asynchronously with batching; instead, crappy shared-memory designs are adopted because they feel easier to monolithic-kernel developers, while in fact being both harder and slower for everyone.
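As a toy illustration of the batching idea, here's a sketch using plain POSIX writev() rather than a shared ring buffer (which is what a real high-performance design would use):

    #include <stddef.h>
    #include <sys/uio.h>

    #define MAX_BATCH 64
    static struct iovec batch[MAX_BATCH];
    static int batched;

    /* Queue a buffer during the event-loop iteration instead of write()-ing it now. */
    static void queue_write(void *buf, size_t len) {
        if (batched == MAX_BATCH) return;      /* toy example: just drop overflow */
        batch[batched].iov_base = buf;
        batch[batched].iov_len  = len;
        batched++;
    }

    /* One kernel crossing for the whole batch, at the end of the iteration. */
    static void flush_batch(int fd) {
        if (batched > 0) {
            writev(fd, batch, batched);
            batched = 0;
        }
    }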
You are mixing things up a little bit. Darwin (the underlying kernel layer of MacOS X and the rest) is actually a hybrid between a microkernel and a regular kernel. There is a microkernel there, but much of the services layered on top of it are implemented as a single kernel, all of it operating within one memory space. So some of the benefits of a pure microkernel are lost, but a whole lot of speed is gained.
So from a security standpoint MacOS X is mostly in the regular kernel camp, not the microkernel one.
According to Wikipedia - the XNU kernel for Darwin, the basis of macOS, iOS, watchOS, and tvOS is not a microkernel.
The project at Carnegie Mellon ran from 1985 to 1994, ending with Mach 3.0, which is a true microkernel. Mach was developed as a replacement for the kernel in the BSD version of Unix, so no new operating system would have to be designed around it. Experimental research on Mach appears to have ended, although Mach and its derivatives exist within a number of commercial operating systems. These include all using the XNU operating system kernel which incorporates an earlier, non-microkernel, Mach as a major component. The Mach virtual memory management system was also adopted in 4.4BSD by the BSD developers at CSRG,[2] and appears in modern BSD-derived Unix systems, such as FreeBSD.
This was, more or less, the driving philosophy behind BeOS. Therein lie some lessons for the prospective OS developer to consider.
Say what you will about how terrible POSIX is, Be recognized the tremendous value in supporting POSIX: being able to support the mounds and mounds of software written for POSIX. It chose to keep enough POSIX compatibility to make it possible to port many common UNIX shells and utilities over, while Be developers could focus on more interesting things.
So where were the problems?
One huge problem was drivers, particularly once BeOS broke into the Intel PC space. Its driver interface and model was pretty darn slick, but it was different, and vendors wouldn't support it (take the problems of a Linux developer getting a spec or reference implementation from a vendor and multiply). This cost Be, and its developer and user community, quite a bit of blood, sweat, and tears.
Another big problem was networking. Initially, socket FDs were not the same thing as filesystem FDs, which had a huge impact on the difficulty of porting networked software over to BeOS. Eventually, BeOS fixed this problem, as the lack of compatibility was causing major headaches.
The lesson: if you are looking to make an OS that will grow quickly, where you will not be forced to reinvent the wheel over and over and over again, compatibility is no small consideration.
Which points to what ended up happening with BeOS: it became an internet appliance OS, and then the core of a mobile product. These were areas where the hardware and application spaces were quite constrained, and BeOS's competitive advantage in size and performance could be leveraged.
Bloat is what you get when your software has to face the real world.
Yes the Linux kernel is bloated, bloated with tons of code responsible for making it work on exotic hardware. Yes x86 is bloated, let's just remove those useless AVX instructions. Yes MS Excel is bloated, who the hell is using all those features [1]?!
You only have two alternatives: either your software is “bloated”, or it will be replaced by something else that is more “bloated” and works for more people.
Note that I'm only criticizing the “bloat” argument; I'm not criticizing Google for creating a new OS from scratch, which can bring a lot to the table if done properly and includes innovations from the past 30 years, like was done when creating Rust, for instance.
I honestly don't get what your 3rd reference is complaining about. That software has... more features and is faster (with the tradeoff being code size)?
LOC is not a good way to measure the "bloat" of a piece of software. There is a significant amount of device driver code in the Linux kernel. With the number of supported devices constantly increasing, that is inevitable, but it does not make the kernel more complex.
A truck is not more complex than a car. A truck is bigger because it is designed to carry more load.
That article by Columbia claims that dbus is not POSIX, yet the communication occurs over a UNIX domain socket. I do not think that is a good example of not using POSIX for IPC. The underlying mechanism that makes it work is part of POSIX. It just extends what is available to provide IPC from user space in a way that people find more convenient.
A lot of people speculate that that's why Flutter was written in Dart instead of Kotlin or something else. Google wanted to use a language they already have a lot of investment in, and for some reason didn't pick Go. Which honestly seems odd to me, since Go can already be compiled all the way down to native binaries, while they had to invent that compiler for Dart, but whatever. Dart is super cool and I'm looking forward to using it in Flutter's user space.
The kernel was started before Rust 1.0, so I think that was a reasonable decision. Additionally, since it's a microkernel, the kernel is pretty small. That helps, both in the implementation now and if they ever decided to replace it. And future components can be in Rust if they want.
Bonus though: I'm pretty sure the Fuchsia user space has Rust support already. I think their alternative to vim (xi maybe? I think it was) is written in Rust natively with Python API bindings for plugin support.
Yes, xi (https://github.com/google/xi-editor) is a text editor written in Rust. It's not exactly an alternative to vim, it's more of a text editor microkernel that can have various different frontends and plugins attached which use JSON-based IPC to allow their implementation in whatever language you want. So, on macOS you can implement the frontend in Swift, and write plugins in Python or Ruby or Rust or whatever you like. On Fuchsia, the frontend is in Flutter.
Rust specifically because it has the zero-overhead safety properties via the borrow checker. This is something that no other safe language has, as far as I know. They generally either make you deal with a GC if you want safety, or deal with raw pointers and manual memory management if you want low overhead.
And the borrow checker, along with move semantics by default and the Send and Sync traits, helps with several other aspects of safety as well: the Send and Sync traits encode which types can safely be moved or shared between threads, and move semantics by default (checked by the compiler, rather than at runtime as in C++) make it easier to encode state transitions on objects so that you can't attempt operations on an invalid state.
But as others point out, Zircon, the Fuchsia kernel, was written before Rust 1.0 was released, and even after Rust 1.0 was released and stable it was still a bit rough to work with for a little while.
If you were starting a new project from scratch today, I'd seriously ask why not Rust, though of course there are other reasons why you might not choose it. But given the history, and how new Rust was at the time when this project was started, it makes a lot of sense.
>If you were starting a new project from scratch today, I'd seriously ask why not Rust, though of course there are other reasons why you might not choose it. But given the history, and how new Rust was at the time when this project was started, it makes a lot of sense.
While I'm not saying that Rust is bad, Rust as a kernel language won't help you here. The simple proof is that Redox OS (which is a microkernel built in Rust) was found to have the same classes of bugs your average OS has.
> The simple proof is that Redox-os (which is a microkernel built in Rust) was found with the same class of bugs your average OS had.
Source? I haven't heard about these vulnerabilities.
In a kernel, there are several places where Rust won't help you. There are certain things you have to do that are going to need unsafe code, and need to be audited just as much as code written in any other language. The kernel is also a place ripe for logical vulnerabilities to creep in; it is the main arbiter of what is allowed and is not allowed from processes, so many possible vulnerabilities are logical, not memory safety issues.
On the other hand, when a kernel gets complex enough, there are places where memory safety or thread safety issues can come up even in code which doesn't require particular unsafe features and which is not involved in a security role; just things like manipulating complex data structures in multiple threads. This kind of code is the kind of code in which Rust can help catch issues; and it definitely still happens even when very good programmers, with good code review, work on the kernel.
Rust is not a panacea; it is not a cure-all. But it reduces the attack surface considerably, by containing the amount of code that is exposed to certain very common classes of bugs.
Despite the progress made over the last 24 months, I don't think Rust is ready for embedded primetime. Not to mention, the Rust learning curve is VERY steep for these guys coming from C.
Edit: to clarify ... I don't mean that they won't be able to understand the concepts ... I mean that they will lose a lot of productivity up front and won't appreciate the backend gains enough for there to be a political consensus to switch.
Why? The "Better C" subset of D is not memory safe, which is one of the main reasons for suggestion Rust over C++ as an implementation language. What does D offer you over Rust in this case? Better metaprogramming facilities? Is that something you really want to rely on significantly in a microkernel?
It’s almost as if humans have a habit of making big things and moving on after enough of the voices behind the big thing die and we all forget the original reasons
Fuchsia will become another monolith where the assumptions no longer matter given the tech of 2035 or whatever, and something new will happen
IMO the better model would be a simple hardware BIOS type thing with a standardized build system embedded that enables completely custom environment builds from whatever source repo people want
These big monoliths seemed more valuable in a pre-internet is everywhere world, so they include the kitchen sink too
Fuchsia is not a monolith in a design sense, though -- it's a microkernel, and the device interface is based on IPC, which means you can keep the kernel clean and simple, and put drivers elsewhere. For comparison, the majority of the Linux source tree is drivers and file systems. The kernel proper is pretty small. But because it's a monolith, it all has to live and evolve together.
I’m talking about the commentary: a dated cultural artifact sucking, the championing of the new hotness, a new culture that will develop around it and eventually fade as technology changes, and the peddlers of the latest and greatest largely saying the same “wtf @ this mess”
But sure, downvote the reality of human progress that has occurred over and over
The opinionated nature of humans is bizarre, given that we keep, in a social way, repeating our dead relatives
We can fix the future from a past that won’t live to see it
It’s all nonsense for the majority of humans, but this little microcosm called HN has its finger on the pulse of what we’ll need in 2040!
I see a lot of negative comments about this project here. Let me just say this: it doesn't need to be a POSIX-compliant system, doesn't need to be user friendly, or even provide something different from what can already be done with Linux or the other OSes we have today.
Google spends a lot of money on research. One thing about research is that a lot of stuff you do ends up completely useless in the short term even if you cover all your bases initially. Even if this project fails, I hope something good can be learned from why it failed; maybe someone in the future can learn from those mistakes and try again.
I'm certainly no fan of Google nor of the way they make money. But I am very happy they use that money for stuff like this.
Hacker News commenters are not the ones making baseless claims about how their product is better than the current market dominator. Hitchens Razor is working just fine in these comments.
It reminds me of NFL fans. No one talks trash about the crappy quarterbacks on opposing teams. Everyone talks trash about Tom Brady, Peyton Manning, Cam Newton, etc.
This is also a general negotiating tactic. Start with a bold initial offer, and then negotiate back to what you wanted in the first place.
> Hacker News commenters are not the ones making baseless claims about how their product is better than the current market dominator.
Where did anyone say that Fuchsia was better than Linux?
The title of this document, "Fuchsia is not Linux", is a play on the "GNU's not Unix" backronym for the GNU project, as well as being a way of pointing out that unlike Android, Fuchsia is an entirely different kernel.
I mean, obviously they would write it because they thought it would be better for certain applications than Linux, or better for trying out new ideas, or the like, but I don't see anything claiming that Fuchsia is better.
In fact, Fuchsia has been done as a pretty low-key project for a while, slowly opening up parts but without much fanfare, just repositories being available, and slowly posting more documentation like the link in the OP. I don't really see very much marketing about it, just low key releases of code and more technical information to give people a taste and show the direction they're trying to go in.
I'm enthusiastic about Fuchsia; I really think there is a lot to gain by breaking with the old conventions, especially when you look at what is hindering true realtime computation approaches.
As a nice byproduct Google has a hedge against Linus dying and his replacement being incompetent at managing the community.
Right now it kind of reminds me of BeOS, which could do absolutely incredible concurrent realtime low latency media processing but was absolute torture to get a proper Web Browser working.
The problem with legacy support is that it drags in the braindamage you were trying to avoid by rewriting the OS in the first place. But without legacy support it's almost impossible to grow beyond the toy OS stage. It's the whole "So, what do I do with it?" factor.
From comments by people who worked on it (e.g., https://twitter.com/MCSpaceCadet/status/968666523425386497 and https://twitter.com/slava_oks/status/958908471801294850 ), Microsoft seems to have reached this stage with their Midori project. This was a ground-up OS project based on the usual suspects from the research world, i.e. object-capability security, microkernel architecture, a new lightweight process model, memory-safe systems language, zero-copy IO, etc.; the project lasted 9 years and occupied over 100 senior engineers at its peak. They tried various strategies (run it on top of Windows, run it on top of Linux, run Windows on top of it, run it on Hyper-V, etc.) before eventually giving up.
Google has the advantage of having two consumer platforms (Android and ChromeOS) running on Linux, but with the majority of application code written on top of additional abstractions over Linux, so a lot of it would still run if the appropriate runtimes were ported over to Fuchsia.
Google is also developing Fuchsia openly, if quietly, giving developers a preview of where they are going. Microsoft developed Midori mostly in secret, and while we've heard a little bit about what they did in that project from blog posts and Twitter threads, nothing official has ever been released publicly.
And on the server side, virtualization and containerization are huge and growing, as many people find the benefits in flexibility, repeatability of deployment, and incremental billing to outweigh the absolute overhead over bare-metal performance. Fuchsia seems to support virtualization as a core part of the API, allowing you to run Linux VMs when you need to, and it could be possible to port containerized applications piecemeal to Fuchsia with some components still running on Linux on co-located or even the same hardware.
I feel like this is one of the benefits of the relatively simpler POSIX-style API design, and the extensive use of sockets for IPC, over the fairly complex, heavyweight Win32 API and NT kernel. I feel like a lot more software that runs on Linux (and is generally portable to several other POSIX-ish OSes) would be a lot easier to port to Fuchsia, and in an incremental way, than all of the software that runs on Windows systems would be to port to Midori.
Shouldn't they have launched it as an alternative to Windows, for those users who want the benefits (reliability, security, whatever) rather than abandoning it? Make it free ($0) and open-source to encourage adoption. And why couldn't it run Windows in a VM for backward compatibility?
I personally agree, but the counterpoint is that this would have taken Microsoft's already confusing and uncertain product and platform roadmap and made it even worse. I think those worries were absolutely founded, even though I think it would've been worth it anyway.
That's a valid concern, but one caused by selling too many editions of the same product ( https://medium.com/@karti/office-365-was-too-hard-to-buy-so-... ) with artificial differentiation, not completely different products with different intrinsic tradeoffs.
I had a roommate in college who had a BeBox with the twin PPC 603(?) processors and a row of LEDs on the front configured to show the current load of each CPU.
It was super cool and could make these absolutely insane multimedia demos, but he was forever trying to get software to work on it. Whatever POSIX compatibility it had was absolutely insufficient for modern (at the time) applications. Everything required some rewriting by hand, and there were definitely crashes. Worse, Netscape didn't release a BeOS version of Navigator so he was always hacking up the latest Mosaic release to try to get it working. I was running FreeBSD at the time and it was the polar opposite. Sound barely worked, the only video players were slow and unreliable open sourced school projects, but it was front and center on the newfangled Internet thing that was going around at the time.
>BeOS, which could . . . but was absolute torture to get a proper Web Browser working.
Ah, but that is exactly why I'm paying attention to Fuchsia even though in general I ignore new OSes: it would not be hard for the organization behind Fuchsia to instruct the 1000-or-so developers it employs who maintain a web browser to make that web browser run on Fuchsia.
Is Google actually good at community collaboration? I honestly don't know. But I would imagine that they require a CLA, and that would hinder the community. It may be a big problem when a BDFL dies with no blessed successor, but which is the greater trade-off?
Google did a great job starting the Kubernetes community, but they then contributed the project to CNCF (which Google also helped launch). Google remains the top contributor to Kubernetes, but even as their contributions remain substantial, their % of the total contributions is going down, due to the large number of other organizations now backing Kubernetes.
I think CLA is pretty standard for any open source project. I think the first one I ever signed was for Emacs. Emacs has done pretty well and the CLA doesn't seem to be putting many people off from contributing.
I imagine that it makes any potential litigation or dispute much simpler. Even if the license is FOSS, any sort of litigation could have hundreds or thousands of parties involved.
I find it interesting to note that the core Fuchsia OS comes with "magma", "escher" and "scenic", which seem to be core OS services for composing one 3D scene across multiple processes ("shadows can be cast on another process without it knowing about it")
Is that a hint that Fuchsia is a VR-first operating system?
I don't think what Windows provides is really comparable. By my understanding (and the documentation you linked seems to back this up), WDM is responsible for taking 2D framebuffers from applications and compositing them into a (possibly 3D) scene. Fuchsia's Scenic uses a 3D scene graph as its input.
Scenic will have support for what it’s calling “stereo cameras.” What this means is there can be two views into the same scene. Each camera can also be independently moved or turned for a different view. The most obvious reason for this capability is virtual reality.
From the sound of it, I assumed it was building up a 3d world scene graph composed by objects supplied by independent processes. Maybe I'm reading too much into it.
Yeah I think the "physically based 3D renderer" is a bit misleading because it makes you think "ah, like Blender or Unreal Engine 4", but actually I think they just mean they are doing window translucency and drop shadows in a physically correct way (e.g. there is an actual light source) rather than faking it. It is still used to render mostly-2D windows.
(This is my assumption anyway - I haven't tried running it)
I always figured this was to support Material Design, where the size of a drop shadow depends on the difference in elevation between the surface casting the shadow and the surface underneath.[1] A Material Design compositor would have to know the position of each process' surfaces in 3D space in order to render shadows correctly.
I want to downvote you for saying "VR-first OS", which is technobabble, but I also want to upvote you for sharing useful info about the topic. So I did neither.
Augmented reality pretty much requires us to shrink displays down to contact lens sizes because we already figured out almost nobody will pay to strap a smartphone to their face.
Then it requires us to create and popularize an entire new method of interacting with your computer.
Then this needs to be cheap enough to get in everyone's hands.
It would be surprising if this only took 20 years.
This is about drawing dumb drop shadows.
We could argue about the difference between OS and presentation layer but that battle may already be lost.
I have yet to see anything useful come from Fuchsia. There are tons of 'press release' type blogs, but nothing functional. The bundled steps to run Fuchsia inside qemu didn't work (and even shipped their own version of qemu in the scripts!)
I'm assuming the Fuchsia development is 100% about not having to use any GPL software. Look how hobbled the Android and ChromiumOS communities are compared to the Linux world at large.
As soon as someone outside of Google produces anything of any novelty around Fuchsia, I might change my mind, but for now I'm viewing it as a going-nowhere software project that's all hype and will never be 100% Free and Open Source.
> I'm assuming the Fuchsia development is 100% about not having to use any GPL software.
This. Everyone wants to distance themselves from the GPL. The reality is that the ideas the 90s open source movements were founded on are far from what we see today. We don't see OSS end user applications; at least not a lot in mainstream use. Instead, we just see OSS middleware.
In the early 2000s, people thought one day we'd see Gimp be on par with Photoshop and StarOffice/LibreOffice take on Word and Excel. We've come a long way, but those ideas were never realized.
And by "everyone" you mean Google and other corporations that explicitly disallow software with restrictive licenses, because opening up their own software, giving up lock-in, and actually competing is unthinkable to them. So it's not really about distancing themselves from the movement. Individual developers and small companies, on the other hand, can benefit from the most restrictive GPL licenses by selling commercial licenses to those corporations rather than giving their work away to them for free.
>>> the ideas the 90s open source movements were founded on are far from what we see today
Are they ? GPL is about protecting users' freedom. It's still a valid aim to me, probably even more so.
What has changed is that the web is much bigger than the desktop and so the GPL has less ground to grow on, so its effect may have been weakened. But only the effect, not the goal.
(I 100% admit I am more of an idealist than a pragmatist)
By users' freedom you don't mean users who own patents, since they stand to lose their property under GPL restrictions. (Google Implied Patent Grant if that doesn't ring a bell.)
So that rules out most important corporations, etc. Copyleft was a very clever idea, but deciding to go to war against all intellectual property was a step too far. Immense amounts have already been spent replacing GPL software with truly free software under freer, more liberal licenses.
The software companies learned that they could benefit from open source and collaboration on the infrastructural blocks, and they eventually embraced it, which led to an increasing number of open-source projects. But they don't like the GPL, which is a political instrument created by the Free Software activists.
But those activists still exist, and their code is still released under the GPL to avoid being “weaponized” by big corporations. The main difference is that they don't need to write all the tools they need on their own, since private corporations now release open-source code. And because of that, it's easier than ever to run almost 100% free software on your desktop.
These people never _thought_ “StarOffice/LibreOffice will take on Word and Excel”; they _wished_ there would be a political movement to make it happen, like socialists wish for the Revolution to occur. And in the '90s and early 2000s it really looked impossible, because the technical challenge was too big: everything had to be built from scratch to compete with a relentless private industry that kept moving forward; even the _programming languages_ and the build tools were proprietary. RMS wanted to write a free printer driver, and first he had to do the whole GNU stack, with GCC and Emacs!
If you look where we are now, you can see it's way better ;).
ChromiumOS is a Gentoo fork. Speaking of which, I am a Gentoo developer and I like what the ChromiumOS team has accomplished. It serves a practical need. My neighbor uses it.
If Google wanted to avoid GPL software, they could have used a BSD UNIX successor. It certainly would have been far easier to do.
The biggest sin of the Linux API remains ioctl (and its variants). Zircon commits the same mistake with its `object_get_prop` [1] and `object_get_info` [2]. If you aim to be type safe (have different getters for different object types), you can in the long run replace these calls with static in-userland implementations where possible to improve performance (like Linux does for futex and time).
Instead you get this "it does A if you give it B, it does C if you give it D" design, which is pretty bad API design as it NEEDS a void pointer. I'd rather see _a lot_ of simple calls, each with its own number. You have 4 million of them, FFS (if you care about 32-bit compatibility).
It just leaves a bad taste in my mouth. The API design is extremely nice otherwise, and these methods feel like such an afterthought.
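To illustrate the complaint, here's a purely hypothetical sketch (none of these are real Zircon signatures) contrasting the ioctl-style catch-all with per-type getters:

    #include <stdint.h>
    #include <stddef.h>

    typedef uint32_t handle_t;                         /* hypothetical */
    typedef struct { uint64_t koid; } process_info_t;  /* hypothetical */

    /* ioctl-style: one entry point whose behaviour depends on `topic`,
       so the buffer has to be an untyped void pointer. */
    int object_get_info(handle_t h, uint32_t topic, void *buf, size_t buf_len);

    /* Type-safe alternative: one small call per object type and property,
       each with a properly typed out-parameter. */
    int process_get_info(handle_t process, process_info_t *info);
    int vmo_get_size(handle_t vmo, uint64_t *size_out);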
---
To be clear, I really don't care about POSIX compatibility; it's easy to shoehorn in after you have a solid OS. The Windows NT kernel has done it twice now (NT 4.0 and Windows 10).
A lot of people say "capability based" and really mean some very fine-grained access control system. (A confusion encouraged by POSIX "capabilities".) What I hope they mean is the one that solves the confused deputy problem: https://en.wikipedia.org/wiki/Confused_deputy_problem
There are two VERY different meanings of the phrase. The one that I'm hoping for can be thought of like this.
I think the term "capability based" is explained well enough in the talk "Dive into Magenta – fuzzing Google’s new kernel" (https://youtu.be/aYZCiLI-LZM?t=18m9s).
One thing I don't see addressed in the README is why? Why do we need Fuchsia? What problem are we trying to solve? Why should I use/develop for it instead of Windows/Linux/macOS?
Or is this just a research operating system designed to test new ideas out?
The main keywords are "capability based" and "microkernel". Those ideas bring numerous advantages over monolithic kernels (including Linux, Windows, macOS), especially a huge boost to protection against vulnerabilities, along with better reliability and modularity. They are quite well researched already AFAIU, and apparently the time has come for them to start breaking through to the "mainstream" (besides Fuchsia, see e.g. https://genode.org, https://redox-os.org)
Other than that, for Google this would obviously bring total control over the codebase, allowing them to do whatever they want, and super quickly, not needing to convince Linus or anybody else.
And does a microkernel have anything to do with the ultimate capabilities of the machine or is it specifically targeted at embedded / smartphones / hypervisors, and not for running a regular server or desktop OS?
I'm afraid I don't fully understand the question; would you care to try rephrasing? Anyway, as to what I seem to understand:
- "capabilities": ok, I think I get the misunderstanding. The "capabilities" here are completely unrelated to "hardware capabilities" or "machine capabilities", a.k.a "what features does my phone have". The word has a totally different meaning in the technical jargon of OS development. It's a security architecture concept; as a first approximation, I'd say "capabilities" are somewhat akin to "permissions" on Android/iOS. You could maybe call them "permission lease": as an app, if you got some permission ("capability token"), you can choose to sublease/share/extend it to another app you run. See: https://en.wikipedia.org/wiki/Capability-based_security
- microkernels are a general architecture approach; they can perfectly well be used for regular server/desktop OS. It's just that writing a new OS for an embedded system is easier as a "first step", because you can start smaller. A full-blown "general purpose" server/desktop OS is much more complex; enough to say that it must have very wide driver support for shitloads of different hardware existing in the world. (But there are also other challenges, like multiple user management.) Microkernels were long believed to have worse performance than monolithic kernels in popular perception, thus their historical unpopularity. However, this notion was challenged by the L4 kernel (https://en.wikipedia.org/wiki/L4_kernel), which was I believe the reason for the revived interest. L4 was apparently published ~1993, and there was QNX long before that; I'm not actually sure why it hasn't become more popular earlier. Maybe unfamiliarity among programmers? I believe microkernels enforce somewhat stricter development standards than monolithic kernels (where it's probably easier to "just hack around" and duct-tape a new feature), thus probably raising the perceived development costs somewhat (I believe it's similar as if we compared e.g. Rust vs. C/C++).
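A minimal sketch of that "permission lease" idea in Zircon terms (the call name and the rule that the sent handle is consumed are from my reading of the docs, so treat this as illustrative): a process that holds a handle can hand that handle, and therefore the authority it represents, to another process over a channel.

    #include <zircon/syscalls.h>

    /* `channel` connects us to the other process; `resource` is any transferable
       handle we already hold (a VMO, another channel, ...). */
    void delegate(zx_handle_t channel, zx_handle_t resource) {
        /* Sending the handle transfers it: after this call we no longer own it,
           and the receiver can use the resource with exactly the rights we held. */
        zx_channel_write(channel, 0, NULL, 0, &resource, 1);
    }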
edit: ah, a good example could be the L4-based L4Linux (https://en.wikipedia.org/wiki/L4Linux) kernel, which is said to be a more or less "drop-in" replacement for the "classical" monolithic Linux kernel, so you should be able to run any Linux distro on it. Though I personally never tried it (yet).
The GenodeOS folks are also working towards building a usable "general purpose" desktop OS based on a microkernel, see e.g. chronologically:
As far as I'm aware Google have never actually said what they intend Fuchsia to be for. It may never become a product, or it might be the long-fabled-in-the-tech-press replacement/merge for ChromeOS and Android.
It's entirely possible Google don't know either, they're just trying some things out and doing it in the open for whatever reason, and don't want to publicly commit to anything.
It certainly gets them some interested coverage from time to time.
As I can see, they use a microkernel architecture for the kernel[0]. I wonder why they need to create another microkernel OS and not re-use an existing one like MINIX 3 or QNX? What are the advantages of the Zircon Kernel compared to the MINIX 3 or QNX?
Eh, the NT handle table and Unix file table are very nearly the same thing these days.
I'd say it's more like using the per-process FD table for everything, rather than having global tables like for PIDs. Which is really cool, IMO. Containerization is all about adding indirection to the global tables, but if there are no global tables, you get all of that for free.
Linux is still about opening, reading, and writing files; it tries to keep a common interface; it's all about read() and write(). Looking at Fuchsia's system calls, it seems to have a different interface for each type of object.
NT is a weird beast from this point of view because of how much process state (in the sense of "on Unix this is some kernel structure") is in fact implemented purely in userspace.
No, they really are not the same thing, not even nearly. You yourself mention one class of handles for which Unix has no equivalent. (FreeBSD process descriptors have some gotchas that make them significantly different; and Linux procfs descriptors do not match behaviours when it comes to object lifetimes and handle waitability.)
Both Minix and QNX are handle-based microkernels. Probably the only "everything is a file" "microkernel" is Plan 9/Inferno, which is generally not considered a microkernel.
So, given that all the important device drivers in the Linux kernel used in Android are closed, I'd be curious to hear from Google whether their new Fuchsia is going to solve that problem.
This may seem trivial, but closed device drivers make it 100% impossible either to update them to more modern versions once the Android version is declared obsolete, or to natively install a different operating system on the device. This practice, security concerns aside, is responsible for a huge load of old, otherwise perfectly usable devices being scrapped in landfills.
So, dear Google, will you keep the lowest, smallest, but most important layers of the OS open, or will you prevent people from doing what they want with the devices they purchased, even at the cost of contributing to more pollution?
Google does not regard closed drivers as a problem. With Fuchsia, all the important drivers will be closed, full stop. Google has no incentive to care about your landfills. The permissive license of Fuchsia will make its ecosystem more appealing to IVI and mobile OEMs, not less.
E-waste is just one of the problems, probably the only one most users could understand; but security is also a huge one. With apps requiring permission to essentially everything (and users surrendering them blindly) security on any mobile device as of today is a myth. Open Source apps could mitigate the problem by swapping an unreliable layer with a trustworthy one, but we still have closed device drivers which could contain whatever their manufacturer (or its government) wants without any chance of being audited.
If the device drivers are stuffed into user space using stable ABIs, it would be easy to update the OS while leaving the driver alone. That can even be done with kernel drivers. I believe that Windows does it. The reason that cannot be done on Linux is that Linux elected not to have a stable driver ABI.
I'm not familiar with driver development, but could it be possible to rewrite a driver's machine code so that all interactions with the kernel are redirected through a translation layer that emulates the old kernel's ABI using the new kernel?
My understanding is that Fuchsia will have a stable ABI to avoid the driver issues of Android/Linux. That's not an answer to open source drivers of course. That falls on the OEMs.
I also don't approve of the wasteful short lifecycles, however I don't think they are only caused by device drivers being closed source. People want to buy new devices, not only because of the outdated software, but also because the hardware has improved so much in the last years, and because they enjoy buying.
That is absolutely true, especially in "western" richer countries, but don't forget other poorer places where most people don't feel the need to get the latest gimmick to climb the social ladder by impressing friends, but rather to communicate in the cheapest possible way.
How about some discussion of Fuchsia itself, instead of "why reinvent the wheel" or "Linux is bloated"?
From my reading so far, it looks like Fuchsia takes some of the better parts of the POSIX model, like the way file descriptors can be used as capabilities, and extends their usage more consistently over the API so that it appears in a lot more places. In Fuchsia they are handles, which are arbitrary 32-bit integers, but they act a lot like somewhat richer, more consistent file descriptors. You can clone them, possibly with limited rights like read-only, you can send them over IPC channels, etc.
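For example (call names and rights constants as I remember them from the Zircon docs, so treat this as a sketch), duplicating a VMO handle with only read and transfer rights before handing it to a less-trusted process:

    #include <zircon/syscalls.h>

    zx_handle_t make_read_only_copy(zx_handle_t vmo) {
        zx_handle_t read_only = ZX_HANDLE_INVALID;
        /* Same object, fewer rights: the clone can be sent over a channel to a
           process that should only ever be able to read the memory. */
        zx_handle_duplicate(vmo, ZX_RIGHT_READ | ZX_RIGHT_TRANSFER, &read_only);
        return read_only;
    }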
There are some differences, in that there's no absolute path handling directly in the kernel; instead, you always open relative to some particular file handle. A process may be given two handles that can be used to emulate POSIX-style path resolution, a root directory handle and a current working directory handle, though there may not be a guarantee that one contains the other; it sounds like more commonly applications will just be given handles for the files or directories they are supposed to access, rather than having everything go through path resolution.
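The closest POSIX analogue to this "open relative to a handle" model is openat(), which resolves a path against an already-open directory fd rather than a global root (the file name here is made up):

    #include <fcntl.h>

    /* dir_fd stands in for the directory handle a component was granted. */
    int open_config(int dir_fd) {
        /* No absolute path, no global namespace: resolution starts at dir_fd. */
        return openat(dir_fd, "config.json", O_RDONLY);
    }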
Signal handling is done consistently with waiting for data on handles; handles just have various states they can be in, like ready for reading, ready for writing (for file handles), running or stopped (for process handles), etc, and you can wait for changes of state on any of a variety of different handles.
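A sketch of what that waiting looks like (the zx_object_wait_one() signature and signal names here are as I recall them from the docs, so they may not be exact):

    #include <zircon/syscalls.h>

    /* Block until the channel has a message to read or the other end went away;
       the same call works for process handles, event handles, and so on. */
    void wait_for_message(zx_handle_t channel) {
        zx_signals_t observed = 0;
        zx_status_t status = zx_object_wait_one(
            channel,
            ZX_CHANNEL_READABLE | ZX_CHANNEL_PEER_CLOSED,
            ZX_TIME_INFINITE,
            &observed);
        if (status == ZX_OK && (observed & ZX_CHANNEL_READABLE)) {
            /* a message is available; read it with zx_channel_read() */
        }
    }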
Memory mapping can be done by allocating a virtual memory object, which you could either not map (treat it as an anonymous temporary file), write to, and then pass to another process, or you could map it into your process, manipulate it, clone the handle, and pass that to another process. Basically seems like a cleaner design for shared memory handling than POSIX, though something a lot like it can be done in Linux these days with anonymous shared memory and sealing.
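A rough sketch of that flow (the zx_vmo_* signatures are from memory and have changed across Zircon versions, so treat them as illustrative):

    #include <zircon/syscalls.h>

    /* Create a page-sized VMO, fill it, and return the handle; the caller can
       map it locally or transfer it to another process over a channel. */
    zx_handle_t make_shared_buffer(void) {
        zx_handle_t vmo = ZX_HANDLE_INVALID;
        zx_vmo_create(4096, 0, &vmo);

        const char msg[] = "hello from the parent";
        zx_vmo_write(vmo, msg, 0, sizeof(msg));
        return vmo;
    }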
Jobs, processes, and threads are also all handles. Jobs contain processes and other jobs, and processes contain threads. Jobs group together resource limitations (things like limits on numbers of handles, limits on total memory used, bandwidth limits, etc), processes are separate address spaces, and threads are separate threads of execution in one address space. The fact that jobs and processes are all handles, instead of IDs, means that you don't have to worry about all of the weird race conditions of trying to track things by PID when that PID may no longer exist and could be reused in the future.
An interesting part is how program loading happens. In POSIX like OSes, you fork your process, which creates a clone of the process, and then exec, which asks the kernel to replace the running program with one loaded from another file. You give the kernel the path to a file, and the kernel calls the dynamic linker on that path to link the shared libraries together and then execute the result. In Fuchsia, you just build the new address space in the parent process, and then ask the kernel to start a new process in that address space, with execution starting at a particular point in it and some parameters loaded into particular registers. This basically means that the dynamic linker will now be done by a library call in the parent process; which could be really advantageous for those processes that fork the same executable as a subprocess many times, as they can link the executable once into some read only pages, and then very quickly spawn multiple processes from that same already linked program. I'm sure that ld.so and friends on Linux and other POSIX-like OSs have a lot of caching optimizations to make this faster, but it sounds to me that the Fuchsia model of just having the parent process do the linking as a library call could be a lot faster.
(edit to add: hmm, upon further reading, it looks like they expect process creation to happen from a single central system process, rather than providing the dynamic linker API, "launchpad", as a supported API; but for now it looks like you can use the launchpad library)
It basically looks a lot like what you would wish the POSIX API worked like with a lot of hindsight. A lot simpler and more consistent, and does a much better job of "everything is a file" than the POSIX API ever did (of course, it's "everything is a handle," but that's fine, the point is that there's one consistent way to work with everything).
>A lot simpler and more consistent, and does a much better job of "everything is a file" than the POSIX API ever did (of course, it's "everything is a handle," but that's fine, the point is that there's one consistent way to work with everything).
I am failing to see how this is more consistent. With UNIX, because everything is like a file, you operate on them in the same manner. A file, a socket, a pipe, shared memory, ... you open them, then you use the system calls for operating on files: read(), write(), poll(), dup(), ... which then allow you to use operations built on these syscalls such as fprintf, fscanf, ... but also all the tools like cat, head, grep, ... This is what I would call consistency.
If I implement a new feature as a file in Linux, for example a virtual filesystem like /proc/, all the cited operations would already be available out of the box.
But this is how Fuchsia is as well; these handles are pretty much equivalent to file descriptors, except for how they get numbered/allocated (though for C library compatibility, there is a per-process file descriptor table to map between file descriptors and handles).
Even on UNIX-like systems, you can't read or write every file; for instance, you can open a directory, but you can only readdir() on it, not read() from it. But directories are still file descriptors like everything else, so you can call dup(), fstat(), pass them between processes on Unix sockets, etc.
There are plenty of other operations which can only be done on certain types of files in UNIX-like systems; for instance, you can only recv() or recvmsg() on a socket.
The difference is that in Fuchsia, more things have handles, and so more things can be treated consistently. For instance, jobs, processes, and threads all have such handles; so instead of getting a signal that you have to handle in an extremely restrictive environment in a signal handler or having to call wait4() to learn about the status of a child process, you can just wait on signals to be asserted on the child process using zx_object_wait(), which is the equivalent of select() or poll(). This means no more jumping through hoops to get signal handling to work with an event loop; it just works.
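So, instead of a SIGCHLD handler plus wait4(), waiting for a child boils down to the same wait primitive used for everything else; a sketch (the exact signal constant name is my assumption):

    #include <zircon/syscalls.h>

    /* Wait for the child process handle to assert its "terminated" signal; this
       can just as easily be multiplexed with channel readability in one event loop. */
    void wait_for_child(zx_handle_t child_process) {
        zx_signals_t observed = 0;
        zx_object_wait_one(child_process, ZX_TASK_TERMINATED,
                           ZX_TIME_INFINITE, &observed);
    }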
Of course, the other difference in Fuchsia is that there is not a single namespace. Every component in Fuchsia has its own namespace, with just the things it needs access to; there is no "root" namespace. This is good for isolation, both for security reasons and reducing accidental dependencies, though I do wonder how much of a pain it would make debugging and administering a system.
My point was that with UNIX, while you have specialized operations like recvmsg, you still have read() and write() acting as an universal interface.
If you look at Fuchsia's system calls, you would see
vmo_read - read from a vmo
vmo_write - write to a vmo
fifo_read - read data from a fifo
fifo_write - write data to a fifo
socket_read - read data from a socket
socket_write - write data to a socket
channel_read - receive a message from a channel
channel_write - write a message to a channel
log_write - write log entry to log
log_read - read log entries from log
It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures.
Hmm. On UNIX read() and write() are not universal; you can't use them on directories, for instance, nor can you use them on various other things like unconnected UDP sockets.
Treating everything like an undifferentiated sequence of bytes can cause impedance mismatches; each of these types of handles has a very different way that you work with it. For instance, a VMO is just a region in memory. A FIFO is a very small queue for short, equally sized messages between processes. A socket is an undifferentiated stream of bytes. A channel is a datagram-oriented channel with the ability to pass handles. The log is for kernel and low-level device driver logging.
In fact, it looks like the Zircon kernel has no actual knowledge of filesystem files or directories; they are actually channels that talk to the filesystem driver (another userspace process) over a particular protocol.
The thing about having one single universal interface like read() and write() to a lot of fairly different things is that they each actually support different operations; you can't actually cat or echo to a socket (not without piping into nc, which does that for you). Or you can't just echo data into most device files and expect it to work; some of them you can, like block devices, but others you need to manipulate with ioctls to configure properly.
What Fuchsia is doing here is acknowledging the different nature of the different types of IPC mechanisms, and so giving each of them an API that better matches what it represents. A VMO can be randomly read and written to; none of the others can. A FIFO can only accept messages in an integral number of equal-size pieces that are smaller than the FIFO size, which is limited to a maximum of 4096 bytes; it is used for very small signals in conjunction with other mechanisms like VMOs. A socket provides the traditional stream abstraction, like a pipe or SOCK_STREAM on UNIX, in which you can read or write new data but can't seek at all. A channel provides datagram-based messages along with the ability to pass handles.
One of the big things that I think the Unix model makes hard is telling when something is going to block; because read and write assume that the file is one big undifferentiated blob of bytes, it can be hard to tell when it's safe to do so without blocking. On the other hand, each of these is able to have particular guarantees about what you can do when they report that there is space available.
I admit that the log ones seem redundant; I would think they would make more sense as just a particular protocol over channels. I don't see any reason for that one to exist separately.
I wonder why you would think it would be better to have one interface that isn't an exact match for a lot of different IPC types, than separate specific interfaces that match them? They are all tied together by being handles, so you can dup them, send them to other processes, and select on them just the same, but the read and write operations behave quite differently on each so having an API that reflects that seems reasonable.
If you like to think in object oriented terms, think of them as subclasses of handle. If you like to think in terms of traits or interfaces, think of there being one generic handle interface, plus specific operations for each type of handle.
The "every thing is a file, and a file is an undifferentiated bag of bytes" is in some ways a strength of UNIX, but in other ways a weakness. You then have to build protocols and formats on top of that, kernel buffer boundaries don't necessarily match up with the framing of the protocol on top, and so on.
And all it takes to give you the power to manipulate things in the shell is appropriate adapter tools. Just like nc on UNIX allows you to pipe in to something that will send the data out on a socket, you need some adapter programs that can translate from one of these to another (and from filesystem files, since those don't even exist at this abstraction level); of course, in many cases, you're probably going to need some serialization format for things like channel datagram boundaries, and there are some things that just can't be translated from a plain text bag of bytes (like handles).
But you can't actually treat everything like a file, so you have to turn to ioctl and whatever structure is actually related to what you're working with in reality.
In a sense, we only treat everything as a file the same way some languages have a toString method for everything. You can get something out of it no matter what it is, but there's no guarantee that something is going to be in any way useful and you still have to interact with it in a way that doesn't treat it like a generic string.
The issue that GP was pointing out is that in Linux, everything is not a file. And the inconsistencies (in Linux or the POSIX model; take your pick--I'm not interested in splitting those particular hairs) are a hassle to program around. For example:
- Non mmap'd memory: what do you do with memory retrieved by [s]brk(2)?
- Signals: are they files or not? signalfd is provided in addition to lots of other complicated facilities for handling the same things (see the sketch after this list). And that's before getting into realtime signals and the like.
- Timers/events: are they files? vDSOs? Available via files? No? Depends on which syscalls/APIs you're using.
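To make the signals bullet concrete: on Linux you can bolt signals onto the fd world with signalfd(), but only after blocking them first, which is exactly the kind of special-case glue being complained about:

    #include <signal.h>
    #include <sys/signalfd.h>

    /* Returns an fd that becomes readable when SIGCHLD arrives, so the signal
       can be handled in the same poll()/epoll loop as ordinary descriptors. */
    int make_sigchld_fd(void) {
        sigset_t mask;
        sigemptyset(&mask);
        sigaddset(&mask, SIGCHLD);
        sigprocmask(SIG_BLOCK, &mask, NULL);   /* required, or signalfd is useless */
        return signalfd(-1, &mask, SFD_CLOEXEC);
    }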
Additionally, many of the non-file-ish APIs in POSIX implementations are a massive hassle to use across multiple threads. Even the file-based APIs for eventing (select/poll/epoll) have substantially different behavior with regards to edge/level triggering, and different behavior when used across multiple processes (shared via fork or non-CLOEXEC handles) or threads. Can you master them and use them correctly? Sure. But there are so many ways to achieve very similar things, and, while each method has its niche, there's no unifying thought model that ties them together in the way that "everything is a file" was promised to tie together UNIX operating systems.
Oh, and /proc is being moved away from/found to have limitations because it's a file descriptor exhaustion hot-spot, among other things. So the plan9-style of "expose the internals as a filesystem" appears not to be the route that the Linux community is pursuing.
TL;DR if you're writing simple, non-concurrent standalone utilities or working at a very high level then sure, you can pretend everything is a file. Below that, or in more complex applications, those abstractions break down in surmountable-but-annoying ways.
Thank you for writing this it was just what I wanted :)
What is your sense of the latency guarantees/scheduling for userspace graphics and input? This is an area that everything else kind of fails at. "No, drawing the ground plane and every possible frame for this user you have in your VR grasp is not optional."
No idea about the latency guarantees or scheduling. I've mostly just read and summarized some of the overview docs, and there's been nothing written on scheduling or latency that I've found yet.
It looks like the isolation of different applications, and capability based security, is pretty baked in. There do seem to be some TBD parts, like right now just like on Android apps can either have access to all of /data or none of it, and fixing that is something they list that they want to do but haven't yet.
Almost none of what you discuss is novel. As others have pointed out, this was the sort of innovation that people were coming up with in the 1980s, and has long since been expressed in systems like GNU Hurd and Windows NT.
Multiple signallable states per handle, rather than a single bit's worth of "signalled", is slightly novel. It's a fairly obvious generalization of WaitForMultipleObjects once one realizes that there are multiple ways in which one can wait on a handle. It's one that I implemented in a hobby operating system about 10 years ago, and I certainly didn't consider it groundbreaking.
That program loading mechanism isn't novel, contrastingly. Again, I did much the same for my hobby operating system. It's almost an inevitable design given a desire to support an API with "spawn" (as opposed to "fork") semantics. And it has been in Windows NT all of these years, which from the start had to provide underlying mechanisms to support both models as required by its POSIX, OS/2, and Win32 subsystems. One can create a process object as a blank canvas and then something that has the handle can populate it with what are to be its program images; or one can create a process object that is "copy constructed" (as it were) complete with program images from an existing process. Your consequent forking-of-a-spawned-template idea for worker processes is interesting, but consider that an operating system capable of doing that has existed for a quarter of a century now, and people haven't really made use of it in the real world. Not even the Cygnus people. (-:
More interesting from the operating systems design perspective are things that you've overlooked. One of the things that also happened in the 1980s was a reinvention/replacement of the concept of a POSIX terminal. OS/2 got some VIO, MOU, and KBD subsystems, and the applications-software level concepts of directly-addressable video buffers and queues of mouse/keyboard input events that encompassed all of the keys on a keyboard including function/editing keys. Windows NT took that further with its "console" model, unifying mouse, keyboard, and other input events into a single structure and unifying input and output (albeit with some kludges under the covers) into the waitable handles model. GNU Hurd contrastingly retains the POSIX terminal model, albeit that it is all implemented outwith the kernel, without even a pseudo-terminal mechanism or a line discipline within the kernel, and the console dæmons do have cell arrays that could in principle be accessed by applications softwares.
It's worth considering what design choices Google et al. have made in this area.
Then there are other lessons to learn from Windows NT. WaitForSingle/MultipleObjects having more than 1 bit's worth of "signalled" is one improvement that hindsight yields, as I have mentioned. Another is the lesson of Get/SetStdHandle. The Win32 API only supported three handles. And so language implementations that wanted to provide POSIX semantics in their runtime libraries had to implement the same sort of bodges with "invisible" environment variables that they did to make it appear as though there was more than a single current directory. They implemented their own extensible Get/SetStdHandle mechanism in effect, which only worked for coöperating runtime libraries, when it would have been far better for this to be provided by Win32.
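To illustrate that limitation (my sketch, not from the comment above): Win32 exposes exactly three well-known standard-handle slots, so anything beyond stdin/stdout/stderr has to be smuggled through some private convention such as inherited environment variables:

    #include <windows.h>

    // The Win32 standard-handle table has exactly three slots; there is no
    // general, language-neutral descriptor-to-handle mapping for a fourth
    // inherited stream.
    void show_std_handles(void) {
        HANDLE in  = GetStdHandle(STD_INPUT_HANDLE);
        HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
        HANDLE err = GetStdHandle(STD_ERROR_HANDLE);
        // A runtime that wants "fd 3" must invent its own scheme, e.g. an
        // environment variable naming the handle value, and hope the child's
        // runtime library understands the same convention.
        (void)in; (void)out; (void)err;
    }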
Again, it's worth looking to see whether Google et al. have learned from this and provided a common language-neutral descriptor-to-handle mapping mechanism, and a means for that table to be inherited by child processes.
There is no documentation at all, but there are a few application examples in /app/. The bindings for fuchsia and zircon (kernel) are in /public/dart-pkg/
It seems reasonable to want to build a modern OS from scratch; Linux has survived and adapted remarkably well, but it has many aspects that are rooted in the past and just can't be shaken off. The world has changed a lot. I just hope Fuchsia remains truly open-source and doesn't become a power-play by Google.
It has been an interesting read of the many different points of view in support of C or of C++. But there is an elephant in the room here.
The problem with most languages, including the ubiquitous C, C++, Java et al, is that there are implementation defined behaviours and undefined behaviours that are specifically placed in these languages.
A previous discussion, which I can't locate at the moment, did discuss this in detail. Most programmers have a serious flaw in that they do not document. They may produce documents but they do not document. Every assumption, every trick and why it is used, every implementation defined behaviour, every reasoning as to the use of specific algorithms should be documented and is not.
I have seen incredibly detailed documents for programs that just miss some of the basic essential assumptions because "everyone knows them".
In everyday communications, we use language in a dynamic way, meanings can be changed subtly and we get around the errors. With the programming of machines, there is no such leeway ever. Our languages should be defined completely so that we will know that what we have written has actual meaning.
The reality, of course, is that this is a "pipe-dream" and won't happen. But as programmers, we could start calling for such completeness of definition of the languages we use.
People here seem to be amazed that this project is in C++, rather than a simpler language (C) or a more modern language (Rust). But you must notice that this is a Google project, and Google writes many many projects, internal and external in C++. It almost never writes in plain C, and has no penchant for fancy new programming languages. You may disagree, but Google doesn't care.
Well, "GNU is Not Unix" and that worked out OK. On the page it says it is "POSIX lite", so it likely be recognized a rather Unix-like, and and a number of things may likely end being able to be compiled on it with few modifications due to the POSIX-like environment. The `brew` project on macOS would be a related example.
Microkernel, huh? At least now we will have a chance to get some empirical evidence to resolve the famous Torvalds - Tanenbaum debacle. It's going to be interesting to see how Fuchsia pans out and in what kind of environments it can be used.
You have to be kidding, right? Microkernels have been used in production for decades in situations where safety, accuracy, and robustness are key. QNX is a microkernel used in real-time systems to great effect and with great stability. Unlike Linux, it's the sort of kernel you could really trust to run important infrastructure, like automobiles, unmanned aircraft, high speed trains, and robotic surgery.
Just because you don't consciously interact with them on a daily basis, does not mean they do not exist. My guess is that QNX and kernels like it are the reason you can take for granted such obvious things as your car braking correctly, your train not derailing, and your robot surgeon not crashing.
In today's world, we're so used to software breaking that if it doesn't break, we oftentimes think it must not exist. But there is a whole world of reliable software out there that can actually be trusted. You don't hear about it often, because it works so well.
Well, one of the things I hope we get to see with projects like Fuchsia and Redox is what the performance difference is against a monolithic kernel like Linux, if/when they have been well optimized.
ART is a thing since Android 5.0 and it has become quite good.
Even the parts written in straight C and C++ have performance woes on Android, as everyone who has wanted to do real-time audio on Android painfully knows.
Speaking of performance issues and architecture messes, have you ever tried real time audio on Windows Phone? The absence of real time audio apps speaks volumes.
I was missing your blind Google advocacy and Microsoft hate.
Did they fire you?
If you had any Android developer experience, you would surely know that Google had a few failed attempts at real time audio, needed help from Samsung to implement them, and the final API was C only, with devs asking for a C++ one, which was later dumped on github as a side project and isn't part of the official NDK APIs, but alas you don't.
>I was missing your blind Google advocacy and Microsoft hate. Did they fire you?
I'm just trying to correct all of the misinformation you like to post about Google and Android.
>If you had any Android developer experience, you would surely know that Google had a few failed attempts at real time audio, needed help from Samsung to implement them.
Unfortunately, your lack of Android development experience and your lack of exposure to development on Samsung devices has caused you to be disingenuous once again. Samsung didn't help Google nor did they contribute any of their code to AOSP. They implemented their own proprietary audio solution called SAPA. Unfortunately, it was limited to their platform and the audio latency wasn't very good in comparison to the iPhone.
>and the final API was C only with devs asking for a C++ one, which was later dumped on github as a side project and isn't part of the official NDK APIs, but alas you don't.
AAudio is indeed coded in C, but at least you can use the Oboe C++ wrapper. What were the low latency audio solutions for Windows phone again? Oh that's right, there weren't any. No wonder there were no low latency audio apps on that platform.
>If you had any Android developer experience, you would surely know that Google had a few failed attempts at real time audio, needed help from Samsung to implement them, and the final API was C only, with devs asking for a C++ one, which was later dumped on github as a side project and isn't part of the official NDK APIs, but alas you don't.
Would you mind explaining what happened, in full? I am very curious about it.
The announced C++ wrapper ended up being an external project on github, detached from the NDK, that you need to integrate yourself, without any guarantee of how long Google will care to maintain it.
Oh, to make things even better, it is already clear from Android P draft documentation that new audio APIs are on their way.
Not kidding, but granted, it would have been fair to say that we can now get more evidence. I knew about QNX; Minix, Symbian, and GNU Hurd are also microkernels.
It's just that despite all the good arguments for 30+ years, microkernels somehow seem to have failed to reach their full potential. That's why it's going to be interesting to see how Fuchsia pans out, being written from scratch with all the accumulated knowledge and experience. Not to mention Google's resources and talent pool.
Like most monolithic kernels, most microkernels have failed.
Ultimately, kernel popularity is largely going to be a network effect. Linux is popular on servers because it is both free and popular; not really for any inherent technical merit. Windows is popular because no one loses their job for choosing Microsoft.
On the other hand, there are several microkernel operating systems with notable features and significant, dedicated fan bases. For example, AmigaOS is well-known for its responsiveness, and despite not being in production for decades, still has a dedicated fan base. QNX is used in a lot of critical, performance-oriented workflows. OS X is a partial microkernel as well.
Nevertheless, the only microkernel that has really failed is GNU/Hurd. Minix fulfilled what it set out to be (educational) and Symbian was successful enough on mobile.
Good points. Nevertheless, it's still puzzling why microkernels have done so poorly, despite them being superior in theory. Almost all the ones mentioned here have waned into a curiosity or niche, and maybe QNX is also starting to get a bit long in the tooth, having been released in 1982. OS X of course is another story, but also based on 1980's ideas with the Mach kernel. Which is why it's interesting to see another open-source contender appear from a clean slate.
If I remember right, Linus's main argument against Tanenbaum was that the overall state gets so dispersed between the various services that make up a full-fledged OS, that eventually a microkernel-based system just gets unpredictable or starts performing poorly. I think the discussion was inconclusive, although for me Linus's argument sounded intuitively convincing.
If the argument here is that microkernels have failed because they are incredibly successful in two very important niches, then I feel like you are applying a double standard.
Ultimately, Linux also is not all that popular except in two niches: servers and cell phones.
Windows is also not popular except in a niche: desktop computing.
Ultimately, you have arbitrarily declared several very prevalent forms of computing to be 'niche', while declaring several equally prevalent forms to be 'popular'. One of my hypotheses above is that we tend to dismiss reliable forms of computing as 'not really computing', because we expect computers to fail. When computers work, we think of them as mechanical machines or devices (the way we might view eyeglasses for example). I feel like your subjective opinion of what is and isn't niche just further demonstrates this point.
My guess is that there are more instances of QNX (and systems like it) than we realize. There could easily be as many as there are of Windows or OS X (almost certainly the latter, I feel).
That is not my argument, because I do not think microkernels have failed. Sorry if it came across that way. I do not have a strong opinion one way or the other, just curious whether there is something to what Linus said and more than meets the eye. It still seems to me that microkernels have not fared quite as well as could have been expected thirty years ago, when the ideas first crystallized.
You are totally right that microkernels have been very successful in many different areas. I am afraid to even mention that, to me, it seems these are most often pretty specialized areas with targeted applications and hardware choices. Which is why I'll welcome Fuchsia, especially if it is meant to be a general purpose, open source OS for a wide area of applications running on commodity consumer-grade hardware. That's all.
Have they done so poorly? There aren't that many mainstream OSes, period, and there's not a lot of a space for a new OS. In embedded computing, various commercial sectors such as aerospace, and the military, there are highly successful microkernels most people don't know about -- QNX is the most well known example, but there are also others such as Integrity, and L4 and its many variants are heavily used in embedded settings. L4 runs on tons of small devices, including the iPhone, and is probably the most successful microkernel in existence. Not "niche" by any stretch.
> If I remember right, Linus's main argument against Tanenbaum was that the overall state gets so dispersed between the various services that make up a full-fledged OS, that eventually a microkernel-based system just gets unpredictable or starts performing poorly.
Well, large parts of the Linux world ended up with systemd, and I haven't yet met a person who had NOT had his fair share of issues with it.
Look up the Blackberry Playbook vs iPad comparison they did a while back. Playbook was QNX-based. They were browsing with some games running IIRC. There wasn't any lag on the switches. Being a RTOS, it stayed responsive.
The BeOS demo toward the end is similar where they fire up a load of 3D apps, videos, etc with the system gradually slowing but still handling the load. That was on mid-90's hardware.
EROS was a capability-secure microkernel with persistence for saving and restoring the running state of the system. One thing they liked doing at demos was unplugging it during file transfers. After it restarted, the system would pick up where it left off.
On the security side, there have been a few security and separation kernels that have never had a reported breach in the field. They passed 2-5 years of pentesting by evaluators, too, doing many times better than monolithic kernels in defects found. It's also easier to mitigate covert channels or do formal verification since they're small enough to analyze to begin with.
All the reporting I've seen on it said it had a microkernel. I've included both the first thing from Google and my UNIX Alternatives List that has an in-depth link on BeOS.
It also had the kind of compile-time slowdowns one expects with a lot of context switches on older CPU's. So, what's your source that there's not a microkernel at the center?
I appreciate you replying. Since it's you saying it, I'll stop saying it's a microkernel for now. I am curious why people said it so much. Maybe you could help there, if there was even a sensible reason for the misinformation.
Did it have nothing of the sort? A microkernel-like component at the center of a monolith, like NT? A microkernel architecture with key stuff in kernel mode for performance, like hybrids do? Did you see anything fitting the term in any BeOS release? And how would you classify its architecture?
I'd classify the BeOS kernel as a modular-ish monolithic kernel. I'm a bit fuzzy on some of the more precise details, like the network stack (which they rewrote 3 or 4 times...) but I know that all drivers, at least, ran in ring 0.
Haiku's kernel is also monolithic, although we have a few hybrid tendencies (FUSE, but it's not widely used) and even more modularity than BeOS had, but we are still mostly source-compatible with the BeOS kernel.
I don't know why people say it's a microkernel; people say the same of Haiku, but it's just not the case. Perhaps it's because the dynamic kernel module system is built-in and virtually all kernel modules are stored as shared objects and loaded on startup, vs. Linux's more static approach, at least historically? That's not at all what's meant by the term "monolithic", but popular perception is often wrong...
Alright, that makes sense. I even suspected key drivers were in kernel mode given the performance on its kind of hardware. They might be saying it just because they don't know what a microkernel is. It might also be some new usage based on smaller, more-efficient kernels like those used in containers. Who knows.
For what it is worth, there are more microprocessors running microkernels than are running monolithic kernels. The Intel ME is running Minix. Your cell phone's baseband is likely running L4. QNX is used in cars. The Xen microkernel is used on AWS. VMWare ESXi is also a microkernel. The list goes on.
Linux is in places that have higher mindshare than the places that microkernels occupy. When was the last time you thought “I wonder how the hypervisor powering AWS is designed” or “I wonder how my cellphone’s baseband is designed”. The latter being a question that I wish people would ask.
> Microkernel, huh? At least now we will have a chance to get some empirical evidence to resolve the famous Torvalds - Tanenbaum debacle.
They're generally used for different purposes, but if you mean number of deployed systems, then microkernels have won hands down. Firmware in nearly every device runs a microkernel like L4. Hypervisors like VMWare and Xen are also microkernels.
Microkernels are everywhere, it's only the desktop that's still dominated by monolithic kernels (Mac OS X's Xnu/Mach is not a microkernel).
I mean, Xen is literally descended from exokernels. The whole point of an exokernel is to securely multiplex hardware with as few abstractions as possible (which makes for a pretty sweet hypervisor, it turns out). Its original paravirtualized MMU is straight out of XOK and AEGIS.
VMWare calls their system a microkernel (and so Wikipedia's rules mean the article repeats that without further research), but the stack trace on the ESX wiki page pretty clearly shows a network card driver being called inside kernel context, so they have to be playing fast and loose with the term. From the outside it looks like earlier VxWorks: it's somehow a microkernel despite all of the code living in the same address space with no real permission differences, because it has multiple kernel threads and loadable modules. By that definition Linux is a microkernel too.
> I mean, Xen is literally descended from exokernels. The whole point of an exokernel is to securely multiplex hardware with as few abstractions as possible
Xen has far too much in-kernel code to be considered an exokernel. Linux on L4 had even less overhead than paravirtualized Xen. If exokernels truly efficiently multiplex the hardware in the most minimal manner possible, this wouldn't be possible.
I can't comment on VMware beyond published documents since I'm not as familiar with it.
VMWare ESXi relies on a guest to provide drivers. Seeing a network stack in kernel context makes sense given that the process handling the driver is a VM. The VM used to be Linux. I recall them being sued because someone thought that violated the GPL.
They wrote a compatibility layer and run modified Linux drivers inside their hypervisor. That's separate from the Linux kernel that they use to boot and subsequently manage the hypervisor.
Look at OKL4 or the NOVA microhypervisor for what a microkernel version of Xen would look like. Xen is relatively large, with a pile of monolithic code in dom0. Microkernel vendors, esp. OK Labs, were poking at its huge TCB for a while.
Thanks for reminding me about NOVA! I had totally forgotten about it. Doesn't seem to have had much activity since 2010, though, unfortunately. I wonder why Google doesn't just build on L4 or something like NOVA instead of starting from scratch. These systems are low-level and general enough to do what they want, possibly better than they could achieve themselves. L4-sec has had 2 decades of research put into it, 3 decades if you count its predecessor L3.
Of course, the number of hardware- and firmware-type attacks on mainstream platforms made the NSA retire the SKPP for doing separation on them. Even a perfect hypervisor can't be trusted to maintain that when its dependencies are garbage. That was even noted in the 1992-1993 papers on the VAX Secure VMM. Even secure digital designs might have analog or RF attacks.
So, the current recommendation is to build things that way just to reduce the number of attacks, with monitoring and recovery as usual. Those two, though, could run on simpler, verified platforms. There's a lot of precedent for such split architectures.
Why use a new OS from a company that has historically bad customer support, will likely report everything you do back to Google HQ for analytics, and frequently abandons projects once developers get tired of them? Sounds like a computing nightmare; I'd be very hesitant to voluntarily use it.
That's why I said voluntarily. Realistically there are 2 cell phone OSes to choose from, Android and iOS. Android is broken a lot of the time and people are forced to either find workarounds or do without functionality. I don't use Chrome; I use Firefox. But the masses will use it if it's forced on them.
>Android is broken a lot of the time and people are forced to either find workarounds or do without functionality
Continuing to see this parroted is about as silly as the "Android OS includes ads" claim, which is another comment that lets me know the author hasn't actually used the platform for themselves.
I'm speaking from personal experience: from not being able to write to the SD card after updates, to Gmail's Outlook/Exchange sync being broken in Oreo, to Gmail messages being delivered hours or days late due to Doze and GCM messaging. It's not parroting when it's true.
Could anybody please explain to me why microkernels are so great, when in practice they do nothing but push the overhead of thread switching to the extreme? Basically, everything a program does requires waiting for the scheduler to run whatever service we send messages to. Disk, sockets, devices - everything. All of which is done in the name of memory safety.
On the other hand, unikernels that execute nothing but managed code (i.e., not native CPU code, but code for some virtual machine such as the JVM or .NET, which is forbidden at the language level from reading other processes' memory) solve the same problem of protecting system memory, while carrying much less overhead. I guess this approach would be more preferable for creating a new mobile-oriented OS that requires good performance and low power consumption, no?
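To illustrate the overhead being described (a sketch with a hypothetical ipc_call() helper, not any real kernel's API): what a monolithic kernel handles with one trap becomes, on a microkernel, a message to a user-space server plus a reply, each leg of which goes through the scheduler:

    #include <stddef.h>
    #include <sys/types.h>

    // Hypothetical IPC primitive: send a request to a server process and
    // block until the scheduler runs it and it replies.
    ssize_t ipc_call(int server, const void *req, size_t req_len,
                     void *reply, size_t reply_cap);

    enum { FS_SERVER = 1, FS_READ = 2 };

    struct fs_read_req { int op; int fd; size_t len; };

    // read() on a microkernel: at least two context switches per round trip
    // (client to server and back), versus a single kernel entry on a
    // monolithic design.
    ssize_t my_read(int fd, void *buf, size_t len) {
        struct fs_read_req req = { FS_READ, fd, len };
        return ipc_call(FS_SERVER, &req, sizeof(req), buf, len);
    }

In-kernel drivers make that round trip disappear, which is exactly the trade-off being asked about.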
It has taken decades to make Linux stable and relatively bug-free (discounting systemd). As much as it would be great to have a new OS, I wonder what it is based on, and why?
> It has taken decades to make Linux stable and relatively bug-free
This is due to the development practices and tools employed, not intrinsic to the construction of new operating systems. There are much better tools now. For instance, Google could have built this on seL4, a verified microkernel, instead of building their own from scratch, and they would have hit the ground running instead of the slower build-up they're now going to face.
Google clearly has a LOT of cash to play with. They could have refactored/re-used the permissively licensed BSD core, but ended up re-inventing the wheel, breaking a TON of POSIX-based software.
The "wheel" is 40 years old. It's a great wheel, it's doing a hell of a job. But with 40 years of perspective, maybe it actually can be re-invented better without carrying around a pile of legacy back-compatibility needs.
Considering that they could plausibly have it shipping on tens of millions of devices as soon as they get a browser working and hardware support for a well controlled subset of modern systems, I think that's a real possibility.
If it's so terrible, why did Microsoft develop the Windows Subsystem for Linux? Shouldn't they instead try to avoid the "pile of legacy" as much as possible?
Because they saw a market opening with UNIX devs no longer happy with the hardware selection for using macOS as a pretty UNIX.
Also, their goal is not to run 100% of POSIX or Linux-specific software, but rather to achieve good enough compatibility to run the majority of well-known projects and utilities.
1. MS is all about "piles of legacy." That is not counter to their philosophy at all.
2. MS apparently sees some long-term advantage to having a Linux compatibility layer. (By "advantage," I mean of course some way to make more money.)
Neither of these things has any bearing on the quality of Linux, or lack thereof.
Well, they didn't replace their NT kernel with WSL, so it's a bit of an apples-to-oranges comparison. WSL is there to provide access to the existing, vast library of Linux software that might never be treated to Win32 ports.
WSL is an attempt at getting developers to use Windows rather than OS X or Linux. That's a very different situation than coming up with the fundamentals for a new operating system.
Because developers who write software that is deployed to non-Microsoft cloud providers are writing code that runs on Linux, and WSL makes that a lot easier to do on Windows.
> The "wheel" is 40 years old. It's a great wheel, it's doing a hell of a job. But with 40 years of perspective, maybe it actually can be re-invented better without carrying around a pile of legacy back-compatibility needs.
Newer is rarely better. Besides, in computing, so many "new" things are just rehashes of old (and not so old) ideas.
I really wish we could get away from the "that software is old" meme in this industry.
Old software often had requirements or made design decisions based on assumptions that aren't true anymore. These can make it more difficult to deal with or extend. I agree that in general we should just use the old software that works. However, big projects like Fuchsia can reap some long-term benefits. In the end it's a cost/benefit analysis to decide.
Well, the context that shaped old software has a habit of re-appearing, so the design decisions/trade-offs may have become relevant again at least once since. The first vectorized code I saw was a relic from a long-gone Cray 1.
One great advantage of old software that has been continuously developed by the same core of people is that when new features are added, they have the benefit of being informed by all the past failures and near-misses. And they are almost always features that are actually new.
New software, otoh, hasn't been through a 40-year shakedown cruise.
Of course software improves. The context was that BSD, POSIX, etc. are based on 40-year-old tech. Those systems got a lot right, and it takes a hell of a lot of hubris to think that just because something is 40 years old it is somehow not good anymore.
Google invests heavily in securing computing generally: from creating the secure browser Chrome (because the alternatives at the time were very insecure), to a big paid team of security researchers (recent successes include discovering Spectre and Meltdown), to creating secure operating systems to possibly, eventually, replace Android and ChromeOS.
A lot of security-interested programmers think it is easier to secure a small microkernel like Fuchsia's than to secure the older generation of operating systems and the baggage they have acquired.
Google wants everyone to trust the Internet. If consumers are scared to shop online, Google dies. This is why they put so much effort into securing it.
In the NDK, yes. Those aren't really as fully compliant as we'd wish (and differences in implementation are sometimes really annoying to deal with).
Nevertheless, the vast majority of applications will never include NDK code, and even if they do, they probably won't talk to POSIX calls directly either. For the type of OS Android is, it doesn't really gain anything by being POSIX compliant far away from the developer APIs. And Fuchsia, if anything, is trying to be an Android-type OS.
Well - requiring basic frameworks to be re-written for your vanity OS is the fastest path to irrelevance. Most libraries and frameworks (from curl to zlib/git) assume a POSIX interface. If Fuchsia doesn't care about these fundamental utilities being functional, then that's a flawed assumption regarding the developer ecosystem.
Most people aren't running curl on the command line of their phone, and developers use Java APIs to make HTTP calls. The convention is that the OS, i.e. the company, provides the basic frameworks, which is exactly how iOS and Android work. The POSIX ship has sailed: you write portable C or code in a managed runtime; people rarely write to POSIX directly.
This adds a build file for the build system they use, a couple of the config files that would normally be generated by the configure script (but presumably the configure script doesn't support Fuchsia), and an auto-generated source file containing the help output.
So it looks like they have had to make zero actual source changes to cURL; the POSIX compatibility layer and setting the appropriate defines in cURL's config files are sufficient.
The core of zlib shouldn't have any POSIX dependence; it just takes buffers of bytes and produces buffers of bytes.
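As a minimal illustration of that point (my own sketch, not code from the port): the core zlib calls work entirely on in-memory buffers, with no file descriptors, paths, or other POSIX machinery involved:

    #include <zlib.h>
    #include <string.h>

    // Round-trip a buffer through zlib without ever touching a filesystem.
    int roundtrip(void) {
        const unsigned char in[] = "hello hello hello hello";
        unsigned char packed[128], unpacked[128];
        uLongf packed_len = sizeof(packed);
        uLongf unpacked_len = sizeof(unpacked);

        if (compress(packed, &packed_len, in, sizeof(in)) != Z_OK)
            return -1;
        if (uncompress(unpacked, &unpacked_len, packed, packed_len) != Z_OK)
            return -1;
        return memcmp(in, unpacked, sizeof(in)) == 0 ? 0 : -1;
    }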
It has some convenience functions for decoding files given a path or FD, but I'm sure Fuchsia's limited POSIX compatibility layer will support that; basic file access is relatively easy to provide in a POSIX library wrapping Fuchsia's file handling.
Likewise, I'd be willing to bet that curl and Git would mostly work on top of the limited POSIX compatibility layer; they might need a few tweaks, but likely not much more than the tweaks needed in porting to the various BSDs and OS X, and likely a lot less than is needed to provide them natively on Windows.
Assuming Google is even targeting the developer ecosystem here. I'd assume that most of the users of Fuchsia will be end users, so that's what Google is targeting.
How much of the Android app ecosystem has a strong dependence on POSIX compatibility? I mean, yes, it is Linux under the hood, though it's not glibc, but most software is written against the SDK which abstracts over all of that.
Not being POSIX doesn't mean it's going to be entirely foreign, or that it will have no compatibility with software that is already written for POSIX-like platforms. It just means that the basic process management, IPC mechanisms, permissions handling, and so on use a model which is akin to a cleaned-up subset of POSIX: everything is done via handles, which are a lot like file descriptors but work in a somewhat more uniform way; there is cleaner management of resource allocation to jobs, memory mapping of processes, thread management, and so on.
It's not terribly hard to build a POSIX-like layer on top of this. Said layer isn't necessarily going to support some of the real warts of POSIX, like the really broken way signals work, so software that intimately depends on those may have to be refactored to support the way that Fuchsia handles signals; but for most software this will be a big improvement. Software on POSIX systems nowadays has to jump through hoops to make signals play well with event loops, frequently allocating a pipe that gets written to in a signal handler so the event loop can pick up that notification later, while in Fuchsia that's how signals already work: a signal is just a state change on a handle that can be waited for in Fuchsia's equivalent of select/poll. So in cases like this, Fuchsia allows a new code path that is simpler and more maintainable than the one on POSIX-like systems.
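For anyone who hasn't had to write it, here is roughly what that self-pipe trick looks like in C (a sketch, not code from any particular project):

    #include <fcntl.h>
    #include <poll.h>
    #include <signal.h>
    #include <string.h>
    #include <unistd.h>

    static int sig_pipe[2];

    // Only async-signal-safe calls are allowed here, so just poke the pipe.
    static void on_signal(int signo) {
        unsigned char b = (unsigned char)signo;
        (void)write(sig_pipe[1], &b, 1);
    }

    void event_loop(void) {
        pipe(sig_pipe);
        fcntl(sig_pipe[1], F_SETFL, O_NONBLOCK);

        struct sigaction sa;
        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = on_signal;
        sigaction(SIGCHLD, &sa, NULL);

        struct pollfd fds[] = { { .fd = sig_pipe[0], .events = POLLIN } };
        for (;;) {
            if (poll(fds, 1, -1) > 0 && (fds[0].revents & POLLIN)) {
                unsigned char which;
                read(sig_pipe[0], &which, 1);
                // Now the signal can be handled at a safe point in the loop.
            }
        }
    }

All of that bookkeeping exists only to turn an asynchronous interruption back into a waitable event, which is what a handle-signal model gives you for free.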
Yup, I'm wondering how many more days Allo has to live. I don't even bother with the proprietary Google messenger apps anymore because I know they'll be killed within a couple years or less.
Why would it break a ton of POSIX software? They seem to have ported the musl C library, and they appear to have stuff like cmake, vim, git, and openssh running on it.
So is Linux' code, except if you try to redistribute it by linking to it, you'd have to disclose your mods. Not so with Fuchsia's license. They can push out proprietary code all they want.
They certainly picked the correct license to do just that. GenodeOS has exactly what they needed on a formally proven kernel. Why didn't they use that? NIH syndrome... that's why. And of course, control.
I don't know why they down-voted you. The pitch, if any, for Fuchsia wouldn't pass muster outside Google. Clearly, some folks benefit from "burning through a lot of cash", "spearheading the effort", and "padding their resume(s)".
I don't know if you see the irony in your arguments. You demonize capitalism in another comment to me, and argue here that another take that requires a LOT of funding is a "nice to have".
Do you realize that this "nice to have" is made possible by the deep coffers of Google, whose existence is owed to capitalism?
Whether it's put together by Google, or the community, or whoever, really doesn't matter. Yes, in this case it's Google, who does exist due to capitalism (not exclusively due to capitalism, though, as other systems work in other countries), but then capitalism is not the only economic or political system that produces 'deep coffers'. And this 'nice to have' is not buildable exclusively by Google.
By your logic anyone born to a family where the parents have U.S. employment would owe their existence to capitalism. Does that instinctively seem correct?
It is hard to imagine this kernel being anywhere near as efficient as Linux. What makes Chromebooks so great is peppy performance on cheap hardware that you just could never achieve with Windows.
Linux isn't particularly efficient... That it's often better than Windows doesn't make it the most efficient. And does that matter? ChromeOS' UI is driven by Chrome (a web browser).
Wikipedia article [1] mentions: "It is distributed as free and open-source software under a mix of software licenses, including BSD 3 clause, MIT, and Apache 2.0."
Please also check Zircon's kernel license [2]. The PATENTS thing doesn't look great, though.
Or did you mean that it doesn't have a GNU license with a clause that forces others to distribute the source?
It probably doesn't matter that much to them, as, being the developers of the software, they can make changes to it and just not share them.
Anyway, I'm not happy with more Googlification and them having the "expertise monopoly" in yet another piece of tech.
Things like ChromeOS and Chromecast show that they're not willing to create an open OS.
Why are you conflating a GPL-compatible license with GPL-licensed software?
If Fuchsia shipped on phones in binary form, consumers receiving those phones would have no legal basis for requesting the source code of precisely what they received.
GPL-compatibility only speaks to the mixing of code having one license with GPL code.
I certainly confused GPL with open source. Sorry about that. I also agree that it would suck for phones not to have to share the source. Anyway, the current state of Android is far from ideal, with most drivers being closed blobs.
Of course a non-GPL license would put us in a worse position, but the current situation also has limitations.
I wonder if GPLv3 could solve the blob drivers issue.
WHY should everything be GPL-ed? Capitalism is an ethical, meritorious system rewarding hard work and skill, and GPL runs counter to everything capitalism stands for.
Not to mention that the fat profits capitalism bestows on Google FUND Fuchsia.
> Capitalism is an ethical, meritorious system rewarding hard work and skill
Did you drop an /s or something?
Capitalism doesn't give a shit about ethics or merit either way. The only thing worth a damn is capital, which is a pretty good motivator for all kinds of unethical and corrupt practices.
Not that GPL would necessarily be a good thing in this case.
> GPL runs counter to everything capitalism stands for.
That's the point (or at least one of them). Though I personally don't care if it is GPL or not; as someone interested in perhaps using this OS for commercial purposes at some point, its not being GPL is a slight plus for me.
However, one key point of the GPL is to create a commons (of software, hardware, protocols, etc.) that can't be exploited for gain by capitalists, that can't be exploited for economic rent, that can't be laden down with copyrights and patents. A system that is actually, truly free (as in freedom). Which is important to understand if you want to be intellectually honest.
> Capitalism is an ethical, meritorious system rewarding hard work
Um... no it isn't. Meritorious only for those on top. While I agree that Google doesn't need to GPL Fuchsia, GPL was imagined as a direct response to capitalism's very pro-owner and anti-customer approach to distribution and rights-holding.
How so... It just prevents you from co-opting someone else's GPL software without contributing back. For that matter, I'm a pretty big fan of LGPL for libraries.
Nothing forces anyone to use someone else's GPL software. You aren't prevented from using GPL software either. In fact, it has a price, and that price is contributing changes and directly interfacing software back under the GPL. The price isn't money.
Beyond this, open-source/floss is meant to create a commons in terms of libraries and software that aren't commercially encumbered. Way too many things (particularly drivers) are locked away and stop working.
Your statement is entirely true but is a non sequitur.
Capitalism is not "the economic system in which things are bought and sold with money", it's a much more particular subset of all those types of economies (namely, capitalism is the type where the primary resources and means of production are privately owned).
It's incidentally TRUE that GPL is not anti-capitalist. But it's also true that you can consistently both sell things and be anti-capitalist. Selling things happens in most economic systems, capitalist or not.
> Capitalism is not "the economic system in which things are bought and sold with money"
Well, at least according to Marx (cf. Capital), capitalism is the system where things are bought and sold with money. If you read the first few chapters of Capital you'll see that Marx explains this in various ways, such as characterizing capitalism by the division of labour, where instead of producing a bunch of commodities, workers produce one commodity and sell it for money with which they can buy other commodities. And, again according to his definition, communism is the society after this type of commodity production, i.e. the mechanism that produces value (money) is stopped, in the sense that people have stopped exchanging things for money.
You may not care about Marx's definition, but I just wanted to note it for completeness.
Indeed, I don't defer to Marx. But that said, it's been a while since I read him much. I suspect you're conflating Marx describing Capitalism with defining it.
We can describe the idea of buying and selling with money as certainly being a characteristic of Capitalism without asserting that it's a defining characteristic that is absent in other systems.
To be blunt, buying and selling with money is FAR older than Capitalism.
I'm really sorry that I will not be able to provide any source to you at the moment because I'm very busy, but for what it's worth I'll write you what I remember from my readings (I studied Marxism extensively for a time out of interest but I never had a formal education in sociology, so I'm not an expert, I'm a regular software engineer). Also disclaimer, I tend to agree with Marx on a lot of issues so my ideas might be biased. My terminology is also a bit rusty.
Capital starts by explaining commodities. This is because: (1) Marx tries to explain some of his period's economic terminology, so he needs to do some groundwork; (2) commodity production is an important aspect of capitalism that he refers to throughout his works. My main point is that the force that made capitalism possible and the force that sustains capitalism are one and the same: the accumulation of value. As Marx explains in later chapters, a thing will have different types of values. For Marx, nothing has any intrinsic value, and its money-value is determined at the moment of trade. That is, the force that generates value inside the economy is the act of selling commodities. The same force causes the distinction between bourgeoisie and proletariat (cf. Marx's definitions of social classes), and the same force caused the transformation from earlier economic systems to capitalism (which answers your complaint).

Now this brings us to the end of capitalism, which Marx very insistently argues is the cessation of value generation, which is equivalent to saying society becomes moneyless. E.g., one misconception people have is that Marx was also against labour vouchers, but this is not true, as explained in the Critique of the Gotha Programme: since labour vouchers do not generate/accumulate value, their value is not determined at the moment of trade.

Anyway, this also relates to the Marxist criticism of anarchism. For anarchism, capitalism --> communism is the seizure of the capitalist state. But Marxism thinks this is fundamentally wrong, because the capitalist state is generated by the capitalist mode of production. So you will want to eliminate the capitalist mode of production instead of the state itself, because as long as the c.m.p. exists there is no way to kill capitalism, so the state will revive. For Marxism you first need to eliminate the economic system that makes capitalism possible, i.e. the accumulation of value, and then ultimately kill the State and class society, as they're caused by capitalism.
I don't have a problem with Google's resources to build something of that magnitude successfully.
My problem is with their execution.
All their products are in perpetual beta.
And the users are forever testers.
That's their business model, and it doesn't call for great UI/UX.
"Senator, while I agree in the general sense that Fuchsia is not Linux, It appears that in this specific case, its just Yet Another Linux."
How is this not the latest iteration of not-invented-here syndrome? Any system like Linux or Python that has a "Benevolent Dictator For Life" holding the reins is inherently saying that they favor quality over quantity. It's almost like the US Senate, in that the very goal is to go only as fast as prudent.
I really don't get this coming up with new OSes every now and then.
As I see it, it's all about driver support, just because that is the bigger effort. That is why vendors (and the community) focus on only one or two options (Windows/Linux).
Anybody can come up with new fancy OSes; as a matter of fact, many people do. The problem is, there is no incentive for vendors to produce specific drivers for those, and the communities are just too small to cope with the huge amount of hardware support needed to make them useful.
I just don't see the point of coming up with new OSs as long as Windows/Linux just work as intended.
It's not clear where exactly Google intends Fuchsia to run, but wide hardware support would be best for a relatively open ecosystem like PC hardware. A non-general-purpose OS, or at least one designed to run on specific hardware, doesn't have the same requirement for a large body of drivers to be available. Imagine if they want it as an OS for a range of embedded devices, or something. Hardware support could be very limited. Especially at a smaller scale like that, Google's got the money and expertise to license hardware IP from another company and build their own drivers for it.
I do like the minimization that a lot of OS/Systems are undergoing though. I still remember when VMWare VM's came along and my mind was blown. Similar feeling on seeing Docker, but it was tempered somewhat until kubernetes came out. Very excited to see what comes next in the future.