
Read the section "Conclusion" at the end, if nothing else. It is extremely well written. The author throws out some potentially controversial ideas, but he (like many people I have seen) is hoping for a common outcome: dramatically "rethinking" the concept of the OS and programming language, while really embracing ideas that are as old as time. I'll highlight one section:

"Most of all, let’s rethink the received wisdom that you should teach your computer to do things in a programming language and run the resulting program on an operating system. A righteous operating system should be a programming language. And for goodness’ sake, let’s not use the entire network stack just to talk to another process on the same machine which is responsible for managing a database using the filesystem stack. At least let’s use shared memory (with transactional semantics, naturally – which Intel’s latest CPUs support in hardware). But if we believe in the future – if we believe in ourselves – let’s dare to ask why, anyway, does the operating system give you this “filesystem” thing that’s no good as a database and expect you to just accept that “stuff on computers goes in folders, lah”? Any decent software environment ought to have a fully featured database, built in, and no need for a 'filesystem'."




What programming language should the OS be? That's the problem. What he's saying will never happen, because nobody can design a programming language that will fit all use cases. Unix is polyglot by design; that's a feature and not a bug.

Everybody wants the OS to be easier for THEIR use case, underestimating the diversity of computing. "Why can't I just have a whole OS in node.js, e.g. https://node-os.com/ ? That's all I need."

Well, some people need to run linear algebra on computers and have no need for that stuff. Likewise, you don't have any need for Fortran.

The "language as OS" thing has been tried with Lisp and Smalltalk, and failed to gain adoption for good reason. They are both just languages now.

Microsoft already tried and abandoned the second idea (WinFS). File systems are complex, but databases are an order of magnitude more complex. OSes evolve on a slower time scale; databases and languages on a relatively faster time scale.

Multiple databases are also a feature. Both sqlite and Postgres/MySQL work on top of the file system. The file system supports the minimum API you need to write a database. (It's somewhat bad, but the way to fix that API isn't to replace it with a database.)
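
To make that concrete, here is a toy sketch (not how sqlite or Postgres actually work) of an append-only key/value store that needs nothing from the OS beyond open, seek, write, and fsync:

  import os

  class TinyStore:
      """Toy append-only key/value log. Real databases add indexing,
      locking, and crash recovery on top of the same primitives."""
      def __init__(self, path):
          self.f = open(path, "a+b")

      def put(self, key, value):
          self.f.seek(0, os.SEEK_END)
          self.f.write(("%s\t%s\n" % (key, value)).encode())
          self.f.flush()
          os.fsync(self.f.fileno())   # force the record to stable storage

      def get(self, key):
          self.f.seek(0)              # full scan; an index would fix this
          result = None
          for line in self.f:
              k, _, v = line.decode().rstrip("\n").partition("\t")
              if k == key:
                  result = v          # last write wins
          return result

  store = TinyStore("toy.db")
  store.put("user:1", "alice")
  print(store.get("user:1"))          # -> alice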

If you had bundled a relational database with the OS, then you wouldn't have had the right abstractions to make distributed databases like Dynamo and BigTable and whatnot. A database just has more design parameters than a file system, so you can't write one that will fit all applications. Again, the problem is underestimating the diversity of use cases.

So yeah I think both of these ideas are badly mistaken.

EDIT: I also don't think the shared memory idea is a good one. You can have IPC without the network stack using Unix domain sockets, and that's what people actually use to connect to databases on the same machine. It has the benefit that you can move the database to another machine with minimal changes.
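
A minimal sketch of that point (the socket path and host are made up; real client libraries wrap this for you): the only thing that changes when the database moves to another machine is the address family and address, not the code around it.

  import socket

  # Local IPC over a Unix domain socket: no network stack involved.
  local = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
  local.connect("/var/run/mydb/mydb.sock")   # hypothetical socket path

  # Same database moved to another machine: swap the family and address.
  remote = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  remote.connect(("db.example.com", 5432))

  # Everything spoken over the byte stream after connect() is identical.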

A shared memory interface between an application and a database sounds like a bad idea.


Which leads us to ask, "Where are the revolutionaries?"

It used to be that the younger generation, around college age, would rethink their parents' systems: political, business, etc. These kids would look at the buildings around them and imagine what could be accomplished by razing those buildings and creating a new foundation. I don't see that anymore. I see college-aged kids simply wanting to add a new floor on top of a very rickety shack that should have been discarded ages ago.

Ironically, the people writing these types of manifestos for Computer Science are almost exclusively older. They often wrote that first foundational layer and know that it needs to be overhauled.

By the way, I would phrase it more that "any righteous programming language is also an operating system," and that "no righteous operating system operates on files, so no righteous programming language is coded in files." Data that is not stored or accessed as data eventually will be.

It's true we've tried many variations on these themes in the past, but if we do the logical syllogism in our heads, we come invariably to the conclusion this is how the future should be. We know these things. So why give up? What does failure mean other than that we need to continue trying?

It infuriates me that we don't try -- when people say "we've tried everything and there's no way forward." No, we're just lazy. Charles Duell never actually said in 1899 that "everything has already been invented," yet, oh my God, we can't stop saying it now. So I'm going to ask again, "Where are the revolutionaries?"


It absolutely does need to be overhauled, but the problem is that people don't agree on the direction. I think most of your ideas are profoundly mistaken, and have been disproven by history. I'm sure you think my ideas are wrong too.

I think that people tend to dismiss the success or failure of certain systems as luck or even marketing, rather than deeply reflecting on the intrinsic properties of those systems that led them to be adopted or not adopted.

Unix and the web have lots of flaws, but they're not an accident of history. They have fundamental architectural properties that made them succeed. (In particular, REST is the same idea architecturally as Unix/Plan 9 style file systems. Long story, because it's basically avoiding O(m*n) problems in the ecosystem.)

You also have to accept evolution as a fundamental force in computing (as Linus Torvalds does). To ignore it is to set yourself up for failure and frustration. After all, humans are by no means the optimal living being either, but they're the best that evolution came up with.

And as I said in my other post, "revolutionary" ideas tend to just add to the pile. They fail to meet their expectations and then Unix subsumes them. It's almost an economic inevitability.

And Linux WAS revolutionary, if you compare it to Microsoft. The first couple decades of my life were DOS- and Windows-based, so from that perspective the revolution happened. Microsoft is porting all their stuff to Linux now.

http://www.revolution-os.com/

If you want a single-language OS, check out Mirage: https://mirage.io/ . I know somebody is going to say "I don't like OCaml". Now you understand why Unix dominates: it supports heterogeneity. Even Mirage can't really get rid of Unix, because it runs on Xen, whose control domain is typically Linux.


I recognize that we disagree and I respect your opinion; I would just like to add a bit more:

> "have been disproven by history."

I guess that's my point. I don't think we've had enough history to be conclusive yet. I keep hearing this, over and over, and it's the equivalent of "heavier-than-air objects can't fly -- it's been proven in attempt after attempt."

In my opinion, the reason there are thousands of programming languages, for example, is not because we can't agree on them; it's because the problem hasn't been solved yet. No language is good enough, so we keep trying. In the late 1990s there were seemingly hundreds of search engines. Google/PageRank came along and all the other search engines went away almost overnight.

> "revolutionary" ideas tend to just add to the pile

They do until they don't. All those crazy attempts at flight were truly crazy, until they led to flight.

We're at the beginning of history, not the end of it.


Search engines are comparable in terms of how successful they are at their unique task.

Languages? What metric are you going to use that is equally applicable to all tasks?

Think of it as vehicles and their different purposes. Would you support condensing all vehicles to one, so that one vehicle has to both haul and mix concrete and take your kids to get ice cream?


I think systems programming would continue to be separate. I left it off because it amounts to a small fraction of total code written (and, frankly, I consider it something that can be grouped in with hardware).

The rest is application programming, and yes, when it's right, it will absolutely consolidate: There will be one "language". Sure, the current conventional wisdom is that languages are the "different tools in the toolbox." I know. My statement wasn't borne out of ignorance. What I'm saying is that kind of thinking is what's holding us back.

What is a program? Is it expressive text? Is it, "Shall I compare thee to a summer's day?" No, of course not. There's no narrative. We use text flow for instruction flow but that's coincidence. We are acutely aware that programs are discrete instructions. Code is data.

I made a statement above. I'll repeat it because it's important. It's a law: Data that is not stored or accessed as data eventually will be. Any programming language that uses text files will eventually go away. There will be one "language" eventually because the natural inclination of data is one representation. There will always be a market for new widgets to manipulate data or find insight in data, of course, but there will be one language -- and it will be data. It's the law.

Now I should admit, it's not that we haven't tried. We've tried many times and failed. What I'm saying is that if we come to a conclusion for a destination then there is no other choice but to make that the target.


I'm not fully grokking the theory/philosophy you're trying to express here, and I'd really like to, so I'll ask for forgiveness in advance if I seem too dense.

When I think of language and getting a message across (be it to someone else, or to a computer as code), I think of the components that are required for communication:

  - Emitter (myself) (EM; mostly obvious)
  - Recipient (someone else/the computer) (RE; also obvious)
  - Message (what I want to communicate) (MS)
  - Medium (what carries the message) (MD)
  - Protocol (base rules that both sides agree on) (PR)
If I'm trying to help my mother-in-law with a computer problem, I'll tell her to click on such-and-such over phone lines using English. (MD=phone line; MS=instructions; PR=English grammar)

If I'm writing a script to unzip a bunch of files at once and distribute the contents into different folders based on the file type, I'd probably whip up a Python script. (MD=text file; MS=task description; PR=Python)

My script in turn will communicate with the file system and the operating system to complete my task. (MD=bits; MS=instructions; PR=Python-to-Machine-Language-Interpreter)

If I understand you correctly, you're saying that all application programs should distill down to just the message, as the one representation of what I want to communicate. I'm hesitant to accept that, since it tends to assume that both emitter and recipient can independently determine and adapt to the correct protocol when all they have is the message.

For example, my Python script will probably use the file extension as a heuristic for the file type, but will make an error when picture.txt actually is a badly named JPEG. If I want to increase correctness, I can change the script so that it uses the file's Magic Number, but that may be overhead I don't need if the files are correctly named. Choosing one technique over the other requires that I craft the message differently, because Python doesn't make a decision on how best to determine file type.
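
Something like this, to make the tradeoff concrete (a rough sketch; only a few signatures are shown, and real detection libraries know many more):

  import os

  def type_by_extension(path):
      # Cheap heuristic: trust the file name.
      return os.path.splitext(path)[1].lower().lstrip(".") or "unknown"

  def type_by_magic(path):
      # More work, but catches a mislabeled picture.txt that is really a JPEG.
      with open(path, "rb") as f:
          header = f.read(4)
      if header.startswith(b"\xff\xd8\xff"):
          return "jpeg"
      if header.startswith(b"\x89PNG"):
          return "png"
      if header.startswith(b"PK\x03\x04"):
          return "zip"
      return "unknown"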

TL;DR: Data needs a protocol to be used as information, and languages provide that. Doing away with languages means we need to define and agree on a non-ambiguous way of reading data, which is a tall order.


You raise an interesting point. Data, as we use it today, is often bare-bones and without meta-data. How you interpret it is very much up to you. It's as if, in English, we said "... apple ... tree" and then leave it to the user to decide if that's "I'm planting an apple tree" or "You can't get an apple from a walnut tree."

You're definitely describing how we work in today's terms -- there's nothing wrong with that, I'm just challenging it. I'm saying that we can describe the protocol better using data; or rather, I'm saying that it is data, and that any other way of describing it is sub-optimal and for that reason will eventually fall away.

> Doing away with languages means we need to define and agree on a non-ambiguous way of reading data

Actually I'm saying that language is more ambiguous than data -- at all times and in all cases. Anything you can get language to do, you can get data to do better (in this context). But it's not a hard case to prove, because actually we're using language as data. We're doing the equivalent, in programming languages, of typing "three hundred" into a text file and having it read from the text file and converted to a single data unit of 300.0, when we could just be operating directly with the data.

But I didn't clarify how this would happen, so I can see that it would seem a bit abstruse. I need to better communicate and clarify that we need to expand how we're viewing data and how we work with data -- that we need to incorporate more meta-data as an ancillary part of the data that both comes with it and yet is still secondary.


Great reply, David, and my sentiments match exactly. BTW, I took a peek at your profile and it led me to Kayia, and this bit of text:

"No instruction is stored in its textual form, but instead it is interpreted as you type and stored directly in its AST form and exposed in the same way as the data. This allows you to query code identically as you would query data, and look at it in various aspects and layers."

This is interesting to me because I designed something identical about 10 years ago (and who knows who else has) when I was designing my first proprietary compiler. I wanted to be able to manipulate tokens with the same ease that you might in a spreadsheet (which was probably an arbitrary goal but seemed cool at the time).
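
For a small taste of querying code as data in an existing language (this uses Python's ast module purely as a stand-in; it is not how Kayia represents programs), you can already ask structural questions of code without treating it as text:

  import ast, textwrap

  source = textwrap.dedent("""
      def add(a, b):
          return a + b

      def shout(msg):
          print(msg.upper())
  """)

  tree = ast.parse(source)
  # "Query" the code: list every function name and its arity.
  for node in ast.walk(tree):
      if isinstance(node, ast.FunctionDef):
          print(node.name, len(node.args.args))
  # -> add 2
  #    shout 1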


I'm so happy to hear that. I'll reach out to you and let you know what's coming next for Kayia.


Sounds good - gavanw@gmail.com


Clearly TempleOS/HolyC is the answer we've been looking for.

http://www.templeos.org


The Smalltalk and LISP approaches were successful in the commercial space. They still are at the software level, and LISP machines did it at the hardware level with incredible benefits for a while. The reasons for failure seem social or economic rather than technical so far. I'm not dismissing them as customizable OS's yet, as we might see a comeback of such things in a new form.


Some of the technical problems of Lisp Machines:

  * relatively complex code
  * grown system with lots of legacy code
  * depended on certain hardware or its emulation -> not portable to other architectures
  * almost no security story
  * optimized for single-user GUI workstation use, other types (headless servers, ...) not so
    well supported
  * not multi-user capable
  * weak support for terminals
  * would have needed more bug fixing
  * more difficult to use with standard keyboards
etc etc


Was the Lisp Machine OS fundamentally non-portable to other hardware? My impression is that it could have been ported, with a similar amount of effort to the first port of any OS.

Unix is intended to be portable, but creating and maintaining a port (including compilers and drivers) is a large effort, and somehow a community never materialized around supporting Lisp Machine on PCs, the way Unix on PCs did.


The MIT Lisp Machine had been developed further and ported to a bunch of CPUs. LMI and Symbolics started with the original CPU. Symbolics then developed a 36-bit machine and later a 40-bit machine. TI developed 32-bit machines. LMI and Symbolics were working on a new generation, which did not reach the market.

All these released CPUs were basically stack-based architectures with a bunch of Lisp support and even some esoteric stuff. Early CPUs had writeable microcode, so that special instruction sets could be developed and improved.

The main compiler/runtime was never (AFAIK) ported to support conventional CPUs of CISC or RISC types with mostly fixed instruction sets. Symbolics seems to have been working on a portable Common Lisp (for Unix etc) for a short period of time, but I have not heard of a portable OS on 'conventional' CISC/RISC hardware. Symbolics developed an embedded version of their OS, but still for Symbolics hardware.

I can't remember that any of the competitors were developing a Lisp OS on top of something like SUN/IBM/Apollo/SGI/DEC/... hardware. Xerox had their Lisp OS ported to SUNs as emulation and you would run it on top of SunOS / Solaris. Symbolics ported theirs as emulation on top of DEC ALPHA / Unix.

For companies like SUN, DEC, IBM, SGI, etc. it was possible to license some core OS and develop from there. But there was no portable core Lisp OS to license. One could license the MIT Lisp OS, but the code you got from them was for a special Lisp hardware.

Symbolics and TI were able to use some standard chips for some interface functions in some of their systems. It's not just the CPU, which needs to be supported, but also the hardware for serial interfaces, ethernet, graphics, disks, wireless, ...


That they ended up emulating Genera instead of porting it should tell you something. Maybe. Also, look at the trouble people go through to do that:

http://fare.tunes.org/LispM.html

Then there's this that makes it look easy but also shows the keyboard problem LispM was probably referring to:

http://www.loomcom.com/genera/genera-install.html

What I find on getting them running without LISP hardware is somewhere between tough and "wow, I feel for that person." Maybe things have improved. There just seems to be a huge mismatch between many aspects of LISP machines and modern machines that means it's probably easier to do a clean-slate LISP machine that builds on modern primitives & interfaces better.


Stallman was a noted Lisp hacker, yet chose to develop a C compiler and clone the Unix user space for his free software project. Stallman killed Lisp; yeah, that's it. ;)

I mention that because the Unix on PCs that succeeded is free, and heavily indebted to GNU.

Proprietary Unix on PC hardware is an insignificant blip in computing history.


A writeup on Xenix indicated it had a huge impact in getting UNIX into more universities and created tons of market demand for PC UNIX. If that's true, then it's not so much an insignificant blip as a huge part of the reason for FOSS UNIX's success, if it benefited from contributors from those universities or from the demand they generated.

http://www.softpanorama.org/People/Torvalds/Finland_period/x...

Then it becomes a relic of history with lasting influence from there. Unless you count the Linux distros people paid for: two of those are still leading, with a more usable one mostly happening due to a paid-support model. Seems like proprietary UNIX on PCs just shed its skin and took a new form that dominates UNIX on PCs to this day, albeit with more benefit to users. ;)


I was thinking in terms of the ability to do the OS in LISP, with some of its advantages in development, maintenance, or per-user customization. Except done on modern systems, in ways users might actually use.

I still thank you for the response since I like learning about LISP machines and reasons for failure. This is a nice list of stuff to avoid in the next one where applicable.


> Unix is polyglot by design; that's a feature and not a bug.

No, it is not. Unix was always built around C, with other languages as an afterthought, even late in its development. Here are some quotes from my copy of The UNIX Time-Sharing System: UNIX Programmer's Manual, revised edition, from 1983:

"System calls are entries into the UNIX supervisor. Every system call has one or more C language interfaces..."

"An assortment of subroutines is available... The functions are described in terms of C, but most will work with Fortran as well."

"The three principal languages in UNIX are provided by the C compiler cc(1), the Fortran compiler f77(1), and the assembler as(1)."

Saying that Unix is a "polyglot" operating system makes as little sense as saying that Symbolics machines were "polyglot." Symbolics had better C and Fortran development environments than Unix but it does not change the fact that it was a Lisp machine. Unix is a time-sharing system with a C standard library, C memory management conventions, C calling conventions, C stack layout, C process memory layout, C linking and loading. Working against any of these conventions is possible but very awkward.


Yes, you're right that C is special in Unix. There's a bias toward writing applications in C because the system interface is provided as C headers, and the ABI is architecture-dependent.

But C turned out to be a great language for writing programming languages as well as kernels. The JVM is written in C++; Python, Perl, Ruby, V8, etc. are written in C or C++. So in practice you do get an ecosystem with multiple languages. It's certainly nicer to write languages in C than in assembly.

Treating everything as byte streams is another big reason that it is polyglot. This has obvious downsides, but if you stored everything as s-expressions, or C structs, then it would privilege one language over another. Traditional Unix utilities don't use architecture-dependent binary formats, and this is one of the main reasons why.
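
A trivial sketch of that neutrality (the filename is made up): a filter that only speaks bytes composes with a producer written in any language, because the contract is just "a stream of bytes".

  import sys

  # count_bytes.py, a crude `wc -c`. It neither knows nor cares whether
  # its input came from a C, Fortran, or JavaScript program.
  total = 0
  while True:
      chunk = sys.stdin.buffer.read(65536)
      if not chunk:
          break
      total += len(chunk)
  print(total)

Used as, e.g., `some_producer | python3 count_bytes.py`.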

But honestly, the only other solution would have been to define an RPC-like architecture-independent interface -- what operating systems do that? I don't know of any.


> The JVM is written in C++; Python, Perl, Ruby, V8, etc. are written in C or C++.

This is disingenuous - that's exactly like saying that ZetaC on the Lisp Machine was written in Lisp and not in microcode, so Symbolics is a polyglot machine. The reason that a lot of dynamic programming language runtimes (like the JVM, SBCL, etc) use C is exactly because the only alternative on Unix is to hand-code system calls in assembly. The code generators are almost always self-hosting in these languages.

> It's certainly nicer to write languages in C than in assembly.

C is a bad language to target compilers and transpilers to if you need non-standard control flow because it forces you into a stack discipline. Doing things like continuations or restartable exceptions requires Rube Goldberg machine-level workarounds.

> Treating everything as byte streams is another big reason that it is polyglot. This has obvious downsides, but if you stored everything as s-expressions, or C structs, then it would privilege one language over another. Traditional Unix utilities don't use architecture-dependent binary formats, and this is one of the main reasons why.

Even the idea of byte streams is a C-ism. Bytes on the PDP-10 would "naturally" come out to 6 bits (out of a 36-bit word) - the instruction set was based around flexible bit fields.

> But honestly, the only other solution would have been to define an RPC-like architecture-independent interface -- what operating systems do that? I don't know of any.

That sounds like a microkernel. But I think the real takeaway here is that any operating system is going to come with baggage for language implementors, and it is important to recognize and make explicit what these assumptions are and why they were made (for example, I don't think many people ever think about the non-8-bit-bytes thing, but it is important if you want to provide a nice interface for bit-banging: http://clhs.lisp.se/Body/f_ldb.htm#ldb)
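
For what it's worth, the LDB operation linked above is just shift-and-mask on an integer. A rough Python rendering (an illustration only, not a claim about how CL implements it), minus the byte-specifier abstraction:

  def ldb(size, position, n):
      # Rough analogue of CL's (ldb (byte size position) n):
      # extract `size` bits of `n`, starting at bit `position`.
      return (n >> position) & ((1 << size) - 1)

  word = 0o123456701234        # a 36-bit value, PDP-10 style
  print(ldb(6, 30, word))      # top 6-bit "byte" -> 0o12, i.e. 10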


C doesn't have calling conventions; those are dictated by ABI's. A Pascal compiler can easily use the same ABI as a C compiler.

The layout of a process doesn't really do anything that helps C.

The C-specific considerations creep into API's when clients are required to prepare, or to parse, memory described as a C structure.

C helps here by dictating that the members of a struct may not be reordered. The rules that compilers use for aligning structure members tend to be very similar and straightforward.
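
That predictability is also what lets other languages prepare or parse such memory at all. A sketch (the struct is hypothetical, and the "<" format deliberately ignores the padding a real C ABI would add):

  import struct

  # Hypothetical C API type:  struct point { int32_t x; int32_t y; uint16_t flags; };
  # "<iiH" = little-endian, two int32s, one uint16, in declaration order.
  packed = struct.pack("<iiH", 10, 20, 0x1)
  x, y, flags = struct.unpack("<iiH", packed)
  print(len(packed), x, y, flags)   # -> 10 10 20 1  (a C compiler would pad the struct to 12 bytes)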


I agree with all of your points; I just wonder (hypothetically) if things could be improved, or are we at the peak already? I think certain things, like the web browser, could be made much better if we rethought and simplified them. Of course, we are fighting a huge battle against legacy, but big things have been killed in the past (see Flash).


I agree about the web browser, but unfortunately there's a tradeoff between features and stability. A browser from 2016 has to do way more than a browser from 2006 or 1996. It's supporting much more of the world's interaction now, in a very literal sense. So we have gained a lot, but it comes at a cost.

Both Firefox and Chrome have had severe stability issues, and I hope that when this period of rapid web evolution subsides, there will be a rethinking and consolidation.

I believe Douglas Crockford said that the period of stasis after IE6 and before Firefox was one of the best things for JavaScript, and I think there is something to that argument.

Although I would ask you -- improved along what dimension? Everything is a tradeoff.

I agree that software is much too buggy and slow. As far as being "bloated" and complex, I used to be part of the church of simplicity, but I've come to realize that the use cases for computing are just very diverse. Though I would like to get all these features without the associated instability and performance drop (i.e. paying for what you don't use).

Bjarne Stroustrup expresses this well; he says he hears this all the time: "C++ should be a much smaller and simpler language. Please add this tiny feature I'm missing". I think there's a lot of truth to that.

I'm working on a shell [1] because it's a language that connects other languages. It's the glue that makes some of the complexity manageable. And it's also the language closest to the OS other than C -- you might call it the "second language" that boots up on any system.

Shell is a horrible legacy mess too. It's an instance of failure by success, much like PHP.

I think the reason for the mess is that piling new crap on top is easier than fixing and replacing old stuff. It's much easier to write a new language than to replace an old language, and probably more fun. A lot of my blog is basically software archaeology.

If people want things to be better, they have to roll up their sleeves and dig into the systems that people ACTUALLY USE and fix them, rather than just proposing "revolutionary stuff" that only adds to the pile.

[1]: http://www.oilshell.org/blog/


Everything I've seen about revolutionizing software points to the likelihood of it succeeding being zero. There's a lifecycle to it, and what we preserve are data and protocols, but not the incidental complexity of the systems themselves. At every point of the continuum, computer technology is more complex than it has to be in an academic sense ("why have a computer at all if you can do equations in your head?"), but it solves enough problems that it stays alive.

Or to see it in a different light: once it's programmable, you've doomed it to die, the more so the more programmable it is. And this is borne out by how fast we burn through hardware. Our code survives best where it's driven toward a known destination format - e.g. an old TeX source document is more likely to be rebuildable into a rendered artifact than equivalent C code into usable software.

The Lisp or Smalltalk attitude to this - which is to remove code/data boundaries altogether - mostly seems to add further uncertainty and less room for curated archival. Either the whole thing runs or it doesn't.


Good points - see my comment below ("With regards to standing on the shoulders of giants...").

I wonder if there is some sort of enforcement of rules you could apply that would result in a programmable thing maintaining a relatively clean / non-bloated state. Of course, once you introduce more rules, you reduce freedom...


These are good comments.

>A righteous operating system should be a programming language.

Like the Commodore 64, VIC 20, TRS 80 and many early home computers.

>Any decent software environment ought to have a fully featured database, built in, and no need for a 'filesystem'.

Microsoft tried to have MSSQL as a file system in Windows called WinFS.

https://en.wikipedia.org/wiki/WinFS

If he's referring to non-relational, then most filesystems as we know them are key/value databases. You put in a key: c:\files\myfile.txt (/usr/root/myfile), and you get back a byte array. It's more sophisticated than simple numeric tokens because the key carries a hierarchical (directory) structure.
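
In that reading (a toy sketch, ignoring permissions, timestamps, and partial reads), the filesystem API really does reduce to get/put by key:

  import os

  class FsAsKV:
      """Wrap a directory as a key/value store whose values are byte arrays."""
      def __init__(self, root):
          self.root = root

      def put(self, key, value):
          path = os.path.join(self.root, key)
          os.makedirs(os.path.dirname(path), exist_ok=True)  # the hierarchical part of the key
          with open(path, "wb") as f:
              f.write(value)

      def get(self, key):
          with open(os.path.join(self.root, key), "rb") as f:
              return f.read()

  kv = FsAsKV("/tmp/demo")
  kv.put("files/myfile.txt", b"hello")
  print(kv.get("files/myfile.txt"))   # -> b'hello'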

Part of the reason for these seemingly odd things is we stand on the shoulders of giants that came before us. Because of this, iterative technologies must be backwards compatible to survive.


With regards to standing on the shoulders of giants, the rational part of me wholeheartedly agrees, but the kid in me just wants to burn it all to the ground and rebuild it from scratch. :) Of course, it is nearly impossible to rewrite things better because inevitably you hit one of these traps: not enough time/money, the bureaucracy that is attached to money, the "too many chefs" problem, or the inadequate supply of experienced/battle-hardened minds to put on such a project - i.e. people that have written compilers, operating systems, GUIs, etc - enough times over to know what they are doing.


Whereas the strategy worked for the AS/400; it helped those systems be more consistent, reliable, and easily managed. I'm more for pluggable storage, with filesystems, object systems, and RDBMS's being options to choose from. Microsoft's failure doesn't invalidate the concept, though, so much as show that their use case and solution didn't work out.


A database as a file system is fundamentally a good idea. But so much software expects a regular file system, and it's surprisingly hard to implement an efficient regular file system on top of a relational DB.
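
The naive version is easy enough to write, which is part of why the idea keeps coming back. A sketch using Python's bundled sqlite3 (the schema is made up, and everything that makes the real problem hard, like mmap, partial writes, directory semantics, and performance, is exactly what this leaves out):

  import sqlite3

  db = sqlite3.connect("fs.db")
  db.execute("CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, data BLOB)")

  def write_file(path, data):
      with db:   # one transaction per write
          db.execute("INSERT OR REPLACE INTO files VALUES (?, ?)", (path, data))

  def read_file(path):
      row = db.execute("SELECT data FROM files WHERE path = ?", (path,)).fetchone()
      if row is None:
          raise FileNotFoundError(path)
      return row[0]

  write_file("/etc/motd", b"hello")
  print(read_file("/etc/motd"))   # -> b'hello'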


I'm all for making system calls and IPC cheaper. I'm also all for making system calls indistinguishable from IPC for all practical purposes. I'm in support of those ideas even if one does not take them to their logical destination at a microkernel.

But I just don't think filesystems should be as powerful as databases. At a minimum, no computer exists alone, and sharing database structures is a much more complex action than sharing opinionated byte streams.


I like the built-in database functionality he describes. Wasn't that a feature of the BeOS file system?


No idea, but VMS had something like that, and probably other minicomputers of the era did too.

I'm only 40, so I'm too young to know :-)

I think that Unix' "everything is a stream" concept won because it's simple and powerful, but it's a good example of the failure of our educational systems that only old geezers have a clue that there are alternatives.

I feel like 99% of the things I do at work, some guy at IBM should have made a generic solution for in 1979. Actually someone probably did.


Probably not the earliest implementation, but IBM mainframes did indeed have record-oriented files in the 1960s.

https://en.wikipedia.org/wiki/Data_set_(IBM_mainframe)


This is actually one of the main reasons why I'm interested in non-PC computers and non-Unixish/non-NTish operating systems :)


Maybe. You can query BFS to match metadata entries. But for me it's more important that file systems be transactional. So instead of doing the stupid write-to-tmp-file and then move-into-place-on-same-physical-volume-hoping-it's-atomic dance, you could just say open('filename').then_write(buf).then_commit() or whatever, and it would either do the work as a transaction or fail; either way you wouldn't end up with corrupt files with half-written chunks.
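
For reference, that dance looks roughly like this in Python today (a sketch; the then_write/then_commit API above is hypothetical, and this is the workaround it would replace):

  import os, tempfile

  def atomic_write(path, buf):
      # Write to a temp file on the SAME filesystem, then rename into place.
      # rename/replace is atomic on POSIX, so readers see the old file or the
      # new one, never a half-written mix.
      d = os.path.dirname(path) or "."
      fd, tmp = tempfile.mkstemp(dir=d)
      try:
          os.write(fd, buf)
          os.fsync(fd)            # data reaches disk before the rename
      finally:
          os.close(fd)
      os.replace(tmp, path)       # the "commit"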


Not sure, but it does sound interesting. :)



