
We get new languages every year. Instead of debating which language is the best, why can't we invent a way to let components implemented in different languages talk to each other easily? We have pipes, sockets and message queues, but it's never simple enough to glue everything together.



The point of languages is not what you can do, but what you cannot do---making mistakes impossible so you can trust your own, and especially others', work (libraries).

Combining languages is likely to circumvent those prohibitions, defeating the purposes of the thrown-together languages. Your polyglot Tower of Babel is likely to fail.

See https://news.ycombinator.com/item?id=11830958.

-----

Now I'd love to see a deep Rust<->Haskell bridge. But getting this right without breaking the interesting invariants of either language is research-level work.


This is an interesting idea that has been explored before in VMS, an old operating system that I believe competed with UNIX.

VMS had a feature called CLE (Common Language Environment) [1] which defined calling conventions for computing primitives (functions, registers, stacks...you get it) independent of any language. You could call bits of code from all sorts of languages like COBOL, FORTRAN, C, and some others I'm not really familiar with. Because the calling conventions were specifically designed with language interoperability in mind, VMS itself could be implemented in several different languages. Different components were coded in whatever language best expressed them. This directly contrasts with Unix, which we all know champions C.

I'm not too familiar with Unix calling convention specifics, but as I understand it, they revolve around C and its memory model. I believe this is what gives some languages difficulty "talking" with each other; if a language doesn't have a memory or execution model close to C's, it needs to translate through an FFI (Foreign Function Interface) [2] before it can exchange execution routines efficiently.
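To make the "translate through an FFI" step concrete, here is a minimal sketch using Python's ctypes. It assumes a POSIX system where `ctypes.CDLL(None)` exposes the C library already linked into the process; the sample string is just an illustration. Note that the caller has to restate the C signature by hand and convert the string into C's representation before the call.

```python
import ctypes

# Assumption: POSIX. Passing None loads the symbols of the running process,
# which include the C library. This will not work as-is on Windows.
libc = ctypes.CDLL(None)

# Restate the C signature by hand: size_t strlen(const char *s).
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

text = "h\u00e9llo"              # a Python str: a sequence of Unicode code points
encoded = text.encode("utf-8")   # translated into C's model: NUL-terminated bytes
c_length = libc.strlen(encoded)  # C counts bytes, not characters

print(len(text), c_length)
```

The two lengths differ because the two sides do not agree on what a string is, which is exactly the kind of translation the FFI layer leaves to the caller.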

[1](https://en.wikipedia.org/wiki/OpenVMS#Common_Language_Enviro...)

[2] (https://en.wikipedia.org/wiki/Foreign_function_interface)


I think Unix's pipes are a better example of how components could communicate. It's a pity that due to terminal limits, the best we can do about connecting components in Unix is to pipe things through.
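As a rough sketch of what piping between components looks like in practice: the child below is a second Python process standing in for a component written in any language, and the newline-separated decimal format is an assumption both sides have to share by convention, since the pipe itself only carries bytes.

```python
import subprocess
import sys

# Hypothetical downstream component: any program that reads stdin would do.
# A second Python interpreter stands in for "a component in another language".
child_source = "import sys; print(sum(int(line) for line in sys.stdin))"

values = [1, 2, 2**65]
# The wire format (one decimal integer per line) is a convention, not
# something the pipe enforces.
payload = "\n".join(str(v) for v in values) + "\n"

result = subprocess.run(
    [sys.executable, "-c", child_source],
    input=payload,
    capture_output=True,
    text=True,
    check=True,
)
total = int(result.stdout)
print(total)
```

This only works for the large value because the receiving side happens to have arbitrary-precision integers; the pipe would deliver the same text to a receiver that cannot represent it.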


Microsoft has sort of been working on this for decades. It started as Dynamic Data Exchange in Windows 3.x, then evolved into Object Linking and Embedding and Component Object Model, which in turn became the basis for ActiveX, Distributed COM, and COM+.

Reactions vary.

I believe GNOME was originally envisioned as providing a GNU framework for this kind of functionality (whence the "Object Model Environment" in the original acronym expansion), but I think those particular ambitions have mostly been abandoned.


The problem isn't that communication is hard, it's that semantics matter, we want different semantics between languages, and there is always difficulty bridging those gaps.

As a simple example, consider the JSON string

    {"a": 36893488147419103232}

That's 2 to the power of 65. Let's decode that in Python. What do I get?

    >>> import json
    >>> json.loads('{"a": 36893488147419103232}')
    {'a': 36893488147419103232}
    >>> json.loads('{"a": 36893488147419103232}')["a"]
    36893488147419103232

Ok. Let's decode that in Go: https://play.golang.org/p/FXESipFeZI

    json: cannot unmarshal number 36893488147419103232 into Go value of type int64

What we're seeing here is a fundamental semantic difference between the languages and how they represent numbers. Go's int64 is a machine-level type defined by the fixed amount of memory it occupies, so a value that doesn't fit is an error. Python's int is an arbitrary-precision number and can decode that without any fuss.
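For comparison, here is a sketch of what imposing the fixed-width policy on the Python side might look like, using the `parse_int` hook of `json.loads`. The helper name `parse_int64` and the error text are made up for illustration; this is not how Go's decoder is implemented, just the same rule restated.

```python
import json

INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1

def parse_int64(literal: str) -> int:
    """Hypothetical hook: accept only integers that fit in a signed 64-bit slot."""
    value = int(literal)
    if not INT64_MIN <= value <= INT64_MAX:
        raise OverflowError(f"number {literal} does not fit in int64")
    return value

doc = '{"a": 36893488147419103232}'

# Default policy: arbitrary precision, decodes without complaint.
permissive = json.loads(doc)["a"]

# Fixed-width policy: the same document is now rejected.
try:
    json.loads(doc, parse_int=parse_int64)
    strict_error = None
except OverflowError as exc:
    strict_error = str(exc)

print(permissive, strict_error)
```

Same bytes on the wire, two different answers, and the difference lives entirely in the policy each side chose.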

For bonus points, it's worth pointing out that while the JSON specification itself provides no limits on the size of numbers, it is generally unwise to use JSON numbers that can't be represented as IEEE 64-bit floats because there are a lot of languages that will get that wrong. I can also cause issues by sending large ints out of Python or Go into a language that expects JSON numbers to actually be floats, and then lossily decodes them in the process. The JSON spec theoretically doesn't have a problem here but the "real world" JSON spec is quite messy in this area.
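A small sketch of that lossy decode, staying in Python and emulating a float-only receiver by passing `parse_int=float` to `json.loads` (the emulation is an assumption for illustration; a real receiver would be a different language's parser):

```python
import json

# 2**53 + 1 is the smallest positive integer a 64-bit float cannot hold exactly.
big = 2**53 + 1
doc = json.dumps({"a": big})

# Python's default decoder keeps the integer exact.
exact = json.loads(doc)["a"]

# Emulated float-only decoder: every JSON number becomes an IEEE double.
as_float = json.loads(doc, parse_int=float)["a"]
round_tripped = int(as_float)

print(exact, round_tripped, exact - round_tripped)
```

Nothing raises here; the value just quietly changes by one. Amusingly, 2 to the power of 65 itself would survive this trip, because powers of two are exactly representable as doubles, which makes this class of bug easy to miss in testing.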

My point here is not that these problems can't be solved. They all can be solved, at least on a case-by-case basis (two specific programs communicating with JSON). My point here is A: these problems exist B: these problems generally don't admit of practical generalized solutions (for instance, you'll find "programs must use BigInt libraries and programmers must understand all implications of that to use any JSON parser" isn't going to fly, and that would still be ignoring issues I could go on about for some paragraphs) and C: these issues are belligerent and numerous.

This is one minor issue in a corner of the JSON spec between two languages I happened to pick. This is not the exhaustive listing of such issues, this is merely one of thousands of examples you could construct between Python and Go alone. (In fact, it isn't even necessarily a mismatch between "the two languages" so much as "the two languages and the particular JSON parsing library", which means it's even worse than it sounds; I could rotate JSON libraries in either language and potentially get other issues!)

Here's another thing that isn't really an issue so much as a meta-issue between all sorts of language pairings: How do you pass a value across different memory management schemes? That is, how do you pass a value from a GC'ed language to a non-GC'ed language and back? Bear in mind that "GC'ed language" and "non-GC'ed language" are both themselves categories, and the details of both of those things matter a lot. Python and Go are both garbage collected, but you still can't pass values back and forth between them even if you jam them both into the same OS process!

You die the death of a thousand cuts trying to fix these issues between even two languages, then it gets worse if you try to pull more into the fold.

So what you end up with in practice is a protocol that is set at the OS level, writes into stone a whole lot of semantics that deeply, deeply affect the sort of code that can use them, and then those semantics bend the design of every language written on top of them. Right now, on Linux that language is C, Windows has C++ and a .NET runtime, and other people can pipe up with the base language of other OSes. So Linux has a ton of languages that, for all their glory and features and libraries, are ultimately just C with really, really pretty wrappers: Python, Perl, Lua, PHP, etc. Then these languages can communicate on the "C bus". The biggest exception I know of is Java, which is the biggest runtime that has become its own ecosystem without having an OS to go with it, so you get languages that are ultimately just Java (or JVM if you prefer) with really, really pretty wrappers: Scala, Groovy, Clojure, etc. This is the only solution that I'd say has ever achieved any sort of scale, but you still get islands between the languages: the .NET island, the C island, the Java island, a lot of little islands from languages that have their own runtimes which are really cool but can't be expressed on "the C bus" very well, etc.

Personally I think one of UNIX's major problems right now is managing to escape from the "C bus". Back on the original topic, Rust is one of the most interesting stories I've seen in a while there; it can operate on that bus and even provide functionality, while using its type system to escape from the fundamental weaknesses that being on the "C bus" usually entails with memory unsafety, use of dangerous pointer semantics, etc. I think there's a distinct possibility Rust may be able to "bootstrap" us out of there in a way no other language has yet managed to.


Just as a nit, you can deserialize this number in Go just fine with big.Int: https://play.golang.org/p/s1EFgyXxl5.


As I said, all the problems can be solved on a case-by-case basis. But you can't just replace all numbers in JSON with big.Int, because you'll miss floats. You can't just use big.Float for all JSON decoding, because you'll trash performance, something that Go users care about more than Python users (because in Python you've already accepted that your code is going to be slow relative to C). You can solve each problem on a case-by-case basis, but to provide a general solution that makes everybody happy is impossible, which is what you want for this big ol' happy "let's just let everybody call any function they want" plan to work.



