The Rise and Fall of CORBA (2006) (acm.org)
67 points by ptx on Aug 31, 2015 | 56 comments


As chief scientist at a start-up in the '90s, I occasionally had to attend standardization meetings to protect our products. This included some of the very early CORBA meetings. Without participating, we risked becoming non-compliant with evolving standards in our space, so I had to attend and keep up with the direction things were going.

Companies like DEC, HP, and IBM might send five or six people to the important meetings, and consequently it was easier for them to steer standards in a way that favored the architectures of their own products. While I met a few very bright systems architects genuinely interested in coming up with a design that would deliver the benefits of industry-wide standards, many of the participants couldn't grasp good design or simply acted as partisans for the sole good of their own company. What struck me was that the attendees were mostly "goers," not "doers." The organizations' real developers and architects were busy working on real products, too busy to attend the excruciating standards meetings. The goers, on the other hand, tended to come from the "planning" organizations in these large companies. These planners often had very little background as developers and little insight into realistic requirements.

At the CORBA meetings, powerful members couldn't agree on the lowest levels of the protocol. Would it run over UDP or TCP? What about IP vs. Token Ring vs. OSI networking architectures? At a slightly higher level, some companies had their own, completely incompatible RPC, data marshaling, and security efforts well under way, and they simply wouldn't sign off on any standardization below the application or presentation layers of the protocol. This was crazy; CORBA was burdened with an architecture requiring standardization at the application layer between products that couldn't actually communicate with each other.

Hopelessly deadlocked at the most fundamental communication levels, the key people involved needed progress to keep CORBA alive, so the effort moved forward on standardizing things like nested distributed transactions--trying to run before it could walk.


This sentence shows a trap most technical people keep falling into:

"These arguments cannot fully account for CORBA’s loss of popularity, however. After all, if the technology had been as compelling as was originally envisaged, it is unlikely that customers would have dropped it in favor of alternatives."

Wrong. Those arguments more than explain the failure.

CORBA, like most technology, doesn't succeed or fail because of how "compelling" or repulsive it is. People use terrible technology all the time. Major applications are written in terrible technology. Businesses make huge amounts of money off terrible technology. Great technology is skipped all the time.

As engineers we care about this kind of thing. The problem is that the vast majority of people in a business don't give a shit what technology stack you use.

Technology does not exist in a bubble, where it lives or dies in some kind of hippy meritocracy. Technology exists in the real world where other factors tend to dominate far more than we think they should. This is also why when engineers start companies, they quickly learn that 90% of running a business has nothing to do with technology or development.


I agree with your points, but I think the paper was probably referring to actual problems in CORBA which were very obvious to everyone - technical and non-technical - and caused a lot of pushback.

For example, the original C++ bindings were a train wreck: All C++ types including strings got a second CORBA::.. definition, so you couldn't interoperate with other C++ code easily. You had to write pages of boilerplate to do even the simplest thing. There were lots of traps which could cause your program to crash or leak memory, causing troublesome bugs in production systems. This affected schedules (and hence marketing) and field support.

Another example was that we couldn't get different suppliers' ORBs to talk to each other - which is, like, the whole point of a standard communication broker (right?).


Since then, XML web services have Risen and Fallen; JSON seems to have just peaked, with the complex, XML-Schema-like JSON Schema finally seeming to gain traction.

The cynical common wisdom was that people happily adopt a new simpler technology, until they finally understand the problem... and to address it, the new technology must become just as complex as the old one.

The article's main point, about non-competitive standards-setting is a good one, but can only work for non-competitive technology. This is happening to some extent, as infrastructure is commoditized as a complement.


I agree that getting just the right amount of simplicity is actually a hard challenge. JSON over HTTP feels like it has won because it was so trivial to implement in any language; however, experience has taught me that the lack of a schema causes problems over time.

At StackHut (www.stackhut.com) we're using JSON with a lightweight schema. So far this is working very well, providing simple yet 'typed' remote interfaces into containers. However, I often wonder whether to switch to XML/XML Schema/XML-RPC, Protocol Buffers/gRPC, or some custom system in the future, or whether to just keep it simple and not over-complicate things.
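Roughly, the "lightweight schema" idea amounts to declaring the expected parameter types per method and checking them before dispatch. A minimal Python sketch (purely illustrative - the method name and types are made up for the example, and this is not how StackHut actually implements it):

  # Illustrative only: a hand-rolled "lightweight schema" for JSON-RPC-style calls.
  # The method name and parameter types here are made up for the example.
  SCHEMA = {"renderWebpage": [str, int, int]}   # url, width, height
  def check_call(method, params):
      """Raise if params don't match the declared types for method."""
      expected = SCHEMA.get(method)
      if expected is None:
          raise ValueError("unknown method: %s" % method)
      if len(params) != len(expected):
          raise TypeError("%s expects %d params" % (method, len(expected)))
      for value, typ in zip(params, expected):
          # bool is a subclass of int in Python, so reject it explicitly for int params
          if (typ is int and isinstance(value, bool)) or not isinstance(value, typ):
              raise TypeError("expected %s, got %r" % (typ.__name__, value))
  check_call("renderWebpage", ["https://stackhut.com", 711, 393])   # passes silently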


I think what happened is that people building a complex system, who knew they needed schemas etc., would just use the XML/ws-* ecosystem, because as repugnant as some might feel it is, it is all written, debugged, and works. If people didn't need complexity, they used JSON.

The XML option kept complexity out of JSON.

However, it's inevitable that systems starting out simple (using JSON) become complex - perhaps partly due to dramatic success. And converting everything to XML would be time-consuming and error-prone and... repugnant. So. This is where the demand for json-schema comes from. And it does seem inevitable: the inertia of backward compatibility is one of the more predictable features of software.

I can make the hopeful observation that things aren't quite as bad as the cynic take above: people do learn from some of the mistakes of previous technologies. There is some progress.

I dislike JSON-schema because it's like a JSON version of XML Schema. I think a simpler schema would better serve JSON and its typical uses.

Is your "lightweight schema" a simple schema written in json-schema? Or written in a lightweight schema language?


Sorry for the delay - was on a break from HN

Yes, I think you are right, the availability of XML/ws-* acted as a magnet for people who required extensive schemas, etc.

I agree that JSON is a great starting place for the rest of us who don't have such immediate needs for complexity and can get by with it. But I think software growth eventually pushes them to a more complex interchange format, e.g. JSON schema.

I think there is progress, with JSON on the simpler side and now newer formats like ProtoBufs and Thrift and mechanisms such as RPC - we do seem to be learning from the past. It does feel that we swing from one extreme to another: first RPC was great, then CORBA came and went, after which the perception was that it was utterly unsuitable for anything, until perhaps the introduction of ProtoBufs, Thrift, JSON-RPC and so on. I personally think it can be incredibly useful, but deciding on just the right features to keep things manageable is incredibly difficult (more so than the tech itself, I believe).

We're no fans of JSON-Schema either - I've thought about it a few times but it feels over-complicated. Instead we've settled on (and forked) a criminally overlooked system called Barrister RPC (http://barrister.bitmechanic.com/). It just supports basic JSON types, structs built from them as aggregates, and optional nullability. It has worked great so far, although we may expand it shortly to add more numeric types. You can try it live at http://www.stackhut.com (source at http://www.github.com/StackHut) - would love to hear your thoughts re the schema/RPC layer.


(This is my third reply.) Because I researched StackHut (HN, Quora, your website), I feel compelled to offer (unsolicited) advice: I couldn't tell what your selling point was - as a commenter on one of your HN submissions also said. I see you've pivoted/changed it around a bit over the last 2-3 months, so that makes it harder. Here's a three-step exercise that might help:

(1). What did you feel was cool about your basic idea (which seems to be to "wrap Docker and Kubernetes for easy microservice deployment in the cloud")? i.e. what excited you, what's really cool, about the technology idea, the original technological impetus?

(2). What benefit would other people feel if they were completely ignorant and had never heard of Docker or Kubernetes or Heroku etc.? i.e. aiming here at the basic benefit. To illustrate, "easier to use than Kubernetes" doesn't mean anything to someone who's never heard of Kubernetes or its difficulties. (Many enterprise customers would be in exactly this perfect state of ignorance - they have enough to deal with in their actual domain!)

It wouldn't hurt to also consider what benefit they get from cloud hosting (assume they'd never heard of it, as unlikely as that is). Think of yourself not as converting fellow experts from other technologies, but as gaining adherents for the first time - they don't actually want your tech. They want the cloud, and you are just the ladder.

(3). Finally, think about target users again, now not as completely ignorant, but in terms of their current situation: their needs (what their bosses/customers need), their plans, how they presently do things, and especially what technologies they have actually heard of and basically understand. [ For example, I've heard of the cloud and Docker, but I don't really know what the big deal about Docker is. I've heard of "Kubernetes", but I don't know what it is. ] Now craft a message that speaks to these people, given their present situation and perspective.

There will actually be several different target customers, with different situations and knowledge, that would require different messages to really speak to them.

For example, I'd like a really simple way to write a bit of code and get it hosted in the cloud, for free, to make a simple webservice (or website). It seems it could be (should be) simple, but it isn't. That would be sensational! And great publicity for paid users (freemium, as you're already using).


Hi - thanks so much for all the feedback and advice. Have just seen it and will reply fully in the morning (we've been super busy pitching for our accelerator's demo day in the UK - which is an awful excuse!)

Really really appreciate you taking the time out to reply and to investigate the tool. Would be great to get on a call sometime and discuss your thoughts further - my email is my HN username @stackhut.com. Thanks again!


Thanks so much.

Yes, we've pivoted a few times, first starting out trying to be the 'GitHub for live APIs', which is an idea we love, but we found that the chicken-and-egg situation was tricky to overcome, and that most users just wanted a way to deploy their code to the cloud to support a webservice.

As such we pivoted to building a shared hosting service that lets users just write their code and deploy it.

Regarding your 3 questions, my preliminary responses would be (bear in mind we probably would need to think about this further!),

1) We were super excited by the idea based on our own use cases - I found it a nightmare trying to build a distributed back-end to process video in the cloud for a previous startup. This got in the way of building my product but had to be done. Similarly, my co-founder ran into the same issue when he needed to perform web rendering on the server and just needed an easy way to wrap up this functionality. At that point we thought perhaps we could use containers to wrap up the code, but found the container ecosystem was focused heavily on dev-ops rather than services, so we built this.

2) Yes, totally agree - our messaging needs a lot of work, we've actually become a lot better in real life but unfortunately haven't updated the website to reflect this!

The basic problem is that deploying your code to the cloud is hard. After writing your code locally, you have to do all kinds of orchestration to get your code into the cloud, run it at scale, handle faults, client integration, and more. We feel we can make this simpler.

As for why you'd want to move your code to the cloud, hmm, that's a good question. Why the cloud? You don't worry about physical infrastructure, you have the capacity and capability to scale compute resources dynamically, you can ignore hardware maintenance, and you amortise sys-ops costs with other users.

Getting across why users would want their code in the cloud, if they even know what the cloud is, is tricky. Absolutely, becoming an easy way to get code into the cloud followed after this, as we realised just how hard it was. We've moved to the cloud but our tools haven't.

3) Yes, this is the hard bit. Lots of users have existing code, perhaps in the cloud already or deployed locally. They don't know or care about Docker, Kubernetes, etc. - and shouldn't have to. We think Docker is a great building block to build services on top of, but we should be able to abstract it away and build on top of it. Eventually we'd hope to get across our core message - deploy your code to the cloud as easily as running it locally - without ever mentioning containers, Docker, and so on. Perhaps we can do that now; we just need to update our message to get that across.

For now we're thinking of targeting client-side JS developers as an initial niche and going deep with that while we figure out our tech and message. But yes, the core message is the same - a really simple way to write some code and get it hosted in the cloud as simply as possible (and yes, with a free tier too). It can, and should, be simpler.


Thanks for such a considered reply!

"Github for live APIs" is a sensational idea! (BTW: Github adoption needed only git - already popular - and a browser.)

(1). I think it's significant that your motivating problems weren't to do with containers.

(2). I think "scale" is problematic, because although true and cool, only Amazon etc need it. But everyone loves simple, cheap and easy.

> We've moved to the cloud but our tools haven't.

(I know you meant tools in general, but) Although your StackHut tools are in the cloud, their local installation makes them seem local to users. And is a barrier to adoption. "Effortless to try" helps adoption.

Why not have webpage access, like github? If you have an API, wouldn't a (simple) webpage driver be easy? Start with a super-simple demo, with only entering the code (like the code for your python eg), automating the rest (derive IDL and fixed dependencies). Give it a hash URL so users can actually use it from their app. Don't require email/password til the user's hooked, and want more customization.

Of course, local dev is important for debugging and iterating (with familiar tools, not some textarea/ace editor). But if your containers spin up so quickly... is it actually crazy to develop and debug in the cloud? At least, for a demo?

BTW: did you see AWS Lambda's demo for code snippets: http://squirrelbin.com/ (it's so slow because they round-trip to the cloud for each save, load and run - instead of save+run in the background, leaving you in the editor. Actually, they use JS not py, so they could do the whole thing "locally" in the browser, not needing the cloud until you've finished iterating/debugging, to save. NB: you can't run their code snippets as a service, so they totally missed the cool opportunity "save to deploy!". Their article: https://aws.amazon.com/blogs/compute/the-squirrelbin-archite... )

(3). If you do focus on JS, it can run in the browser... making it fast to iterate/debug, then "save to deploy".

---

For 10 years, I've thought components over multi-core (now "microservices") will be a revolution - so complete, that even entirely local software will use the architecture. This is because components are modular; and services are a way to distribute computing over multiple cores.

But the arguments I see for microservices are not compelling. Scale isn't needed by most people. Multiple languages are not a key issue for most (and solved by RDB, XML, middleware etc anyway). And it creates new problems.

Yet, I still feel it's revolutionary. And I think your approach (and AWS Lambda) may be the answer: it's not because it's clever and cool, but because it will be cheap. I think it will be "Cheap" for the same reasons the services in the cloud are cheap, but finer grained.

For devs, the finer grains enable components to be loaded only when needed, and only with the dependencies and RAM and infrastructure etc needed. Sort of, iOS's app thinning, or Go's tree shaking, but for resources, at the component level. Needn't be done meticulously to get real benefit - often as easy as "our business logic is simple, but the image processing part needs 8GB RAM".

Similarly, cloud vendors can sell more capacity if it is finer grained (because, geometrically, there are fewer gaps between the grains). They can therefore sell it cheaper. It is simply more efficient for both devs and vendors.

Is that how it looks from the inside, too? Is it significantly more efficient, and therefore cheaper?

Because cheaper, simpler, easier is the history of computing (and all technology).

PS: I'm taking a break before replying to the other one - glad you like "json by eg" schema.


I meant to reply here but replied to the wrong comment. This -> https://news.ycombinator.com/item?id=10208989 is meant to be here.


[cont] On second thoughts, "scale" is important, because multi-core: (1). spinning up an instance per request is far slower than local, but the second, third, thousandth is the same speed. At some point, the cloud is faster, because it's multicore. (2). the threshold of "fast enough", which means we needn't actually beat local; (3). and the cloud is cheaper.

I was thinking of "web scale" before.


I read through all those links, and even got your example working (on a phone - no curl etc. - so I had to write a little Java HTTP client). Consequently, this comment is long. I hope it's useful to you!

BTW: I personally would like to see your idea usable on a phone - without a full local machine (iterating might be a pain, but your system sounds really fast). Not just serverless, but machineless! A currently underserved niche.

> deciding on just the right features to keep things manageable is incredibly difficult (more so than the tech itself I believe).

I agree implementation is the easier part, though we've been stuck so long, I think there must be a simpler way to look at the whole thing, involving some mathematical or algorithmic insight (as relational algebra did for databases).

Barrister: I sometimes get lost in special cases, and forget the main point that provides the fundamental help to people: Barrister seems feature-full, but from reading the first paragraph or so, it seems to only output docs - because that's all they say it does! If the reader already knows what Thrift etc. do (i.e. something of an expert, across the field), they could guess that maybe Barrister does more... but the users who just want the main thing you do are easier recruits. Perhaps this is partly why it's overlooked...

StackHut: I'm not sure you need a separate IDL, if it is generated automatically - why confuse the user with it? It seems like a more sophisticated customization tool that you could leave aside for later? The IDL itself is pretty clear, and although I'd thought about (eg) Java classes as defining a schema, I hadn't made the connection that an IDL (as in CORBA) also does that.

(1). RPC format. I'm familiar with OO serialization and schema languages (unfinished PhD, book chapters, a library and a business), but less so with RPC - so maybe I'm off-base here. And standards - even nascent standards - may be worth complying with. But why not omit the meta stuff, and make it even simpler:

  {
    "stackhut/web-tools":
    {
      "renderWebpage": ["https://stackhut.com", 711, 393]
    }
  }
I'm really just wondering if there's a strong reason for the metadata. It can be helpful to orient the reader, but here it is clear from context - the keys and how it's being used. Maybe there are other optional fields you sometimes need?

But... for your use-case, perhaps it doesn't matter that much, as the bindings hide it from users (but the point of text protocols is human readable, eg for debugging; so the simpler the better). You could use XML, or a binary protocol.

(2). JSON by example: This is my great idea for JSON schema, which I'm amazed no one has done yet: instead of another meta-format, do it by example:

Because JSON primitive values are typed, you can use a value to signify type. An object therefore also implicitly defines its type (like a java class). For example, the above JSON can also be used as a schema, because it indicates the two nested objects required, and the types of the primitives (string, number, number). Though I suggest a convention of using "", 0 and false as values.

Those zero values help convey that it's a type, not a value. It's tempting to want to encode information in the value itself (as opposed to the type), such as a default value - but that may be a mistake, because there is so much more that can be done with strings than with numbers. Keep it super simple.

[ Arrays are usually not fixed-length, enabling the next trick to encode optional values and polymorphism. ] The JSON spec allows duplicate keys: so you include the same key with different types for all the polymorphic types. Specifically, you include the null value to indicate it is optional. e.g. version is an optional string:

  {
     "version": "",
     "version": null
  }
This is a bit dodgy, because all JSON parsers simply return one value if duplicate keys are found. You need to write your own parser. But it is valid JSON - and more importantly, it looks like valid JSON. That's the key idea of "by example" - it looks like what it represents; there's minimal cognitive leap from type to instance.

It's rare to want polymorphic primitive types (e.g. string and number), so this is more for polymorphic objects. Unlike the common trick of a "type" field for nominal polymorphism, this is structural polymorphism - where only the differing fields distinguish types. NB: there are some tricky cases here, when the fields overlap, and I'm not sure that client code would want to mess with it.

Finally those non-fixed length arrays: polymorphism is represented by the types of a set of values in the array. In other words, the values aren't ordered, but just represent the permissible types. eg:

  [
    { "image_url":"", "width":0, "height":0 },
    { "text": "" },
    { "link_url":"", "link_text":""}
  ]
That's a schema for an arbitrary-length list that can contain instances of those three types of objects, with those mandatory fields, of those primitive types.
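To make the "by example" idea concrete, here is a minimal validator sketch in Python for the level-1 rules so far (zero-ish primitive values stand for types, objects recurse, a list in the schema is the set of permissible element types). It deliberately skips the duplicate-key/optional trick, since stock JSON parsers collapse duplicate keys:

  # Sketch of a validator for the "by example" schema above (level 1 only).
  def matches(schema, value):
      if isinstance(schema, bool):              # false in the schema means "a boolean"
          return isinstance(value, bool)
      if isinstance(schema, str):               # "" means "a string"
          return isinstance(value, str)
      if isinstance(schema, (int, float)):      # 0 means "a number"
          return isinstance(value, (int, float)) and not isinstance(value, bool)
      if isinstance(schema, dict):              # object: same keys, each field matches
          return (isinstance(value, dict) and set(value) == set(schema)
                  and all(matches(schema[k], value[k]) for k in schema))
      if isinstance(schema, list):              # array: every element matches some example type
          return isinstance(value, list) and all(any(matches(s, v) for s in schema) for v in value)
      return False
  page_schema = [{"image_url": "", "width": 0, "height": 0}, {"text": ""}, {"link_url": "", "link_text": ""}]
  print(matches(page_schema, [{"text": "hi"}, {"image_url": "x.png", "width": 10, "height": 20}]))  # True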

NB: for a schema of the RPC "header" above, the fixed length array schema represents a fixed length array - a special case.

(3). primitive datatypes: This idea can be extended, in a second level, with explicit primitive datatypes - this is the next level of schema power that everyone wants. Every value is now a string, and looks something like: "url", "date", "email", and then gets closer to XS, with ranges like "int:1..31" and even those ridiculous regexes defining valid values (a great idea, awful in practice, like the regex for email). The key thing is that it still is JSON and looks like JSON, since the datatype specification language is just a JSON primitive value (a string), and the syntax and meaning are obvious and familiar.

BTW minor typo on your website: s/intergrated/integrated/ (like integer)


Oh, can't derive IDL from JS, because no static types. Could add type hints, or type inference (partial), or just an actual IDL. Still easy to use!

JS (and py) devs aren't keen on static types... might be a hard sell.

Java seems a perfect fit: static typed, adopted. Also cloud-like "application servers". But already has CORBA and IDL's and serialization - just not popular (why not? maybe goes back to our previous comments) What advantages do you have over that? Uncool, but might be an easy sell.

BTW: There's something tantalizingly puzzling about the whole "JS static types for RPC" thing... not dynamic, not REST... Big picture is not yet clear to me.

REST is popular (eg http://swagger.io to generate APIs and docs); static types mismatch JS, but help RPC (very interesting and popular thread from 3 days ago on "statically linking" microservices - enterprise people concerned https://www.reddit.com/r/programming/comments/3k8sb2/how_we_... )


Wow - thanks so much for going through all the links and getting an example working - on a phone no less!! :) Very impressive! Hmm, machineless, now that is interesting - I imagine that with phone processors such a thing will become possible, if it isn't already.

Yes! I wish there was a way to really define and lock-down exactly what communication/serialisation primitives are required to aid communication between systems. Though I imagine you are right in that we've come quite far with common practices.

Haha - yes the Barrister docs can be a bit confusing - we are getting around to writing a simpler version of them with common examples and use-cases. We really like the schema itself as it does map onto JSON semantics quite nicely.

1) Yep, the messaging format could def be simpler! We just felt that as we're starting out it would be better to stick with standards, even if, as you say, they are nascent. This reduces the amount of things we have to do, but we also hope that people much smarter than us have thought about the issues involved! Hopefully the client-side bindings will hide most of it, but if needed it's nice to know you can drop down to the JSON format.

2) Hmm, JSON by example, this is super interesting!! I fully understand where you are coming from and it seems so much simpler than JSON-Schema (I was never a fan). Using values as types is quite elegant, and reminds me of some of the type-level programming stuff in functional circles.

The use of multiple entries in a JSON object seems like a nice way to express sum types - something we are really keen to add to the Barrister IDL (you can extend from an object but I believe there is only a single tree). As you note, although this is valid JSON, it may require a custom parser to extract. I'd love to hear more about this technique tho, has it been used elsewhere or is there any further documentation?

3) Primitives - yes, it'd certainly be possible to specify other primitives by encoding them in the JSON string value. Could be risky, as you suggest, when you start looking at regexes and so on - I've not been the biggest fan of defining these as types in the past. Also, how would one go about defining a user-defined type at the equivalent level of a primitive? Super interesting tho, if you have any more thoughts on this I'd love to read them.


Rereading, my comment was utterly misleading about "machineless"! I meant no local development. All in the cloud + browser (or another client). So a phone just needs a browser. [see my other reply]

I missed the Barrister IDL-JSON binding... kinda important! I must check how they bind choice/sum/polymorphism to JSON...

2) JSON by example schema: Thanks! It's never been used; the above comment is the only documentation. Maybe I should write a RFC... or at least first draft of a spec. And a reference implementation.

Actually using this schema to validate JSON instances requires using all the fields to determine which branch you have. They can have fields with the same name, provided the entire set is distinct. So these are OK (letters as fieldnames):

  {a,x}+{a,y}; {a,b}+{a}; {a}+{} (empty object)   
Thinking further, duplicate fields imply a different data model, and most JSON parsers will use a hash. Maybe it would ease adoption to fake up a syntax (eg addr-1, addr-2, addr-3 for the "same" field, with escaping rules for '-'). I wonder about usage: maybe apps don't use this approach (same field can have different kinds/types of values); maybe they use a field for each type, only having a value for one of them? (or an explicit "type" field). It's important to model common practice.

But I think JS coders do use polymorphic behaviour (same method names with different code).

3) Add more primitive types. Thinking further, although people end up wanting more precise primitive types for storing data, if JSON is mainly used for transferring values between languages, it really need only be as expressive as the languages themselves, which generally don't go into details like the syntactic nature of primitive values or ranges of integers and lists etc. (They wrap these concepts in objects; JSON can too.)

I'm stepping back from the idea - I just liked that you could add richer types without losing the property of looking like JSON - but you do lose the property of representing types with the types of JSON values. We can leave the extra sophistication to XML Schema (and JSON schema) for those who need it. Unless JSON itself adds more primitive types (which I doubt!).

PS: one more comment to go. I plan to reread your previous comments on CORBA etc now I have a better idea where you're coming from.


Re: schema/RPC layer

Barrister semantically seems like the java type system with minor syntactic differences like field/type order. It uses extends for choice/sum/polymorphism.

In addition to schema definition, it also has interfaces for RPC definition.

The only major limitation I noticed is it lacks recursion (cyclic type references). Important for completeness, but I don't think it's needed that often in practice - I'm interested if this is true.

One great thing about an IDL approach to schema definition, for RPC, is its similarity to objects/classes - because that is the domain it is used from. It's familiar to users, with minimal cognitive burden to switch between. (A similar argument applies to JSON - it's an "Object Notation".) In contrast, XML Schema looks utterly different from objects/classes.

I think there's even a hypothetical argument for making the IDL look exactly like java - not because it's great but because everyone knows it, even the haters (or, maybe borrow Python's type hint system - or whatever is familiar to your target users).

XML schema has a rich type system, both for structure (eg minOccurs) and primitives. But these aren't needed to represent objects, because programming type systems are poorer and lack those features.

---

I think your main problem is a very old one, of transferring primitive values between different languages (like different languages - even different C implementations - having 16-bit vs 32-bit ints). Text representations like XML and JSON mostly solve this, but of course they can still be too big. You might look at old solutions, like ASN.1, to see the problems that arise, but I think you're better off solving such problems when (and if) they occur - maybe they just won't come up in practice today because of conventions unconsciously established, even though theoretically they should. Just make it as easy as possible for users to get done what they want to do, and you'll get all the feedback/guidance you need.

Maybe you would like to have a type system subset that works for all languages - but this would just hamper those in the one language, or between two languages that are close. OTOH some enterprises like future proof options, in case they need to support other languages.

Would love to hear your thoughts and experience on all these issues.


Yes, we've been very happy with Barrister so far, in that it provides a simple syntax for RPC definitions that looks very similar to Java, so it is not hard to learn. And yes, keeping it similar to standard function/object calls lowers the cognitive load.

Hmm, yep recursion not supported atm, although we are thinking of extending the basic system shortly - there are a lot of other features we'd like to add - true sum types, a dynamic operator, and more.

Python annotations are nice, and a lot of users have asked for a way to annotate RPC functions in their code. Doing this all in a cross-language way will be hard and so we're thinking of keeping the external definition in the meantime.

--

Thanks so much for your advice. We're certainly looking at past attempts, and the research on transferring data is vast - I think it boils down to a trade-off between expressiveness and usability - and for now JSON seems to have found a pretty good sweet spot, perhaps with just a simple schema layer on top that will be suitable for most use-cases. I imagine Thrift, ProtoBufs, etc. will always beat it on speed, XML (+XML Schema) on expressiveness, and so on.

For us, as you have guessed, we are just trying to make it as easy as possible for the majority of users – particularly web and mobile developers. This does also mean we're trying to stay language agnostic and find a type subset that works across most languages - JSON primitives let us get some of the way there.

Thank you so much for all your advice - it's been so helpful and we're really happy you took the time out to try it all and record your thoughts. Would love to chat further - please feel free to drop me an email at HNusername @stackhut.com.


Do you find recursive types are needed in practice, for RPC? JSON lacks references (so it's an object tree, not a graph), but that hasn't stopped adoption...

What's a "dynamic operator" in this context?

---

> First RPC was great, then CORBA came and went, after which the perception was that it was utterly unsuitable for anything, until perhaps the introduction of ProtoBufs, Thrift, JSON-RPC and so on. I personally think it can be incredibly useful, but deciding on just the right features to keep things manageable is incredibly difficult (more so than the tech itself, I believe).

[from your grandparent comment] Yes, there's something in this space, but I'm not sure exactly what it is. Historically (there's pre-OOP papers on this), making the network invisible seemed really cool (still does!), but then all the Network Fallacies (https://wikipedia.org/wiki/Fallacies_of_distributed_computin...) got you. I think we've made real progress in general, e.g. JS programmers use async callbacks routinely, and there's promises; http is about dealing with the network. Not only is the tech worked out, but, perhaps more importantly, many coders are familiar with it. Whereas CORBA's superficial design defects (confusing API) obscured its fundamental design defects (distributed objects).

> trade-off between expression and usability - and for now JSON seems to have found pretty good sweet spot. Perhaps with just a simple schema layer on top that will be suitable for most use-cases.

Good summary!


JSON with a schema is still simpler than CORBA and DCOM, isn't it?

The former is simply an RPC mechanism whereas e.g. DCOM builds on top of MSRPC to provide distributed objects, which involves remotely creating instances of objects, which requires distributed reference counting, which adds all sorts of complicated timeouts and keep-alives to deal with network problems.


JSON with a schema seems broadly equivalent to SOAP.

I can believe that early implementations were buggy, but working with WS-* these days is actually very nice - nicer than JSON webservices, IMO.


"JSON schema seeming to finally gain traction"

Could you describe this further? Are you referring to http://json-schema.org/ or to more general efforts to bring typing to APIs, like GraphQL?


Sorry for late reply. I meant the former, json-schema. It's the one that seems to get used.

Other approaches might be better, esp for specific uses, but most people just want some typing, and json schema is closest to it. I hadn't heard of "GraphQL".


Of particular note, Joel Spolsky's missive on Architecture Astronauts: http://www.joelonsoftware.com/articles/fog0000000018.html


"the simplicity of component models, such as EJB" Sure the author must have been joking :-)

That aside, good points in this article. CORBA is emblematic of a time when elaborate architectures were created without any thought of actual implementation. At IBM in the early 1990s, we had an entire group of 40 people working on the architecture for a broadband network infrastructure who had never written a line of code.


> "the simplicity of component models, such as EJB" Sure the author must have been joking :-)

Perhaps not entirely. If that's the kind of problem you have, there may not be any possible solution that is much simpler than EJB.

There's a reason all these architectures wind up being horrible. The problem is horrible, and there's no simple, clean solution for it.


I bet their UML diagrams were magnificent, though.


In the early 2000s I worked within a CORBA system. Daemons written in C++, servers in Java, and the system was scripted via Python.

That you could instantiate an object from the C++ or Java sides in Python and do things with those services was actually quite fun. I was insulated from the difficulties of the CORBA implementation because the system came from a vendor, but bugs and performance issues were perpetual. And this system ran within the same local network behind the firewall.

One of the truly sucky things about this system was that it had its own version of Python server pages: Python and HTML intermixed within the same file, and woe betide you if you got your Python indentation messed up between blocks of HTML. It was ridiculously hard to debug.

And as the article points out, upgrading was a stop-the-world process where everything had to go offline, usually for hours. One of the other crippling "features" of this system was C++ objects persisted in the database as binary... man, those caused headaches. And half the data was in DB2 and half in an LDAP server. This vendor never met a technology they didn't want to throw into the mix.


Particularly interesting should be the part about design by committee.

Remove Corba references, and it might apply to many other past and current pieces of technology:

"There are no entry qualifications to participate in the standardization process. Some contributors are experts in the field, but, to be blunt, a large number of members barely understand the technology they are voting on. This repeatedly has led to the adoption of specifications with serious technical flaws."

“Vendors respond to RFPs even when they have known technical flaws. This may seem surprising. After all, why would a vendor propose a standard for something that is known to suffer technical problems? The reason is that vendors compete with each other for customers and are continuously jostling for position. The promise to respond to an RFP, even when it is clear that it contains serious problems, is sometimes used to gain favor (and, hopefully, contracts) with users.”


I find this one to be very familiar as well. I'm sure we could come up with our own examples and combined have dozens or more.

"Vendors sometimes attempt to block standardization of anything that would require a change to their existing products. This causes features that should be standardized to remain proprietary or to be too vaguely specified to be useful. Some vendors also neglect to distinguish standard features from proprietary ones, so customers stray into implementation-specific territory without warning."


CORBA's "failure" was the distributed part, and distribution is an essential element of the architecture. Unlike COM, where you could easily separate the OO (IDL) based aspect of the architecture from its distributed implemention (DCOM), CORBA was always assumed to be distributed.

The idea of implementation language agnostic binary IDL is still powerful though.


> The idea of implementation language agnostic binary IDL is still powerful though.

IDL is a downright nice specification language when compared to alternatives with a similar power (I'm looking at you, ASN.1!).


ASN.1 wasn't so bad. In the 90's, I worked for a little company, Gradient Technologies, where we modified Kerberos to add authentication via Security Dynamics' key fobs. I hadn't seen ASN.1 before, but found it wasn't much effort.


Nice walk down memory lane. We used Orbix in 1995 to build a production banking platform, one of the first true web-enabled banking services, and it was a nightmare. It wasn't necessarily the complexity. Our engineers could understand proxies, stubs, and bindings well enough, and when it all worked it worked well. But memory leaks on NT and mysterious performance issues sucked all the life out of the project. Over the next few years I watched CORBA wither away in the face of growing adoption of simpler web-oriented protocols, and it never made me even a little sad to see :).


I still maintain a piece of backend that I wrote 8-10 years ago using Python and omniORB. CORBA's performance was critical back then. But I also started using CORBA for parts where performance and network overhead were not that critical, which cluttered my code beyond all limits. I was not very experienced back then. I still struggle and feel pain when I see that old code. Also, omniORB has some weird memory-leak-related bug, which I could not figure out in many years. Sigh...

I have since migrated most of the components from CORBA to beanstalkd (an MQ server). This is a much simpler approach: testing is easier and the code flow is simple. I cannot recommend this simple and robust MQ server enough: beanstalkd.

For the part when one component wants to call another component in Python and there is no way I could plug a queue into the flow, I prefer to use Pyro. Dirty and simple.


How does CORBA compare to Apache Thrift or Google's protobuf? Or do they try to solve different things?


I consider CORBA's IDL/IIOP combo to be roughly identical to protobuf. CORBA's distributed RPC capability is like gRPC. At a higher level there are some differences but they are substantially isomorphic.


CORBA was terrible at versioning. If you tried to use a client with a mismatching server (e.g., the server's IDL added a field that the client didn't have), you would typically get segfaults.

Protobuf, on the other hand, builds in versioning right from the start. Every struct field is tagged with an ID, so unknown fields can simply be ignored, and marshalling/unmarshalling can be adaptive. It means that structs can be preserved across version boundaries; in theory, a "v1" client can read a "v2" struct, modify it, and pass it back with the "v2" fields intact, even though the client didn't know about them. (This requires that the client doesn't unmarshal the data into something that would lose the metadata, like a C struct.)
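To illustrate just the versioning property (a toy model, not the real protobuf wire encoding): if every field travels as a (tag, value) pair, an old client can decode the tags it knows, carry the unknown pairs along untouched, and re-emit them. A Python sketch:

  # Toy model of tag-based versioning (NOT the real protobuf wire encoding).
  # A "v1" client only knows tags 1 and 2; a "v2" peer also sends tag 3.
  KNOWN_V1 = {1: "id", 2: "name"}
  def decode_v1(pairs):
      """Split (tag, value) pairs into known fields and preserved unknown leftovers."""
      fields, unknown = {}, []
      for tag, value in pairs:
          if tag in KNOWN_V1:
              fields[KNOWN_V1[tag]] = value
          else:
              unknown.append((tag, value))      # kept, not dropped
      return fields, unknown
  def encode_v1(fields, unknown):
      """Re-emit known fields plus the unknown pairs, so v2 data survives a v1 round trip."""
      tag_of = {name: tag for tag, name in KNOWN_V1.items()}
      return [(tag_of[name], value) for name, value in fields.items()] + unknown
  wire_from_v2 = [(1, 42), (2, "corba"), (3, "added in v2")]
  fields, unknown = decode_v1(wire_from_v2)
  fields["name"] = "grpc"                       # the v1 client edits what it understands
  print(encode_v1(fields, unknown))             # [(1, 42), (2, 'grpc'), (3, 'added in v2')]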

Another big difference is that CORBA had IORs (Interoperable Object References), a kind of smart pointer to a remote object. With CORBA, as with DCOM, you could pass object references around and make method calls on them, and the calls would be transparently routed to the correct server. You could have client A get an object from B which got the object from C, and if A did a .foo() call on the object, it would call C. Of course, this leads to all sorts of issues, such as having to make sure objects stay alive for as long as any client (or server, since it goes both ways!) has a reference to it, and dealing with unresponsive clients/servers.

gRPC is much simpler in this respect, in that it's just RPC calls, pure data, no objects. In that sense, gRPC is closer to DCE RPC (the basis for Microsoft RPC, which was the underlying RPC technology of DCOM) [1].

[1] https://en.wikipedia.org/wiki/DCE/RPC


Yes, I know all of this. I never saw IORs as a truly necessary feature, although I see the attractiveness of the idea. I'm sure you could, if you desired, implement IOR-like behavior on top of other RPC systems.

I never really had problems with message versioning because I owned the client and server, and created new messages when I wanted new versions.


Protobuf-style versioning gets pretty useful when you're developing microservices — where there are potentially a whole bunch of apps that would otherwise have to be upgraded at the exact same time, even for adding optional fields.


Nice article. I remember reading about CORBA in my uni days. The article summarized it well: it became a bulky technology with very lengthy and confusing documentation. Industry, on the other hand, was ready to adopt simpler technology that focused on web-based protocols like HTTP (HTTPS). Then SOA came and changed the landscape completely with REST-based services, which relied on HTTP themselves. Finally, it comes down to simplicity. A technology that is too complex to implement and understand will not easily meet the demands of constant change and innovation. Hence, as mentioned, it is reduced to a niche technology.


You skip an important part where "web based protocols" means web services, which in turn was synonymous with SOAP. This architecture was based on remote method invocations and was dominant for the better part of a decade (if it isn't still, for some domains).


I thought the CORBA IDL and the IIOP were both great. It made the transition to protocol buffers and Google's RPC system pretty painless.


well, web services were a complete bust too!

It's interesting (as other commenters have noted) that now people implement rest interfaces and we don't seem to have any (other) standards for imposed (controlled) shared state.

Agent standards like FIPA were supposed to enable consensual shared state by agreement, but have never really been used to do that.

I guess the world really doesn't need these things yet?


In the end, simple, documented, and well-defined (even if the documentation/definition is simply the source you have access to) is easier to deal with than an overly complicated system that adds little value and a lot of cognitive overhead, confusion and indirection. This is why today's JSON services rule the roost, so to speak.

Not that there aren't attempts to create some standards... It's just that a lot of the time they don't add much value.


CORBA (and SOAP) had type safety, something which is missing from "modern" systems.

JSON is an ill-defined standard - integers, for example, are almost completely undefined, and this causes real problems in real programs (eg [1]).

[1] https://lists.gnu.org/archive/html/qemu-devel/2011-05/thread...


> CORBA (and SOAP) had type safety, something which is missing from "modern" systems.

You had the illusion of type safety. In practice, a robust distributed system needs to check all messages at each node -- ie, check types at runtime.

Any distributed system is, in fact, a dynamically typed system, no matter what the headline says.


Integers don't exist in JS/JSON... You get IEEE 754 double-precision floating-point numbers, meaning whole numbers between -(2^53 - 1) and (2^53 - 1) are supported without rounding issues.

If you need greater precision or fixed decimals, use a base10 string and an appropriate library in your application.
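A quick way to see the 2^53 boundary (and the string workaround) is from Python, which uses the same IEEE 754 doubles for its float type - a small sketch:

  import json
  from decimal import Decimal
  # 2**53 is where consecutive integers stop being exactly representable as doubles.
  print(float(2**53) == float(2**53 + 1))       # True - both collapse to the same double
  # Python's json module happens to parse integers exactly; a JS JSON.parse of the same
  # text would round it - which is itself the kind of interop gap being discussed.
  print(json.loads("9007199254740993"))         # 9007199254740993
  # The "base10 string plus an appropriate library" approach:
  print(Decimal("0.1") + Decimal("0.2"))        # 0.3, no binary rounding error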

I'd say the lack of strict date/location types in the spec is probably as big an issue, but standards for handling those have evolved as well... (GeoJSON and ISO 8601 date-time strings).


You've made my point with "If you need greater precision or fixed decimals, use a base10 string and an appropriate library in your application". In other words, it's not interoperable or type safe. BTW the JSON spec itself doesn't define integers at all. "Numbers are really floats" comes from Javascript and hence is just a convention for people who are using JSON from another language.

The larger problem is the lack of schemas. In JSON they exist but no one uses them. CORBA forced you to have a schema (IDL). SOAP had schemas, albeit very complex ones which no one really understood.


"JavaScript Object Notation" ... it makes perfect sense for JSON numbers to match JS numbers, which is IEEE spec.


I find the lack of strong type assertion and checking is always alarming... I get a real sense of vertigo without type checking.


Interesting article. CORBA is still hanging around in Avid's EuCon protocol for controlling audio applications. Not always that reliable (but hard to say where the blame lies there).


CORBA also persists in the Ada community (e.g. see http://www.adacore.com/polyorb/).


Those working on other standards should take heed.


I thought this was going to be the rise and fall of COBRA (G.I. Joe), and when I realized it wasn't, I was sad.




