
> Polymorphic JSON. The protocol elements have different data types that convey additional contextual meaning, allowing us to avoid mutually exclusive protocol elements and design a more succinct and readable protocol.

Yeah... let's not please.




The number of comments here specifically upset with this part of the current design is a bit discouraging, but not necessarily surprising.

Yes, many mainstream languages have near-zero support for Tagged/Discriminated Unions or Enums with Associated Data or Algebraic Data Types (pick your favorite name for the same concept). This is a limitation of those languages, which should not force a language-agnostic protocol to adopt the lowest common denominator of expressiveness.

Consider the problem they're avoiding: mutually exclusive keys in a struct/object. What do you do if you receive more than one? Is that behavior undefined? If it is defined, how sure are you that the implementation your package manager installed for you doesn't just pick one key arbitrarily in the name of "developer friendliness", leading to security bugs? This seems like a much more bug-ridden problem to solve than having to write verbose type-switching code in Go or Java.
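To make the mutually-exclusive-keys problem concrete, here is a minimal Go sketch of the check every implementation would have to remember to write. The key names `handle` and `details` and the function `parseExclusive` are made up for illustration and are not taken from the spec.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// request models a convention-based protocol: at most one of the two
// keys may be present. Both key names are hypothetical.
type request struct {
	Handle  *string                `json:"handle"`
	Details map[string]interface{} `json:"details"`
}

// parseExclusive decodes s and then enforces the "exactly one key" rule by hand.
// Nothing in the JSON decoder enforces it for us.
func parseExclusive(s string) (request, error) {
	var r request
	if err := json.Unmarshal([]byte(s), &r); err != nil {
		return request{}, err
	}
	switch {
	case r.Handle != nil && r.Details != nil:
		return request{}, errors.New("handle and details are mutually exclusive")
	case r.Handle == nil && r.Details == nil:
		return request{}, errors.New("one of handle or details is required")
	}
	return r, nil
}

func main() {
	_, err := parseExclusive(`{"handle":"abc","details":{"x":1}}`)
	fmt.Println(err)
}
```

An implementation that skips the `switch` still compiles and still parses valid requests, which is exactly how the "pick one key arbitrarily" behavior ends up shipping.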

Implementing more verbose deserialization code in languages with no support for Tagged Unions seems like a small price to pay for making a protocol that leaves no room for undefined behavior.

To be clear, _many_ statically typed languages have perfect support for this concept (Rust/Swift/Scala/Haskell, to name a few).


> To be clear, _many_ statically typed languages have perfect support for this concept (Rust/Swift/Scala/Haskell, to name a few).

No they don't, at least in the way you're selling it. The "limitation" here is JSON which doesn't attach type information. You're going to have to implement some typing protocol on top of the JSON anyway which will face similar problems to the ones you raised (unless you do some trait based inference which could be ambiguous and dangerous).

If they were Enums/Unions over a serialization protocol like protobuf, maybe your case makes sense. Even then, I'm guessing a large percentage of the OAuth 3 requests will go through Java/Golang libraries, so on a practical level this is a bad idea too.


I agree that having multiple different types of "object values" share one JSON key with no explicit "type" tag is asking for trouble with extensibility and conflicts.

That said, I think the constructive suggestion would be: "add a type tag to all objects in a union" (something suggested elsewhere in this thread).

Their "handles" can still be "just a string" to save bandwidth in the common case, arrays can still represent "many things", and objects require a "type" tag to be unambiguous.

Most of the comments below don't mention the (real and important, but easily solvable) issue you've brought up, however. They primarily fall into one of two buckets:

- It's hard to work with data shaped like this in my language (ex: java/go)

- It's hard to deserialize data shaped like this into my language that has no tagged unions (ex: java/go)

My biggest counterpoint to all of these complaints is: The fact that your language of choice cannot represent the concept of "one of these things" doesn't change the fact that this accurately describes reality sometimes.

A protocol with mutually exclusive keys (or really anything) by convention is strictly more bug-prone than a protocol with an object that is correct by construction.


A protocol which is cumbersome to implement in many languages. Hmm, what could go wrong? Partial support, late support of extensions, bugs...

IMHO: a very bad choice. Complicated basic and higher-level elements are the death of protocols (remember SOAP). I follow the train of thought of not restricting yourself too much, but if (e.g.) Java or C++ cannot implement it easily, it's not a good idea.


Protobuf supports "oneof" which is also cumbersome to implement in these same languages but all of them support it (with some extra LOC and no exhaustiveness checking watching your back).

Java/Go/C++ are perfectly capable of parsing a "type" key and conditionally parsing different shaped data. If you make a programming mistake here, you'll get a parse error (bad, but not a security problem). The pushback seems to be that a Java/Go/C++ implementation adds lines of code and won't gain much from this extra step, so let's make the protocol itself match their (less precise) data representation.
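As a rough illustration of "parse a type key, then parse the matching shape", here is a Go sketch using only encoding/json. The variant names (`handle`, `rich`), their fields, and `decodeResource` are assumptions invented for the example, not anything defined by the protocol under discussion.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope reads only the discriminator; the payload is decoded afterwards.
type envelope struct {
	Type string `json:"type"`
}

// Hypothetical variants of the tagged union.
type handleResource struct {
	Handle string `json:"handle"`
}

type richResource struct {
	Actions   []string `json:"actions"`
	Locations []string `json:"locations"`
}

// decodeResource inspects "type" first, then decodes the matching shape.
// An unknown tag is a parse error, never a silent guess.
func decodeResource(raw []byte) (interface{}, error) {
	var env envelope
	if err := json.Unmarshal(raw, &env); err != nil {
		return nil, err
	}
	switch env.Type {
	case "handle":
		var h handleResource
		if err := json.Unmarshal(raw, &h); err != nil {
			return nil, err
		}
		return h, nil
	case "rich":
		var r richResource
		if err := json.Unmarshal(raw, &r); err != nil {
			return nil, err
		}
		return r, nil
	default:
		return nil, fmt.Errorf("unknown resource type %q", env.Type)
	}
}

func main() {
	v, err := decodeResource([]byte(`{"type":"rich","actions":["read"]}`))
	fmt.Println(v, err)
}
```

The cost is the extra `switch` and the lack of exhaustiveness checking, but the failure mode for a mistake is the parse error described above.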

FWIW there is work towards improving Java in this regard: https://cr.openjdk.java.net/~briangoetz/amber/pattern-match....


But isn't that elementary OOP polymorphism? It all depends on whether the type is annotated or has to be inferred from the data by probing. And type annotations are present in protobuf, as far as I remember :).


> This is a limitation of those languages, which should not force a language-agnostic protocol to adopt the lowest common denominator of expressiveness.

It's an intentional decision made by those languages in order to focus on other things. If your intent is to be language-agnostic, then yeah, going with lowest common denominator concepts is exactly what you need to do. If you just want to write a Haskell auth implementation using your favorite pet language features, then write a Haskell auth implementation.


It's not the same as union types, but you can also often achieve polymorphic serialisation with any OO language, through the use of interfaces.


My fingers are firmly crossed that DUs make their way into C# 10... https://github.com/dotnet/csharplang/blob/master/proposals/d...


I'm in vigorous agreement here--polymorphic JSON is far more difficult to deserialize safely, and every instance I've seen of this in the wild has been the output of careless or deeply ignorant makers.


Yeah, put a `type` tag in there and call it a day.


I'm not sure I get this, does the data type change depending on context? Is that what they mean?


It means sometimes you'll get:

    { "foo": { "bar": "baz" } }

and other times you'll get

    { "foo": "something else" }

Good luck!


Polymorphic JSON is such a PITA for strongly typed languages.

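    var data map[string]interface{}
    // ... json.Unmarshal(body, &data) ...
    switch t := data["foo"].(type) {
    case string:
        handleString(t) // hypothetical handler for the string form
    case map[string]interface{}:
        handleObject(t) // hypothetical handler for the object form
    }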

imagine that for every key...


In my humble opinion, Golang is pretty verbose in a bad way when it comes to JSON. Rust and the crate serde_json are also strongly typed and it's a lot better.


Yeah, having done this, it's extremely verbose once all your error handling is in there, because JSON handling is already wordy and Go uses explicit return values instead of exceptions. By far my greatest annoyance is that []interface{} cannot be directly converted to []concreteType. Having to make a sized slice and type assert each value is annoying. Add validation of the values for even more "if err != nil" fun.
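For anyone who hasn't hit this, the slice conversion looks roughly like the sketch below. The helper name `toStrings` is made up, and a real version would also validate each value.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toStrings copies a decoded JSON array into a []string, asserting each
// element, because Go cannot convert []interface{} to []string directly.
func toStrings(in []interface{}) ([]string, error) {
	out := make([]string, 0, len(in))
	for i, v := range in {
		s, ok := v.(string)
		if !ok {
			return nil, fmt.Errorf("element %d: expected string, got %T", i, v)
		}
		out = append(out, s)
	}
	return out, nil
}

func main() {
	var raw []interface{}
	if err := json.Unmarshal([]byte(`["read","write"]`), &raw); err != nil {
		fmt.Println("decode error:", err)
		return
	}
	scopes, err := toStrings(raw)
	fmt.Println(scopes, err)
}
```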

Many Go advocates seem to consider the verbosity a feature, because it's explicit and forbids any clever-but-confusing tricks.


How would you express deserializing these 'polymorphic JSON' objects using serde/Rust?


Using a rust enum I guess.


You're right, this does seem to work [1]. I wasn't aware that serde would attempt to deserialize multiple enum variants until something matches.

[1] - https://serde.rs/enum-representations.html#untagged


Yeah, that's a rather common scheme out there, so Serde does provide built-in support for this deserialisation. It's probably better for deserialisation performance to use properly tagged enums, but if you don't have a choice, Serde's got your back.


    { foo: { length: 1.7976931348623157e+308 } }


So, like Apple and in app payments then...


> XYZ’s protocol is not just based on JSON, but it’s based on a particular technique known as polymorphic JSON. In this, a single field could have a different data type in different circumstances. For example, the resources field can be an array when requesting a single access token, or it can be an object when requesting multiple named access tokens. The difference in type indicates a difference in processing from the server. Within the resources array, each element can be either an object, representing a rich resource request description, or a string, representing a resource handle (scope).

This is horrible.
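For a sense of what the quoted rule asks of a consumer, here is a minimal Go sketch that dispatches on the first byte of the `resources` value. The `Resources` struct, its field names, and `parseRequest` are assumptions for illustration. The elements are left as raw JSON because each one would need the same string-or-object treatment again.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// Resources holds either the array form (one token) or the object form
// (several named tokens). Exactly one field is populated after decoding.
type Resources struct {
	Single []json.RawMessage
	Named  map[string][]json.RawMessage
}

// UnmarshalJSON picks the shape by looking at the first non-space byte.
func (r *Resources) UnmarshalJSON(data []byte) error {
	trimmed := bytes.TrimSpace(data)
	if len(trimmed) == 0 {
		return fmt.Errorf("resources: empty value")
	}
	switch trimmed[0] {
	case '[':
		return json.Unmarshal(trimmed, &r.Single)
	case '{':
		return json.Unmarshal(trimmed, &r.Named)
	default:
		return fmt.Errorf("resources: expected array or object, found %q", trimmed[0])
	}
}

// parseRequest decodes a request body that carries a "resources" key.
func parseRequest(s string) (Resources, error) {
	var req struct {
		Resources Resources `json:"resources"`
	}
	err := json.Unmarshal([]byte(s), &req)
	return req.Resources, err
}

func main() {
	r, err := parseRequest(`{"resources":{"token1":["read"],"token2":["write"]}}`)
	fmt.Println(len(r.Named), err)
}
```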


This is definitely not a protocol dreamed up by, say, Java developers.

Obviously it's doable in Java, but I'm hard-pressed to imagine a Java developer would think of such a thing.


The only time I've ever encountered an API that used it extensively, it was done by a company that does all their implementation in Java.

My best guess at what happened, based on the shape of the API, is that they implemented it by taking their pre-existing domain model, which had a fairly deep subclassing hierarchy, liberally sprinkled some annotations from com.fasterxml.jackson.annotation, and dumped the result straight onto the wire.

You could absolutely do an object-oriented codebase where two different subclasses have fields with the same name and different types, and, depending on how you structure your code, it might not be too painful. And it's fairly easy to imagine someone serializing a structure like that to JSON without ever meditating on the fact that JSON won't retain all the type information.

Ironically, the end result was an API that is nigh-impossible to consume from Java. I ended up writing a façade in Python.

I've also seen little bits of this happen in the internal APIs at my current company. Also a Java shop, also a result of trying to directly connect an internal object model to the API. I've never seen it done in an API implemented in a dynamic or functional language.


I hadn't considered the one-way nature of dumping Java classes to JSON.

Mainly because I work with classes that are serialized both to and from JSON, rather than having Jackson annotations added much later.

I can see it now. Horrifying! I'm sorry for your troubles.


My take-away is this: Protocols should always be defined independently of any existing code.

The code-first approach only works well when you're doing something self-contained. Which, an API, almost by definition, is not.


I generally try to design APIs from a consumer perspective, and then usually I end up with something RESTful. Let the server do whatever it has to do, you know?


This is a few orders of magnitude more infuriating than a spec I have to deal with at work, where some keys are required to be URIs...


I work in TypeScript, which is arguably the mainstream language best suited to typing such things, and still, this is insane.


Lovely - I can't wait to see how horrible something like that is to implement in a language like Go.


You have to implement UnmarshalJSON. Between each attempt to deserialize into a possible struct, be careful to return on errors that are not JSON serialisation errors (for example caused by reading from the underlying Reader, etc.)

It's ugly and verbose but there is no need to use empty interface.
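A minimal sketch of that approach for the string-or-object `foo` case from upthread, assuming made-up type names (`Foo`, `FooObject`, `payload`). It tries the string form first and falls back to the object form only when the failure is a JSON type mismatch, so any other error is returned unchanged.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// FooObject is the object form of the value.
type FooObject struct {
	Bar string `json:"bar"`
}

// Foo is either a bare string or an object; one field is set after decoding.
type Foo struct {
	Text   *string
	Object *FooObject
}

// UnmarshalJSON attempts each candidate shape in turn.
func (f *Foo) UnmarshalJSON(data []byte) error {
	var s string
	err := json.Unmarshal(data, &s)
	if err == nil {
		f.Text = &s
		return nil
	}
	// Only a shape mismatch justifies trying the next candidate.
	var typeErr *json.UnmarshalTypeError
	if !errors.As(err, &typeErr) {
		return err
	}
	var o FooObject
	if err := json.Unmarshal(data, &o); err != nil {
		return err
	}
	f.Object = &o
	return nil
}

type payload struct {
	Foo Foo `json:"foo"`
}

// decodePayload is a small helper so callers get a typed result.
func decodePayload(s string) (payload, error) {
	var p payload
	err := json.Unmarshal([]byte(s), &p)
	return p, err
}

func main() {
	p, err := decodePayload(`{"foo":{"bar":"baz"}}`)
	fmt.Println(p.Foo.Object != nil, err)
}
```

Callers then check which of `Text` or `Object` is non-nil, with no empty interface in sight.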


Sounds like a golang problem. Rust can easily handle it with union types from what I have seen.


I hope you enjoy working with interface{}

_sigh_


interface{} to the rescue


Yeah, pretty much exactly that.

The only time I've had to deal with it in the wild was a terrible experience. As a consumer, you couldn't make any decisions with confidence without first making a careful study of the documentation. For every. single. decision.


YAML has !Tag syntax for expressing polymorphism. Shame that nobody uses YAML because the standard is so complex, and shame that AppVeyor uses a single-element key-value mapping instead of a tag for this purpose...


Yeah... This is the same sort of thing you see with, say, ActivityPub, that makes it a massive pain, if not totally impossible to implement it in a statically-typed language.



