There are many ways in which something can be simple. I believe that the most re...

fauigerzigerk · on Aug 21, 2023

>I.e. learning about namespaces would take a programmer couple of hours, including a foosball match and a coffee break

It's not about the time it takes to learn about namespaces. I'm talking about the complexity that namespaces and entities add to the data model and the requirement to actually handle them throughout the entire stack.

You can normalise and compare arbitrary pieces of JSON using only information available locally in that same sequence of UTF-8 bytes. You cannot do that with XML. You have to consider the whole document context and resolve all namespaces and entities before actually comparing anything.

The JSON specification is ~5 pages and most of that is diagrams. The XML specification is ~40 pages long and it imports ~60 pages of URI specification.

I'm not saying that it's impossible to use only the simple parts of XML unless and until you actually need what namespaces have to offer. But that's culture, and you have no control over other people's culture.

crabbone · on Aug 21, 2023

> I'm talking about the complexity that namespaces and entities add to the data model

I've worked a lot with XML, and I have no idea what complexity are you talking about. This just wasn't complex / difficult. Once you've learned what this was about, this was your second nature. Eg. I spent a lot of time working with MXML -- that is an XML format for Adobe Flex markup similar to XAML and a bunch of others of the same kind. It used XML namespaces a lot. But that was the least of my problems using it...

Again, I've never had anyone who learned how and why to use XML namespaces complain about it. All complaints about this feature were coming from people discovering it for the first time.

> You can normalise and compare arbitrary pieces of JSON

Dream on. No, you cannot. It depends on parser implementation. For example, you have two 20-digit numbers where 15 most significant digits are the same. Are these numbers the same number or a different number in JSON?

The fact that it's 5 pages means nothing... it's 5 pages that define a bad language that creates a lot of problems when used. So what if it only took 5 pages to write it? You can probably squeeze Brainfuck definition into half a page? -- So what, it's still a lot harder to use than JavaScript.

fauigerzigerk · on Aug 21, 2023

I worked with XML extensively for many years starting back in the 1990s. When I'm saying that namespaces add complexity to the data model I'm not complaining about them being difficult to use or understand.

>Dream on. No, you cannot. It depends on parser implementation. For example, you have two 20-digit numbers where 15 most significant digits are the same. Are these numbers the same number or a different number in JSON?

That's just a mildly interesting interoperability edge case that can be worked around. I agree that it's not good, but it is a problem on a wholly different level. XML elements not being comparable without non-local information is not an edge case and not an oversight that can be fixed or worked around. It's by design.

I'm not criticising XML for being what it is. XML tries to solve problems that JSON doesn't try to solve. But in order to do that, it had to introduce complexity that many people now reject.

Edit: I think we're talking past each other here. You are rightly criticising the JSON specification for being sloppy and incomplete. I don't dispute that. I'm comparing the models as they are _intended_ to work. And that's where XML is more complex because it tries to do more.

euroderf · on Aug 21, 2023

> XML namespaces .. All complaints about this feature were coming from people discovering it for the first time.

"XML Namespaces: Giving developers the vapors since 1999."

geysersam · on Aug 21, 2023

I don't understand the problem you're describing. When would using JSON lead to data loss/corruption?

crabbone · on Aug 21, 2023

Here's a thing that happened in the wild. Neo4j database encodes ids of stored entities as 128-bit integers and it has a JSON interface. When queried from Python, the Python client interprets digit sequences longer than what could possibly fit into 2^32 as floats (even though the native kind of integer in Python is of arbitrary size).

So, for a while there weren't too many objects, ids appeared to be all different... until they weren't. It's easy to see how this led to data corruption, I suppose?

---

Here's a hypothetical example: few people are aware that JSON allows key duplication in "hash-tables", also, even if they consider such a possibility they might not know that JSON doesn't prescribe which key should win, should there be many of them. They might assume that the definition requires that the first chronologically wins, or last, or... maybe some other rule, but they hope that it's going to be consistent across implementations.

Obviously, to screw with developers, JSON doesn't define this. So, it's possible that two different parsers will parse the same JSON with the same fields differently. Where this could theoretically explode? -- Well, some sort of authentication which sends password with other data that can be added by user, and the user intentionally or accidentally adds a "password" field, which may or may not later be overriden and may or may not later be interpreted on the other end as an actual password.

---

There are many other things, like, for example, JSON has too many of the "false" values. When different languages generate JSON they may interpret things like "missing key" and "key with the value null" as the same thing or as a different thing. Similarly, for some "false" and "null" are the same thing, while for others it's not.

fauigerzigerk · on Aug 21, 2023

>few people are aware that JSON allows key duplication in "hash-tables"

I would say it's the other way around. Many people seem to think that duplicate keys are allowed in JSON, but the spec says "An object is an unordered set of name/value pairs". Sets, by definition, do not allow duplicates.

https://www.json.org/json-en.html

>There are many other things, like, for example, JSON has too many of the "false" values. When different languages generate JSON they may interpret things like "missing key" and "key with the value null" as the same thing or as a different thing. Similarly, for some "false" and "null" are the same thing, while for others it's not.

I don't see how this is a JSON issue. There's only one false value in JSON. If some application code or mapping library is hellbent on misinterpreting all sorts of things as false then there is no way to stop that on a data format level.

What I do agree with is your critcism of how the interpretation of long numbers is left unspecified in the JSON spec. This is just sloppy and should be fixed.

bryanrasmussen · on Aug 21, 2023

if you are making JSON data to use in your language and application you will probably not have any problem. But as in any thing there can interoperability issues between implementations and programming languages - especially if your JSON is being generated and consumed by your JavaScript site

some potential issues https://bishopfox.com/blog/json-interoperability-vulnerabili...

on edit: not parent commenter of course, just what I think they might have meant.

geysersam · on Aug 21, 2023

That's very interesting! Thanks for the link.