Schemas built into the message certainly have the benefit of being self-describing. But they also have downsides:
- Encoding the schema along with every message makes the encoding less efficient.
- Without an ahead-of-time schema, you don't have a canonical list of all the fields that can exist and their types. Instead this gets specified in ad-hoc ways in documentation, for example: https://developers.facebook.com/docs/graph-api/reference/v2....
That URL describes a schema for groups. The schema exists, it's just not machine-readable! That means you can't use it for IDE auto-completion, you can't reflect over it programmatically, and you can't use it to make encoding/decoding more CPU/memory efficient. It's so close to being useful for these purposes. Why not take that final step and put it in a machine-readable format?
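To make that concrete, here is a minimal Python sketch of what a machine-readable schema buys you. Everything in it is an assumption for illustration: `GROUP_SCHEMA` and its three fields are made up (they are not the Graph API's actual group fields), and the "compact" encoding is just a JSON array of values in schema order, not any real wire format.

```python
import json

# Hypothetical ahead-of-time schema: an ordered list of (field name, type).
# The fields are illustrative only, not the real Graph API group fields.
GROUP_SCHEMA = [("id", int), ("name", str), ("member_count", int)]


def field_names(schema):
    """Reflect over the schema, e.g. to drive auto-completion."""
    return [name for name, _ in schema]


def encode_compact(schema, record):
    """Emit only the values, in schema order; field names stay out of the message."""
    return json.dumps([record[name] for name, _ in schema], separators=(",", ":"))


def decode_compact(schema, payload):
    """Rebuild the record by zipping the values with the schema's field names."""
    values = json.loads(payload)
    if len(values) != len(schema):
        raise ValueError("payload does not match the schema's field count")
    record = {}
    for (name, expected_type), value in zip(schema, values):
        if not isinstance(value, expected_type):
            raise TypeError(f"{name}: expected {expected_type.__name__}")
        record[name] = value
    return record


group = {"id": 42, "name": "Schema fans", "member_count": 7}

# Self-describing: field names travel with every single message.
self_describing = json.dumps(group, separators=(",", ":"))
# Schema-driven: the field names live in the schema, not in the message.
compact = encode_compact(GROUP_SCHEMA, group)
```

The compact form is shorter because the field names are stated once in the schema instead of once per message, and the same schema object can be walked by tooling.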
You can have self-describing messages without having to embed the schema in the instance. Instead you embed a reference to the schema in the message. We came up with an approach to this called self-describing JSONs: http://snowplowanalytics.com/blog/2014/05/15/introducing-sel... The Avro community do something similar.
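A rough Python sketch of the reference-instead-of-embedding idea, not the actual implementation from the linked post. The envelope with a `schema` reference and a `data` payload is my reading of the approach; the `iglu:com.example/...` reference, the in-memory `SCHEMA_REGISTRY`, and the tiny type-checking "schema" are all hypothetical stand-ins (a real setup would presumably resolve the reference to a proper JSON Schema).

```python
# Hypothetical registry: maps a schema reference to the schema itself.
# In practice this would be a lookup against some schema repository.
SCHEMA_REGISTRY = {
    "iglu:com.example/group/jsonschema/1-0-0": {
        "required": {"id": int, "name": str},
    },
}


def validate(message, registry=SCHEMA_REGISTRY):
    """Resolve the message's schema reference and check the payload against it."""
    ref = message["schema"]
    schema = registry.get(ref)
    if schema is None:
        raise LookupError(f"unknown schema reference: {ref}")
    data = message["data"]
    for name, expected_type in schema["required"].items():
        if name not in data:
            raise ValueError(f"missing required field: {name}")
        if not isinstance(data[name], expected_type):
            raise TypeError(f"{name}: expected {expected_type.__name__}")
    return data


# The message carries only a short pointer to its schema, not the schema.
message = {
    "schema": "iglu:com.example/group/jsonschema/1-0-0",
    "data": {"id": 42, "name": "Schema fans"},
}

payload = validate(message)
```

The message stays self-describing, since anyone can resolve the reference, but the per-message overhead is one short string rather than a full schema.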