
Serialization has a very interesting influence on types; when variables are disposable you can use very expressive types and complex relationships without major consequences.

But when it comes time to serialize, the effort to map all the properties and relationships into an enumerated form can very quickly invade the codebase and dominate it like a lodestone. Either you "bash down" the most structured elements into more primitive forms so that version compatibility is maintained (leading you to code in a primitive style as well), or you escalate to the prospect of storing code as well as data (the "image" strategy).




It doesn't require you to code in a primitive style, however: It just requires acknowledging that creating a storage format is just as much an act of design as creating any other interface.

For example, "transparent" serialization really isn't, because you run into severe versioning issues. But dropping to primitives doesn't work well either, because then there's impedance to deal with. If you acknowledge there is bound to be impedance, you can encapsulate it in a separate mapping/specification, and deal with it on its own terms.

That lets you do things like evolve your storage format and internal model independently, without the very real drawbacks of an image-style system.
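To make that concrete, here's a minimal sketch of a separate mapping layer. The `Player` class, the field names, and both format versions are made up for illustration; the only point is that the stored layout and the in-memory model meet in exactly one place.

```python
import json
from dataclasses import dataclass


@dataclass
class Player:
    # Internal model (hypothetical): free to change shape between releases.
    name: str
    health: float  # fraction in the range 0.0 to 1.0


FORMAT_VERSION = 2  # version of the storage layout, not of the Player class


def to_record(player):
    """Map the internal model onto the current storage layout."""
    return {"version": FORMAT_VERSION, "name": player.name, "hp": player.health}


def from_record(record):
    """Map any supported storage layout back onto the internal model."""
    version = record.get("version", 1)  # assume unversioned records are v1
    if version == 1:
        # Assumed v1 layout: health stored as an integer percentage.
        return Player(name=record["name"], health=record["health_pct"] / 100.0)
    if version == 2:
        return Player(name=record["name"], health=record["hp"])
    raise ValueError(f"unsupported format version: {version}")


def dumps(player):
    return json.dumps(to_record(player))


def loads(text):
    return from_record(json.loads(text))
```

Renaming a field on `Player` only touches `to_record`/`from_record`, and adding a v3 layout only adds a branch; old files keep loading either way.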


Encapsulation is only somewhat more protective than simply adding a version number to the file format. Not that it isn't helpful from a project-management standpoint, but in the really complicated cases (real-world example: serializing the state of a video game down to the exact frame, where many of the entities want to reference each other, there are numerous static resource hooks, etc.), there's an impedance at the moment you decide to serialize, and the design challenge is mainly one of deciding where to attack it:

1. By doing reflective magic (potentially a lot of it) at the moment of serialization so that a serialization-friendly form can be teased out. If you are not working within a reflective environment, this option doesn't exist without additional effort. And if the runtime changes, the magic breaks.

2. By "primitivizing" the runtime so it's closer to the serialized representation, at the expense of dirtying the codebase to exchange e.g. object references for id numbers. You can make it look prettier again by enforcing conventions, but the extra weight is still there. Adding reflection within an environment that didn't have it built-in also requires going through this step.
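A rough sketch of the reference-for-id exchange from option 2, done here at the save/load boundary rather than throughout the runtime. The `Entity` class and its single `target` link are hypothetical stand-ins for game objects that point at each other; a real game would have many link fields plus the static resource hooks.

```python
import json


class Entity:
    # Hypothetical game entity holding a direct reference to another entity.
    def __init__(self, name, target=None):
        self.name = name
        self.target = target  # object reference; cycles are allowed


def save(entities):
    """Swap object references for list indices so the result is plain data."""
    ids = {id(e): i for i, e in enumerate(entities)}
    records = [
        {
            "name": e.name,
            # Raises KeyError if a target is not in the list being saved.
            "target": None if e.target is None else ids[id(e.target)],
        }
        for e in entities
    ]
    return json.dumps(records)


def load(text):
    """Rebuild entities in two passes: create them all, then patch references."""
    records = json.loads(text)
    entities = [Entity(r["name"]) for r in records]
    for entity, record in zip(entities, records):
        if record["target"] is not None:
            entity.target = entities[record["target"]]
    return entities
```

The two-pass load is what lets mutual references survive: every entity exists before any link is patched in. The cost is the one described above, since each new kind of reference needs its own id bookkeeping.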

Those two options are the design tradeoff. And the bigger your serialization ambitions get, the more the pressure builds to either massively primitivize (e.g. use a SQL database for all data and make the special runtime representations non-persistent always) or make an image so that data never "actually" leaves the runtime.

In electing to use a database layer everywhere, web apps have more-or-less standardized on primitive designs, and subsequently try to claw back readable code through tricks like ORMs and DSLs.



