Ultra-minimal JSON schemas with TypeScript inference (github.com/ar-nelson)
103 points by codewithcheese on July 28, 2022 | 44 comments



After some frustration with the TypeScript schema library ecosystem, I've decided that I'd prefer to declare my types using TypeScript's excellent type syntax, so I can take advantage of generics, mapped types, etc. Then, I'll take those TypeScript types and compile them to whatever alternative schema format I need.

There are many libraries that claim to convert your Typescript types to other formats, such as ts-json-schema-generator, ts-to-zod, or typeconv/core-types-ts. These libraries work by interpreting the Typescript AST, essentially re-implementing a bare-bones type system from scratch. Most do not support advanced Typescript features like generic application, mapped types, or string literal types. So, what's the point? To avoid those limitations, I use Typescript's first-party ts.TypeChecker API to analyze types, and an existing library called ts-simple-type (which I've forked) to convert from ts.Type to a more usable intermediate format. Then, I recurse over the intermediate format and emit "AST nodes". It's pretty simple, but seems promising.

So far, I have compilers from TypeScript types to Python 3 and Thrift. But I plan to add OpenAPI/JSONSchema, Protobuf (Proto3), Kotlin, Swift, and maybe Zod and Avro. Each target language is around 300 LoC, so they're pretty easy to put together.
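
To give a feel for the "recurse over the intermediate format and emit" step, here is a rough sketch. The `IR` type, `pyType`, and `pyClass` are made up for illustration and are not the actual shapes or functions in ts-simple-type; the real project works from `ts.Type` values rather than hand-built nodes.

```typescript
// Hypothetical intermediate format (an assumption for this sketch,
// not the real ts-simple-type data model).
type IR =
  | { kind: "string" }
  | { kind: "number" }
  | { kind: "boolean" }
  | { kind: "literal"; value: string }
  | { kind: "array"; element: IR }
  | { kind: "optional"; inner: IR }
  | { kind: "union"; members: IR[] }
  | { kind: "object"; name: string; fields: { [key: string]: IR } };

type ObjectIR = Extract<IR, { kind: "object" }>;

// Recursively turn one IR node into a Python type annotation.
function pyType(t: IR): string {
  switch (t.kind) {
    case "string": return "str";
    case "number": return "float";
    case "boolean": return "bool";
    case "literal": return `Literal[${JSON.stringify(t.value)}]`;
    case "array": return `List[${pyType(t.element)}]`;
    case "optional": return `Optional[${pyType(t.inner)}]`;
    case "union": return `Union[${t.members.map((m) => pyType(m)).join(", ")}]`;
    case "object": return t.name; // refer to the generated class by name
  }
}

// Emit a TypedDict class declaration for an object node.
function pyClass(t: ObjectIR): string {
  const lines = Object.keys(t.fields).map(
    (key) => `    ${key}: ${pyType(t.fields[key])}`
  );
  return [`class ${t.name}(TypedDict):`].concat(lines).join("\n");
}

const user: ObjectIR = {
  kind: "object",
  name: "User",
  fields: {
    id: { kind: "number" },
    role: {
      kind: "union",
      members: [
        { kind: "literal", value: "admin" },
        { kind: "literal", value: "guest" },
      ],
    },
    tags: { kind: "array", element: { kind: "string" } },
  },
};

const pythonSource = pyClass(user);
console.log(pythonSource);
```

A new target language is then mostly another pair of functions like these over the same intermediate nodes.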

My ultimate goal is to use this toolkit with Notion's internal types, which are quite a bit more complex than something like ts-to-zod can handle.

Repo: https://github.com/justjake/ts-simple-type

Compiler input and output: https://github.com/justjake/ts-simple-type/blob/jake--compil...

Thrift compiler: https://github.com/justjake/ts-simple-type/blob/jake--compil...

Python compiler: https://github.com/justjake/ts-simple-type/blob/jake--compil...


> Most do not support advanced Typescript features like generic application, mapped types, or string literal types. So, what's the point?

I use @sinclair/typebox and the point is that I can get a JSON Schema out the other side, usable in-process and as a communicative description of what my code expects, just by evaluating my code and printing out an object.

It does support string literal types, FWIW. I'm good with it not supporting generics, because the places where I need an interchange format generally don't. Mapped types would be nice but aren't really critical to where I need to generate interchange formats; dependent types that result from transforming these can use mapped types, so it's close enough.
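
As a rough illustration of why that works: the schema builders just return plain objects, and the TypeScript type rides along as a phantom type parameter. This is a hand-rolled sketch, not TypeBox's real definitions; `Schema`, `Static`, `str`, `num`, and `obj` here are stand-ins I made up for the example.

```typescript
// Hypothetical mini-builder in the spirit of TypeBox (NOT its actual API).
interface Schema<T> {
  type: string;
  properties?: { [key: string]: Schema<unknown> };
  required?: string[];
  _static?: T; // phantom field: never set at runtime, only carries the type
}

// Recover the TypeScript type carried by a schema value.
type Static<S> = S extends Schema<infer T> ? T : never;

const str = (): Schema<string> => ({ type: "string" });
const num = (): Schema<number> => ({ type: "number" });

// Build an object schema; every listed property is treated as required.
function obj<P extends { [key: string]: Schema<unknown> }>(
  properties: P
): Schema<{ [K in keyof P]: Static<P[K]> }> {
  return { type: "object", properties, required: Object.keys(properties) };
}

const User = obj({ id: num(), name: str() });
type User = Static<typeof User>; // should resolve to { id: number; name: string }

const example: User = { id: 1, name: "Ada" };

// Evaluating the code and printing the object yields the JSON Schema.
const userSchemaJson = JSON.stringify(User);
console.log(userSchemaJson);
```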

For my money, typebox has far-and-away the best user experience of any of the kinds of libraries you described, and IMO is probably worth studying.


I agree that the runtime-validator-with-type-inference library genre is generally enough for API surface area. We have a similar internal library, including JSON schema output, and it’s serviceable for checking our public & private API incoming requests.

It does come with some costs. First, the type inference using these large mapped types (at least in our implementation and/or scale) noticeably slows down our Typescript type checking; the public api files with our runtime type declarations always top our build profiling. Second, it’s annoying to need to re-implement an existing type as a runtime-type if you start to need it in an API specification. There’s a tension in the codebase between using the easily-available, language native type system, or making do with the runtime type stuff which increases verbosity/noise and decreases expressiveness.

We have thousands of lines of types, including many types inferred from `as const` literals, written in Typescript. Rewriting those types in another schema language, such as typebox or Protobuf, would take a bunch of time and spread that tension from an isolated spot (backend API handlers) into every part of the codebase. It imposes new constraints, and requires devs to learn more things.
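
For anyone unfamiliar with the `as const` pattern, here is a small sketch of what I mean. The names (`BLOCK_TYPES`, `BlockType`, `isBlockType`) are invented for the example and are not our real code.

```typescript
// A value declared once with `as const`...
const BLOCK_TYPES = ["text", "heading", "todo"] as const;

// ...and a type inferred from it: "text" | "heading" | "todo".
type BlockType = typeof BLOCK_TYPES[number];

// The same literal doubles as runtime data for a type guard.
function isBlockType(value: string): value is BlockType {
  return (BLOCK_TYPES as readonly string[]).indexOf(value) !== -1;
}

const candidate: string = "todo";
if (isBlockType(candidate)) {
  const narrowed: BlockType = candidate; // narrowed by the guard
  console.log("valid block type:", narrowed);
}
```

Types like `BlockType` have no separate declaration to port; a schema language would need the list restated in its own syntax.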

The libraries I listed and the compiler I’m building try to address this issue by supporting the native typescript declaration syntax. No more red types and blue types, much better gradual adoption pathway, and ideally one less thingy to learn for most developers.


Wow, holy smokes. This is incredible, thanks for posting.


Looks promising. Thanks for sharing. This also needs a CLI. I was thinking about building something like this in Rust using swc.


Swc will give you an AST, but you’ll still need to re-implement mapped types, type inference, template string types, etc - a large part of the TS type checker. Although, the author of swc is supposedly building a full type checker.

> This also needs a CLI.

I’m not planning on implementing one at the moment. The project goal is to greatly expand the use-cases for Typescript types by making it much easier to build custom compile targets.

A CLI means directing much more of my energy to configuration logic and file system IO. As a library, it’s clear how users should “configure” behavior - write or subclass a compiler target and implement whatever you want!


Ah - I see. I've started to build a CLI around it and adapted the `SimpleTypeCompiler` example to build simple JSON schema files. I'm struggling with the API a bit, but I think I can manage.


I don’t know. Zod feels like a more reasonable path forward to accomplish the same thing (granted, it’s not using the language of JSON schema, if you have a dependency on that for some reason).


Zod is fantastic. Our whole front-end is built around it, with some React hooks that wrap up a `createApi` call around Zod decoders and expose them to components/containers. It's lovely. Only downside is those decoders can't be re-used across other languages, whereas technically this could be, I guess?


Zod is great, but JSON-schema is my pick because it is easily serialized. So a client can query for the schema of the data it should send or receive, including descriptions/metadata that is useful for presenting a UI.


Zod to JSON schema is a thing so I think that solves your problem, no?

https://www.npmjs.com/package/zod-to-json-schema


I didn't know about this!

But on the client I would use a JSON-schema validator anyways, otherwise I'd need something like "json-schema-to-zod"? Seems like a roundabout way to address my problem


Do you happen to know if there are libs/utils to achieve the reverse (JSON-schema -> zod)? It could be useful for one of my use cases (form validation via rules received as JSON).


Does zod let me walk/parse the definitions as data?


The last time I tried this, I remember feeling sad and needing to wrap Zod in my own layer of stuff.


I can’t recall if it’s particularly ergonomic, but technically I believe it’s possible.


I wish there would be an official TypeScript solution for this. I know there's a TypeScript goal that TypeScript types should not impact runtime behaviour, but I think converting arbitrary JSON from a file or network to a typed TypeScript object is such a common use case it needs a standardised solution. There's tens of libraries that try to solve this now in different ways, all working around this area where TypeScript doesn't try to help.
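
To make the gap concrete, this is roughly the boilerplate you end up writing by hand today. The `User` shape and the helper names are just an example I made up.

```typescript
interface User {
  id: number;
  name: string;
}

// Hand-written type guard: TypeScript cannot derive this from `User`.
function isUser(value: unknown): value is User {
  if (typeof value !== "object" || value === null) return false;
  const record = value as { [key: string]: unknown };
  return typeof record["id"] === "number" && typeof record["name"] === "string";
}

// JSON.parse returns `any`, so the check has to be explicit.
function parseUser(json: string): User {
  const data: unknown = JSON.parse(json);
  if (!isUser(data)) throw new TypeError("JSON does not match User");
  return data;
}

const parsedUser = parseUser('{"id": 1, "name": "Ada"}');
console.log(parsedUser.name);
```

The guard and the interface have to be kept in sync manually, which is exactly what all those libraries are trying to automate.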


Deepkit is quite interesting in this area: https://deepkit.io/

It reimplements the type system as… a stack based VM that expands the TS typedefs if needed at runtime?

I would use it if the risk factor wasn’t so high. I’m too scared of needing to maintain their crazy cool stuff to use it.

I would love Microsoft to build those APIs first party and officially support them.

EDIT:

> I know there's a TypeScript goal that TypeScript types should not impact runtime behaviour

This is why I’m rolling up my sleeves to build my own typescript-to-X tooling - it seems like no one is gonna do it for me, just the way I like — so I should make it easy for everyone to do it the way they like.


To be honest it doesn't even have to do anything during runtime per se.

It just needs to have a nice macro system.

It would solve this problem and others, such as pattern matching.


there's always https://www.sweetjs.org/ that you can plug into your Babel pipeline


Sweet.js has unfortunately been dead [0] for about 5 years now.

It also doesn't have any typescript awareness which is required to build this kind of functionality - you want to have static type introspection available in macros so you can generate code based on provided types.

[0] https://github.com/sweet-js/sweet-core/graphs/contributors


I made a similar thing a while back; unfortunately, the code isn't public. The idea itself is simple. It was approximately something like this:

    const kind = Symbol();
    type Spec = "unknown" | "string" | "number" | ... |
        { [kind]: "optional"; spec: Spec } |
        { [kind]: "array"; element: Spec } |
        { [P in string | number | symbol]: Spec } | ...;

    function optional<S extends Spec>(spec: S): { [kind]: "optional"; spec: S } { ... }
    function arrayOf<E extends Spec>(element: E): { [kind]: "array"; element: E } { ... }
    // and so on, you get the idea. these helper functions exist so that
    // you can write `optional(...)` or `arrayOf(...)` instead of the full specification

    type Validated<S> =
        S extends "unknown" ? unknown :
        ...
        S extends { [kind]: "array"; element: infer E } ? Array<Validated<E>> :
        S extends { [P in string | number | symbol]: Spec } ? { [P in keyof S]: Validated<S[P]> } :
        ...;

    function validate<S extends Spec>(value: unknown, spec: S): asserts value is Validated<S> {
        if (typeof spec === "string") {
            switch (spec) { ... }
        } else {
            switch (spec[kind]) { ... }
        }
    }
The actual implementation of course had to take care of implicit nulls and other caveats of the TypeScript type system.
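
For the curious, here is a cut-down version that actually runs. It is a sketch, not the original code: I use a plain string `kind` tag instead of the symbol key, wrap object specs explicitly via a hypothetical `object` helper, and only cover a handful of primitive specs.

```typescript
// Assumption: string `kind` tags and an explicit object wrapper,
// unlike the symbol-keyed outline above.
type Spec =
  | "unknown" | "string" | "number" | "boolean"
  | { kind: "optional"; spec: Spec }
  | { kind: "array"; element: Spec }
  | { kind: "object"; fields: { [key: string]: Spec } };

// Map a spec type to the TypeScript type it validates.
type Validated<S> =
  S extends "unknown" ? unknown :
  S extends "string" ? string :
  S extends "number" ? number :
  S extends "boolean" ? boolean :
  S extends { kind: "optional"; spec: infer I } ? Validated<I> | undefined :
  S extends { kind: "array"; element: infer E } ? Array<Validated<E>> :
  S extends { kind: "object"; fields: infer F } ? { [P in keyof F]: Validated<F[P]> } :
  never;

function optional<S extends Spec>(spec: S): { kind: "optional"; spec: S } {
  return { kind: "optional", spec };
}
function arrayOf<E extends Spec>(element: E): { kind: "array"; element: E } {
  return { kind: "array", element };
}
function object<F extends { [key: string]: Spec }>(fields: F): { kind: "object"; fields: F } {
  return { kind: "object", fields };
}

// Runtime walk over the spec; throws a TypeError on the first mismatch.
function check(value: unknown, spec: Spec, path: string): void {
  if (typeof spec === "string") {
    if (spec !== "unknown" && typeof value !== spec) {
      throw new TypeError(`${path}: expected ${spec}`);
    }
    return;
  }
  switch (spec.kind) {
    case "optional":
      if (value !== undefined) check(value, spec.spec, path);
      return;
    case "array":
      if (!Array.isArray(value)) throw new TypeError(`${path}: expected array`);
      for (let i = 0; i < value.length; i++) {
        check(value[i], spec.element, `${path}[${i}]`);
      }
      return;
    case "object": {
      if (typeof value !== "object" || value === null) {
        throw new TypeError(`${path}: expected object`);
      }
      const record = value as { [key: string]: unknown };
      for (const key of Object.keys(spec.fields)) {
        check(record[key], spec.fields[key], `${path}.${key}`);
      }
      return;
    }
  }
}

function validate<S extends Spec>(value: unknown, spec: S): asserts value is Validated<S> {
  check(value, spec, "value");
}

const userSpec = object({ name: "string", age: optional("number"), tags: arrayOf("string") });
const input: unknown = JSON.parse('{"name":"Ada","tags":["math"]}');
validate(input, userSpec);
// From here on `input` is typed roughly as
// { name: string; age: number | undefined; tags: string[] }.
const tagCount: number = input.tags.length;
console.log(input.name, tagCount);
```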


This looks really nice as an alternative to JSON schema, which is horribly verbose. Like using RELAX NG compact syntax for XML schemas back in the day, versus horrible verbose DTDs or XSDs.

But I think the most natural way to express JSON schemas would be TypeScript type declarations. Is there a project that can take a TS type and generate a runtime parser/validator for it?

I thought maybe Quicktype (https://quicktype.io/) was that, but it looks like it only takes JSON as input, not TS types.


YES! I've introduced ts-to-zod into my codebase only yesterday, and it's served my needs exactly. It doesn't handle generics or mapped types, but that's a fair price to pay.

It also works the other way, you can define a zod schema and you can infer types for your data from it.

https://github.com/fabien0102/ts-to-zod


We're using TypeBox [0]: you use TypeScript to create JSON Schemas and it gives you a compile-time type.

From the documentation:

    const T = Type.String()     // const T = { type: 'string' }
    type T = Static<typeof T>   // type T = string

The main thing we're using it for now is defining an OpenAPI schema and using the types in the front-end and backend, and using the schema to validate requests and responses, both client-side and server-side. Validation is done using Ajv [1].

We used Joi before and want to replace that code with JSON Schema, but that is quite the hassle. I like that TypeBox uses JSON Schema underneath, so migrating to another library should be easy.

[0] https://github.com/sinclairzx81/typebox

[1] https://ajv.js.org/


For those wanting to infer Typescript types from JSONSchema itself there is: https://github.com/lbguilherme/as-typed


Nice! I have been happily using json-schema-to-ts for this exact thing.

And if you're doing code generation, there is also json-schema-to-typescript


I recently used a package that converts typescript to JSON Schema, which I thought was pretty nice.

I get the legibility (and IntelliSense) of TypeScript interfaces but can make use of all the validation libraries that use JSON Schema.


Even more minimal one I made years ago:

https://wryun.github.io/yajsonschema/


I'm a fan of point-free combinators [0] [1], which compose in an intuitive way.

[0] https://github.com/appliedblockchain/assert-combinators

[1] https://github.com/preludejs/refute/


I like the idea. Writing JSON schema by hand is cumbersome. I would like to be able to generate JSON schema from this.


Just to be sure: You still find it cumbersome with an editor like VSCode where you get autocompletion for pretty much all the stuff json-schema can do?


Yeah. Because of that, I now often use Jsonnet to generate my schemas.


Unfortunately excessively verbose and non-portable to backends other than Node.js.

This is a good start, and what problems does this solve IRL? As in, actual problems that happen? (Not just theoretically possible classes of problems.) I get the narrowest use-case, and then beyond that this seems otherwise doomed to become irrelevant over time rather than gain momentum and become something meaningful in any lasting sense.

Think bigger.

Then the meta solution will become self-evident.


> doomed to become irrelevant over time rather than gain momentum and become something meaningful in any lasting sense.

You seem to be jumping from topic to topic without being able to express your thoughts very well.


What do you mean by excessively verbose? They seem to make a point at being minimalist and the comparison with JSON schema in the README page seems to attest to it.


Compared to a struct in C, Rust, Golang, or Java, it takes more key presses from the programmer and only works in (/is optimized for) a narrowly specific set of circumstances.

Convenient in some ways (e.g. if you are only comfortable in JS), and does nothing useful in other contexts.

Nothing inherently wrong with it, other than it seems like a missed opportunity.

Is this more than a subset of Wordnik's Swagger?

https://swagger.io/


> Compared to a struct in C (...)

That's hardly a relevant point or meaningful comparison. The discussion is about schema validators for TypeScript/JavaScript.


Why limit to only the (frontend | Node.js)?


Swagger uses JSON Schema to describe the data formats.

https://swagger.io/docs/specification/data-models/keywords/

This is an easier-to-write, easier-to-read alternative that is compatible with JSON Schema. Spartan Schema could be used instead of JSON Schema to make the document easier for humans to work with.


> At the core, is this more than a JSON-schema generator bundled with a check function? Could it be >?

> Then the meta solution will become self-evident.

> Think bigger.

Are you going to share your big idea?


Avro? Protobuf?


Those are not in any way full alternatives to JSON schema


At the core, is this more than a JSON-schema generator bundled with a check function? Could it be >?



