Ultra-minimal JSON schemas with TypeScript inference

jitl · on July 28, 2022

After some frustration with the TypeScript schema library ecosystem, I've decided that I'd prefer to declare my types using TypeScript's excellent type syntax, so I can take advantage of generics, mapped types, etc. Then, I'll take those TypeScript types and compile them to whatever alternative schema format I need.

There are many libraries that claim to convert your TypeScript types to other formats, such as ts-json-schema-generator, ts-to-zod, or typeconv/core-types-ts. These libraries work by interpreting the TypeScript AST, essentially re-implementing a bare-bones type system from scratch. Most do not support advanced TypeScript features like generic application, mapped types, or string literal types. So, what's the point? To avoid those limitations, I use TypeScript's first-party ts.TypeChecker API to analyze types, and an existing library called ts-simple-type (which I've forked) to convert from ts.Type to a more usable intermediate format. Then, I recurse over the intermediate format and emit "AST nodes". It's pretty simple, but seems promising.
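
To make that last step concrete, here is a minimal sketch of recursing over an intermediate format and emitting target-language source. The `IR` type, `emitPython`, and the `User` example are hypothetical names made up for illustration; this is not ts-simple-type's actual data model or the repo's Python compiler, and the emitted Python assumes 3.10+ syntax.

```typescript
// Hypothetical intermediate format, NOT ts-simple-type's real data model.
type IR =
  | { kind: "string" }
  | { kind: "number" }
  | { kind: "boolean" }
  | { kind: "literal"; value: string }
  | { kind: "array"; element: IR }
  | { kind: "union"; members: IR[] }
  | { kind: "object"; name: string; fields: Record<string, IR> };

// Recurse over the IR and return a Python type expression. Object types are
// referenced by name; their class declarations are appended to `decls`,
// nested objects first, so every class is declared before it is used.
function emitPython(node: IR, decls: string[]): string {
  switch (node.kind) {
    case "string":
      return "str";
    case "number":
      return "float";
    case "boolean":
      return "bool";
    case "literal":
      return `Literal[${JSON.stringify(node.value)}]`;
    case "array":
      return `list[${emitPython(node.element, decls)}]`;
    case "union":
      return node.members.map((member) => emitPython(member, decls)).join(" | ");
    case "object": {
      const lines = Object.entries(node.fields).map(
        ([field, type]) => `    ${field}: ${emitPython(type, decls)}`
      );
      const body = lines.length > 0 ? lines.join("\n") : "    pass";
      decls.push(`class ${node.name}(TypedDict):\n${body}`);
      return node.name;
    }
  }
}

// Made-up input, roughly what a checker-backed front end might produce.
const user: IR = {
  kind: "object",
  name: "User",
  fields: {
    id: { kind: "number" },
    role: {
      kind: "union",
      members: [
        { kind: "literal", value: "admin" },
        { kind: "literal", value: "guest" },
      ],
    },
    tags: { kind: "array", element: { kind: "string" } },
  },
};

const decls: string[] = [];
const rootName = emitPython(user, decls);
const pythonSource = decls.join("\n\n");
```

The point of the design is that the emitter never looks at TypeScript syntax; everything hard (generics, mapped types) is assumed to have been resolved before the IR is built.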

So far, I have compilers from TypeScript types to Python 3 and Thrift. But I plan to add OpenAPI/JSONSchema, Protobuf (Proto3), Kotlin, Swift, and maybe Zod and Avro. Each target language is around 300 LoC, so they're pretty easy to put together.

My ultimate goal is to use this toolkit with Notion's internal types, which are quite a bit more complex than something like ts-to-zod can handle.

Repo: https://github.com/justjake/ts-simple-type

Compiler input and output: https://github.com/justjake/ts-simple-type/blob/jake--compil...

Thrift compiler: https://github.com/justjake/ts-simple-type/blob/jake--compil...

Python compiler: https://github.com/justjake/ts-simple-type/blob/jake--compil...

eropple · on July 28, 2022

> Most do not support advanced TypeScript features like generic application, mapped types, or string literal types. So, what's the point?

I use @sinclair/typebox and the point is that I can get a JSON Schema out the other side, usable in-process and as a communicative description of what my code expects, just by evaluating my code and printing out an object.

It does support string literal types, FWIW. I'm good with it not supporting generics, because the places where I need an interchange format generally don't. Mapped types would be nice but aren't really critical to where I need to generate interchange formats; dependent types that result from transforming these can use mapped types, so it's close enough.

For my money, typebox has far-and-away the best user experience of any of the flavor of libraries you described, and IMO is probably worth studying.

jitl · on July 28, 2022

I agree that the runtime-validator-with-type-inference library genre is generally enough for API surface area. We have a similar internal library, including JSON schema output, and it’s serviceable for checking our public & private API incoming requests.

It does come with some costs. First, the type inference using these large mapped types (at least in our implementation and/or scale) noticeably slows down our TypeScript type checking; the public API files with our runtime type declarations always top our build profiling. Second, it’s annoying to need to re-implement an existing type as a runtime type if you start to need it in an API specification. There’s a tension in the codebase between using the easily available, language-native type system and making do with the runtime type stuff, which increases verbosity/noise and decreases expressiveness.

We have thousands of lines of types, including many types inferred from `as const` literals, written in TypeScript. Rewriting those types in another schema language, such as typebox or Protobuf, would take a bunch of time and spread that tension from an isolated spot (backend API handlers) into every part of the codebase. It imposes new constraints, and requires devs to learn more things.
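
For anyone unfamiliar with the `as const` pattern, here is a small sketch of types derived from runtime literals. The block names and defaults are invented for illustration and are not our actual types.

```typescript
// Hypothetical example data, only to illustrate the pattern.
const BLOCK_TYPES = ["text", "heading", "todo"] as const;

// "text" | "heading" | "todo", derived from the runtime array above.
type BlockType = (typeof BLOCK_TYPES)[number];

const DEFAULTS = {
  text: { content: "" },
  heading: { content: "", level: 1 },
  todo: { content: "", checked: false },
} as const;

// A mapped type over the literal: each block's props come from its default.
type BlockProps = { [K in BlockType]: (typeof DEFAULTS)[K] };

// The same array doubles as the runtime check for the derived union.
function isBlockType(value: string): value is BlockType {
  return (BLOCK_TYPES as readonly string[]).indexOf(value) !== -1;
}

// Type-checks because BlockProps["heading"] is inferred from DEFAULTS.
const headingDefaults: BlockProps["heading"] = DEFAULTS.heading;
```

Types like `BlockType` and `BlockProps` only exist after the checker evaluates `typeof`, indexed access, and the mapped type, which is why a tool that just reads the AST has trouble with them.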

The libraries I listed and the compiler I’m building try to address this issue by supporting the native TypeScript declaration syntax. No more red types and blue types, a much better gradual adoption pathway, and ideally one less thingy to learn for most developers.

gavinray · on July 28, 2022

Wow, holy smokes. This is incredible, thanks for posting.

brodo · on July 28, 2022

Looks promising. Thanks for sharing. This also needs a CLI. I was thinking about building something like this in Rust using swc.

jitl · on July 28, 2022

Swc will give you an AST, but you’ll still need to re-implement mapped types, type inference, template string types, etc - a large part of the TS type checker. Although, the author of swc is supposedly building a full type checker.

> This also needs a CLI.

I’m not planning on implementing one at the moment. The project goal is to greatly expand the use cases for TypeScript types by making it much easier to build custom compile targets.

A CLI means directing much more of my energy to configuration logic and file system IO. As a library, it’s clear how users should “configure” behavior - write or subclass a compiler target and implement whatever you want!

brodo · on Aug 3, 2022

Ah - I see. I've started to build a CLI around it and adapted the `SimpleTypeCompiler` example to build simple JSON schema files. I'm struggling with the API a bit, but I think I can manage.

vlovich123 · on July 28, 2022

I don’t know. Zod feels like a more reasonable path forward to accomplish the same thing (granted, it’s not using the language of JSON schema, if you have a dependency on that for some reason).

girvo · on July 28, 2022

Zod is fantastic. Our whole front-end is built around it, with some React hooks that wrap up a `createApi` call around Zod decoders and expose them to components/containers. It's lovely. Only downside is those decoders can't be re-used across other languages, whereas technically this could be, I guess?

evv · on July 28, 2022

Zod is great, but JSON-schema is my pick because it is easily serialized. So a client can query for the schema of the data it should send or receive, including descriptions/metadata that is useful for presenting a UI.
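
A rough sketch of what I mean, with a made-up schema: because a JSON Schema is plain data, it survives a trip over the wire and the client can read the metadata back out. The `signupSchema` and `formLabels` names are hypothetical, and the cast after `JSON.parse` is unvalidated for brevity.

```typescript
// A made-up schema; plain data, so JSON.stringify loses nothing.
const signupSchema = {
  type: "object",
  required: ["email"],
  properties: {
    email: { type: "string", title: "Email", description: "Where we send the confirmation link." },
    age: { type: "integer", title: "Age", description: "Optional; used for statistics only." },
  },
};

// What a server might send when a client queries for the schema.
const wire: string = JSON.stringify(signupSchema);

// Just enough structure for this sketch, not a full JSON Schema typing.
interface FieldSchema {
  type: string;
  title?: string;
  description?: string;
}
interface ObjectSchema {
  required?: string[];
  properties: Record<string, FieldSchema>;
}

// Client side: rebuild form labels from the schema's own metadata.
function formLabels(schemaJson: string): string[] {
  const schema = JSON.parse(schemaJson) as ObjectSchema; // unchecked cast, sketch only
  const required = schema.required ?? [];
  return Object.entries(schema.properties).map(([name, field]) => {
    const marker = required.indexOf(name) !== -1 ? " *" : "";
    return `${field.title ?? name}${marker}: ${field.description ?? ""}`;
  });
}

const labels = formLabels(wire);
```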

vlovich123 · on July 28, 2022

Zod to JSON schema is a thing so I think that solves your problem, no?

https://www.npmjs.com/package/zod-to-json-schema

evv · on July 28, 2022

I didn't know about this!

But on the client I would use a JSON-schema validator anyways, otherwise I'd need something like "json-schema-to-zod"? Seems like a roundabout way to address my problem

tmerse · on July 28, 2022

You happen to know if there are libs/utils to achieve the reverse (JSON-schema -> zod)? Could be useful for one of my use cases (form validation via rules received as JSON).

dgb23 · on July 28, 2022

Does zod let me walk/parse the definitions as data?

jitl · on July 28, 2022

The last time I tried this, I remember feeling sad and needing to wrap Zod in my own layer of stuff.

vlovich123 · on July 28, 2022

I can’t recall if it’s particularly ergonomic, but technically I believe it’s possible.

seanwilson · on July 28, 2022

I wish there would be an official TypeScript solution for this. I know there's a TypeScript goal that TypeScript types should not impact runtime behaviour, but I think converting arbitrary JSON from a file or network to a typed TypeScript object is such a common use case it needs a standardised solution. There's tens of libraries that try to solve this now in different ways, all working around this area where TypeScript doesn't try to help.
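
To illustrate the gap, here is a minimal sketch of what you end up writing by hand today. The `User` shape and the function names are made up for the example.

```typescript
interface User {
  id: number;
  name: string;
}

// Hand-written guard: nothing ties it to `User`, so the two can silently
// drift apart when the interface changes.
function isUser(value: unknown): value is User {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record.id === "number" && typeof record.name === "string";
}

// JSON.parse returns `any`; typing the result as `unknown` forces the check.
function parseUser(json: string): User {
  const data: unknown = JSON.parse(json);
  if (!isUser(data)) throw new TypeError("payload is not a User");
  return data;
}
```

Every library in this space is some way of generating or avoiding the `isUser` half of that.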

jitl · on July 28, 2022

Deepkit is quite interesting in this area: https://deepkit.io/

It reimplements the type system as… a stack based VM that expands the TS typedefs if needed at runtime?

I would use it if the risk factor wasn’t so high. I’m too scared of needing to maintain their crazy cool stuff to use it.

I would love Microsoft to build those APIs first party and officially support them.

EDIT:

> I know there's a TypeScript goal that TypeScript types should not impact runtime behaviour

This is why I’m rolling up my sleeves to build my own typescript-to-X tooling - it seems like no one is gonna do it for me, just the way I like — so I should make it easy for everyone to do it the way they like.

mirekrusin · on July 28, 2022

To be honest it doesn't even have to do anything during runtime per se.

It just needs to have a nice macro system.

It would solve this problem and others, e.g. pattern matching.

jitl · on July 28, 2022

there's always https://www.sweetjs.org/ that you can plug into your Babel pipeline

mirekrusin · on July 29, 2022

Sweetjs has unfortunately been dead [0] for about 5 years now.

It also doesn't have any typescript awareness which is required to build this kind of functionality - you want to have static type introspection available in macros so you can generate code based on provided types.

[0] https://github.com/sweet-js/sweet-core/graphs/contributors

lifthrasiir · on July 28, 2022

I made a similar thing a while back, though unfortunately the code isn't public. The idea itself is simple; it was approximately something like this:

    const kind = Symbol();
    type Spec = "unknown" | "string" | "number" | ... |
        { [kind]: "optional"; spec: Spec } |
        { [kind]: "array"; element: Spec } |
        { [P in string | number | symbol]: Spec } | ...;

    function optional<S extends Spec>(spec: S): { [kind]: "optional"; spec: S } { ... }
    function arrayOf<E extends Spec>(element: E): { [kind]: "array"; element: E } { ... }
    // and so on, you get the idea. these helper functions exist so that
    // you can write `optional(...)` or `arrayOf(...)` instead of the full specification

    type Validated<S> =
        S extends "unknown" ? unknown :
        ...
        S extends { [kind]: "array"; element: infer E } ? Array<Validated<E>> :
        S extends { [P in string | number | symbol]: Spec } ? { [P in keyof S]: Validated<S[P]> } :
        ...;

    function validate<S extends Spec>(value: unknown, spec: S): asserts value is Validated<S> {
        if (typeof spec === "string") {
            switch (spec) { ... }
        } else {
            switch (spec[kind]) { ... }
        }
    }

The actual implementation of course had to take care of implicit nulls and other caveats of the TypeScript type system.
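
For reference, a cut-down version that should compile as written might look like the following. It is a sketch with the elided parts filled in by assumption (only three primitive specs, a `check` helper that is not in the outline above, a generic error message) and it does not handle the null caveats; it is not the original code.

```typescript
const kind = Symbol("kind");

type OptionalSpec = { [kind]: "optional"; spec: Spec };
type ArraySpec = { [kind]: "array"; element: Spec };
type ObjectSpec = { [key: string]: Spec };
type Spec = "unknown" | "string" | "number" | OptionalSpec | ArraySpec | ObjectSpec;

function optional<S extends Spec>(spec: S): { [kind]: "optional"; spec: S } {
  return { [kind]: "optional", spec };
}
function arrayOf<E extends Spec>(element: E): { [kind]: "array"; element: E } {
  return { [kind]: "array", element };
}

// Map a spec type to the type of values it accepts.
type Validated<S> =
  S extends "unknown" ? unknown :
  S extends "string" ? string :
  S extends "number" ? number :
  S extends { [kind]: "optional"; spec: infer I } ? Validated<I> | undefined :
  S extends { [kind]: "array"; element: infer E } ? Array<Validated<E>> :
  S extends ObjectSpec ? { [P in keyof S]: Validated<S[P]> } :
  never;

// Runtime half (assumed helper): walk the spec and the value together.
function check(value: unknown, spec: Spec): boolean {
  if (typeof spec === "string") {
    return spec === "unknown" || typeof value === spec;
  }
  const tag = (spec as { [kind]?: string })[kind];
  if (tag === "optional") {
    return value === undefined || check(value, (spec as OptionalSpec).spec);
  }
  if (tag === "array") {
    const element = (spec as ArraySpec).element;
    return Array.isArray(value) && value.every((item) => check(item, element));
  }
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  // Object.entries skips symbol keys, so only declared fields are checked.
  return Object.entries(spec as ObjectSpec).every(
    ([key, fieldSpec]) => check(record[key], fieldSpec)
  );
}

function validate<S extends Spec>(value: unknown, spec: S): asserts value is Validated<S> {
  if (!check(value, spec)) throw new TypeError("value does not match spec");
}

// `as const` keeps "string" as a literal type instead of widening it.
const userSpec = {
  name: "string",
  tags: arrayOf("string"),
  nick: optional("string"),
} as const;

const payload: unknown = JSON.parse('{"name":"Ada","tags":["a","b"]}');
validate(payload, userSpec);
// After the assertion, `payload.tags` is typed as string[].
const tagCount: number = payload.tags.length;
```

Extra keys on the value are ignored here, and a missing key is treated the same as `undefined`; a stricter variant would need to decide both explicitly.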

iainmerrick · on July 28, 2022

This looks really nice as an alternative to JSON schema, which is horribly verbose. Like using RELAX NG compact syntax for XML schemas back in the day, versus horrible verbose DTDs or XSDs.

But I think the most natural way to express JSON schemas would be TypeScript type declarations. Is there a project that can take a TS type and generate a runtime parser/validator for it?

I thought maybe Quicktype (https://quicktype.io/) was that, but it looks like it only takes JSON as input, not TS types.

xmonkee · on July 28, 2022

YES! I've introduced ts-to-zod into my codebase only yesterday, and it's served my needs exactly. It doesn't handle generics or mapped types, but that's a fair price to pay.

It also works the other way, you can define a zod schema and you can infer types for your data from it.

https://github.com/fabien0102/ts-to-zod

hebrox · on July 28, 2022

We're using TypeBox [0]: you use TypeScript to create JSON Schemas and it gives you a compile-time type.

From the documentation:

    const T = Type.String()     // const T = { type: 'string' }
    type T = Static<typeof T>   // type T = string

The main thing we're using it for now is defining an OpenAPI schema and using the types in the front-end and backend, and using the schema to validate requests and responses, both client-side and server-side. Validation is done using ajv [1].

We used Joi before and want to replace that code with JSON Schema, but that is quite the hassle. I like that TypeBox uses JSON Schema underneath, so migrating to another library should be easy.

[0] https://github.com/sinclairzx81/typebox

[1] https://ajv.js.org/

varanauskas · on July 28, 2022

For those wanting to infer Typescript types from JSONSchema itself there is: https://github.com/lbguilherme/as-typed

evv · on July 28, 2022

Nice! I have been happily using json-schema-to-ts for this exact thing.

And if you're doing code generation, there is also json-schema-to-typescript

Yaina · on July 28, 2022

I recently used a package that converts typescript to JSON Schema, which I thought was pretty nice.

I get the legibility (and IntelliSense) of TypeScript interfaces but can make use of all the validation libraries that use JSON Schema.

wryun · on July 28, 2022

Even more minimal one I made years ago:

https://wryun.github.io/yajsonschema/

mirekrusin · on July 28, 2022

I'm a fan of point-free combinators [0] [1], which compose in an intuitive way.

[0] https://github.com/appliedblockchain/assert-combinators

[1] https://github.com/preludejs/refute/

brodo · on July 28, 2022

I like the idea. Writing JSON schema by hand is cumbersome. I would like to be able to generate JSON schema from this.

dgb23 · on July 28, 2022

Just to be sure: You still find it cumbersome with an editor like VSCode where you get autocompletion for pretty much all the stuff json-schema can do?

brodo · on July 28, 2022

Yeah. Because of that, I now often use Jsonnet to generate my schemas.

metadat · on July 28, 2022

Unfortunately excessively verbose and non-portable to backends other than Node.js.

This is a good start, but what problems does this solve IRL? As in, actual problems that happen? (Not just theoretically possible classes of problems.) I get the narrowest use-case, but beyond that this seems doomed to become irrelevant over time rather than gain momentum and become something meaningful in any lasting sense.

Think bigger.

Then the meta solution will become self-evident.

the_gipsy · on July 28, 2022

> doomed to become irrelevant over time rather than gain momentum and become something meaningful in any lasting sense.

You seem to be jumping from topic to topic without being able to express your thoughts very well.

skywal_l · on July 28, 2022

What do you mean by excessively verbose? They seem to make a point at being minimalist and the comparison with JSON schema in the README page seems to attest to it.

metadat · on July 28, 2022

Compared to a struct in C, Rust, Golang, or Java it takes more key presses on behalf of the programmer and only works in (/is optimized for) a narrowly specific set of circumstances.

Convenient in some ways (e.g. if you are only comfortable in JS), and does nothing useful in other contexts.

Nothing inherently wrong with it, other than it seems like a missed opportunity.

Is this more than a subset of Wordnik's Swagger?

https://swagger.io/

arinlen · on July 28, 2022

> Compared to a struct in C (...)

That's hardly a relevant point or meaningful comparison. The discussion is about schema validators for TypeScript/JavaScript.

metadat · on July 28, 2022

Why limit to only the (frontend | Node.js)?

codewithcheese · on July 28, 2022

Swagger uses JSON Schema to describe the data formats.

https://swagger.io/docs/specification/data-models/keywords/

This is an easier-to-write, easier-to-read alternative that is compatible with JSON Schema. Spartan Schema could be used instead of JSON Schema to make the document easier for humans to work with.

codewithcheese · on July 28, 2022

> At the core, is this more than a JSON-schema generator bundled with a check function? Could it be >?

> Then the meta solution will become self-evident.

> Think bigger.

Are you going to share your big idea?

bradwood · on July 28, 2022

Avro? Protobuf?

dtech · on July 28, 2022

Those are not in any way full alternatives to JSON schema

metadat · on July 28, 2022

At the core, is this more than a JSON-schema generator bundled with a check function? Could it be >?