Interactive Docs with Markdoc (stripe.com)
157 points by tomger on Sept 15, 2022 | 47 comments



I've always loved Stripe's interactive documentation, wondered how they did it and hoped they'd open-source it. I totally missed the news that they did back in May.

Now, it seems that 99% of documentation tooling advertised is Markdown based; and e-v-e-r-y time I see this on HN I wonder why AsciiDoc[0] isn't more prevalent than Markdown.

AsciiDoc is older than Markdown and supports, out of the box, things that Markdown doesn't. Those gaps in Markdown have led to a proliferation of Markdown flavors that remind me of the web browser experience of the 2000s.

It was Stripe's documentation that led me, a few years back, to look into replicating it, particularly the interactive part. While I wasn't able to find any tool that did this out of the box, that's how I found out about AsciiDoc in the first place, and then Antora.

Antora[1] is based on AsciiDoc and lets you create amazing documentation websites. A while back I experimented and wrote a PoC of a plugin for Antora that made my documentation interactive.

I also believe that AsciiDoc is what O'Reilly authors use to write books (edit: it is one of three languages they use[2]: AsciiDoc, HTMLBook, or DocBook XML).

I wish AsciiDoc was more commonly used.

[0]: https://asciidoc.org/

[1]: https://antora.org/

[2]: https://docs.atlas.oreilly.com/index.html#what-is-atlas-DNIR...


Hey, I created Markdoc at Stripe. Back in 2017 when I first started exploring options for improving the authoring experience, my original prototype actually used AsciiDoc (via AsciiDoctor). I had a lot of trouble getting buy-in for AsciiDoc from the users, who universally preferred Markdown, and their feedback is what ultimately led me to create Markdoc instead.

AsciiDoc gets a lot of things right, but it also has a lot of syntactic complexity and idiosyncrasies that were challenging for our documentation contributors. In documents with structural complexity, it becomes unwieldy very quickly. For example, having to vary the length of the delimiter line when nesting delimited blocks[1] is pretty arcane and leaves a lot of room for errors to creep in when reordering pieces of content. Even though AsciiDoc's extensibility model was compelling to us, it's really not designed for the kind of deeply nested hierarchies that we needed in order to express things like our integration builder[2] UI.

Anyhow, we have a section in our FAQ[3] that provides more context about why we didn't choose AsciiDoc. I remain a fan of AsciiDoc even though it didn't fully meet our needs, and Markdoc was definitely influenced by the things that we thought AsciiDoc got right.

[1]: https://docs.asciidoctor.org/asciidoc/latest/blocks/delimite...

[2]: https://stripe.com/docs/payments/quickstart

[3]: https://markdoc.dev/docs/faq#why-not-asciidoc


I've faced the same issue regarding buy-in. For some reason, lots of users flat out refuse to use AsciiDoc. I think this reaction is purely anxiety around having to (1) learn something new (a new language) (2) in order to do something that most people find unpleasant (writing documentation).

> but it also has a lot of syntactic complexity and idiosyncrasies

I don't think that's true; for basic usage, which is probably 90% of what people use, Markdown and AsciiDoc syntax are basically the same[0] (AsciiDoc is also compatible with Markdown's syntax).

For advanced use, Markdown syntax has to be augmented via one of the many "flavors" (i.e. no real advanced features out of the box) and these flavors usually mean typing raw HTML directly in the documentation.

AsciiDoc's advanced syntax, on the other hand, is available out of the box for whoever wants to use it, but you don't have to use it.

[0]: https://asciidoc.org/#compare


AsciiDoc's flagship feature is the include directive (think partials). GitHub rendering doesn't support it. On GitLab you are limited to ~30 includes, after which your document just spits out source. So in most cases, you can't preview AsciiDoc files that use the include directive without cloning the repo and previewing it locally.

When you take away the include directive you have a less elegant syntax that is harder for contributors to read, much less adopt. Add to that a very small community and a lack of critical tools. Most glaringly, there is no linter for AsciiDoc.

AsciiDoc just doesn't have advantages over Markdown (including partials, as Markdoc and other tools used by IBM and Microsoft allow them). Meanwhile Markdown renders everywhere, and it's highly legible as source.


> AsciiDoc's flagship feature is the include directive (think partials). GitHub rendering doesn't support it. On GitLab you are limited to ~30 includes, after which your document just spits out source. So in most cases, you can't preview AsciiDoc files that use the include directive without cloning the repo and previewing it locally.

Does Markdown support partials/includes, and does GitHub honor them? I don't believe it does (I tried googling it and didn't find anything, but I could be wrong), so I'm not sure how this can be used as an argument against AsciiDoc.

> When you take away the include directive you have a less elegant syntax that is harder for contributors to read, much less adopt.

Markdown and AsciiDoc syntax are extremely similar[0], and AsciiDoc supports Markdown's syntax.

[0]: https://asciidoc.org/#compare


I've used both for a decade or more, and my opinion is that includes/partials is the only thing AsciiDoc offers that even justifies considering it as a markup language for docs, and then only if (big if!) using includes/partials without React is so important that you'd consider giving up all the benefits of Markdown.

IBM, Microsoft and Stripe all have tooling that enables Markdown includes/partials. None of those will render natively in GitHub or GitLab, but if you really need that feature you probably don't care, as the AsciiDoc support is also so poor.

I think if AsciiDoc and Markdown were in the same league from a writing and reading perspective, AsciiDoc might have more traction. Other comments in this thread support the case that it's just harder to write and read as markup.

A whole other topic is how Markdown has been adopted by React-driven systems. This just solidifies Markdown's position as the lingua franca.


I tried Asciidoc last time it got a shout out on here, possibly by yourself. On a totally personal level, I found it really ugly and clunky to write, like going from Python to PHP.


Markdown (with flavouring; the plain one is just too poor) still looks better in its text form, IMO, which was the main point of it. AsciiDoc looks a bit worse as pure text (although not bad).

But it is probably just plainly "it happened that Big Things adopted it, so more Big Things saw it and also adopted it"


Markdoc's syntax and capabilities look so much like MDX (https://mdxjs.com). Can anyone who's evaluated Markdoc and MDX 2 comment?

I'm currently doing an architecture decision record about Markdown documentation, and will add Markdoc to the candidates. The leaders so far are MDX 2 with plugins for JSX-style work, and Svelte for a fully dynamic site.

I'm aware of the Markdoc page about "Why not MDX?" which explains that Markdoc is deliberately less capable than MDX. But the page doesn't show how to handle what are (IMHO) typical needs, such as loops or substitutions. And for simple writing, how does it compare with standard Markdown annotated with tags/templates using Liquid, Jinja, or similar?


Seconding this, and if anyone is willing to add a performance angle in their assessment, that'd be great too.

I haven't evaluated Markdoc fully, but I did have some experience using both MDX 2 and markdown-it—the parser used by Markdoc. I can honestly say that MDX 2 (and remark for that matter) is a lot slower in parsing bigger markdown files. Anecdotally, I saw markdown-it outperforming remark by a factor of 20 in terms of generating the file's AST alone.

Markdoc with its added complexity may add some penalty to the parsing performance, but does it do so to a point where it gets considerably closer to MDX? I doubt that this is the case, but would love to see someone verify this.


(I'm an engineer on the Stripe Docs team!)

The primary difference between MDX and Markdoc is that an MDX article is essentially imperative code (like a typical React function containing JSX), whereas a Markdoc article is purely declarative. That makes it easier to reason about what's being rendered, and makes things like static analysis much more straightforward.

With MDX, you can essentially write arbitrary JavaScript code in articles. We had a similar situation when we used ERB, and it resulted in markup which was easy to write but very difficult to read. Markdoc strikes a better balance between the two in part because it has better guardrails and stronger division between content and code. It is a tradeoff, though.
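To make the static-analysis point concrete, here is a minimal sketch in plain JavaScript. It assumes a hypothetical node shape (`type`, `name`, `attributes`, `children`) that is invented for illustration and is not Markdoc's actual AST. Because the article is plain data, a check such as "which tags does this article use, and are they all registered?" can run without executing or rendering anything.

```javascript
// Hypothetical declarative article: plain data, no executable code.
// The node shape is an assumption for illustration, not Markdoc's real AST.
const article = {
  type: "document",
  children: [
    {
      type: "heading",
      attributes: { level: 1 },
      children: [{ type: "text", content: "Quickstart" }],
    },
    {
      type: "tag",
      name: "callout",
      attributes: { type: "check" },
      children: [{ type: "text", content: "Use test keys first." }],
    },
    { type: "tag", name: "tabs", attributes: {}, children: [] },
  ],
};

// Walk the tree and collect every tag name without rendering anything.
function collectTagNames(node, found = new Set()) {
  if (node.type === "tag") found.add(node.name);
  for (const child of node.children ?? []) collectTagNames(child, found);
  return found;
}

// Report tags that the article uses but that are not in the allowed set.
function findUnknownTags(node, allowed) {
  return [...collectTagNames(node)].filter((name) => !allowed.has(name)).sort();
}

const unknown = findUnknownTags(article, new Set(["callout"]));
console.log(unknown);
```

The same question is much harder to answer for an article that can contain arbitrary JavaScript, since the set of rendered components may only be known at run time.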


In my experience, nothing beats mdsvex (https://mdsvex.com) when it comes to simplicity and power. It’s a staple in the Svelte community and authored by a core Svelte maintainer.

If I’m honest though, the one thing that can beat it for me is my guilty pleasure - pug in Svelte. If a better way to author web pages exists, I’ve never seen it!


Really cool of @koomen to donate the domain, from the github repo (https://github.com/markdoc/markdoc):

  Special shout out to:
 
  @marcioAlmada for providing us with the @markdoc GitHub org.
  @koomen for gifting us https://markdoc.dev.


Big yes! Huge thanks to @koomen for this.


Yet another company-specific reinvention of a wheel from somebody with too much money and time on their hands, and one that will be abandoned after a few years.

Documentation is information intended for humans to understand complex topics. Code is information intended for machines to understand complex topics. Code makes shitty documentation. Code is time-consuming, error-prone, nit-picky, and extremely verbose compared to the kinds of tools designed for humans: WYSIWYG editors, buttons, windows, graphics, diagrams, audio, video.

Developers today have grown up in a world where doing anything requires writing code. They literally do not understand how to solve problems without writing code. And have even fooled themselves into thinking that writing code is a superior way for humans to solve a problem than clicking a button. So rather than have robust tools that solve our problems, we now only make tools that require us to create our own customized tools to solve our problems. Forever making slapped-together jigs for every project, rather than just buying and using a premade tool.

Confluence is what a documentation-creating solution should be. Not only does it have a WYSIWYG editor, it has custom plugins, rich content, and dynamic components, all of which can be added anywhere, immediately, with a live preview, with no training, no code, no syntax. Simply an interface for an average human to make a beautiful, useful document, quickly and easily.

From their page: "I like Markdoc because it lets us still do anything we want with code in the docs without bogging down the content authoring experience. If we need some new component, designers and engineers can whip that up. So as a writer, I can work in the docs content and stay focused."

You can do the same thing with Confluence. They just had Not-Invented-Here syndrome, and wanted to do something with their excess engineers.


You are aware that Confluence is a company specific, and proprietary, reinvention of a wheel (wikis)? And it's not even a particularly good one. It's mainly popular because it gets bundled with Jira by Atlassian. People use it because it is there; not because it is particularly good at what it does.

Your central thesis is that code and documentation are two very different things. Donald Knuth, who knows a thing or two about writing both documentation and code, invented literate programming because he disagrees with that. The key point of literate programming is that writing good code is a form of documenting your thought process for the benefit of other programmers who may have to maintain and understand your code.

Confluence is part of the reason I run an Atlassian-free company. I've never been impressed with Atlassian tooling. Most of the tools I use instead support things like Markdown and some other sane things. It's fine as a poor man's word processor but it's a bit of a straitjacket. Don't get me started on Jira.


> You are aware that Confluence is a company specific, and proprietary, reinvention of a wheel (wikis)?

Am I aware that Confluence is a product created by a company for a profit in order to provide an advanced implementation of a design for specific use cases? Yes.

> And it's not even a particularly good one.

I have used 20 different wikis. None of them provide all Confluence's features or are nearly as user-friendly. Very few of them provide a robust API. Very few of them have as seamless a WYSIWYG, not to mention native Markdown support. Very few of them have advanced management capabilities. None of them have as many advanced dynamic content features, very few of them have marketplaces. It is hands-down the best tool for what it is designed for.

> Donald Knuth, who knows a thing or two about writing both documentation and code, invented literate programming because he disagrees with that.

Literate programming is ridiculous. The idea is to use a human language to tell a story and have that story be converted into computer code, with the idea that the human will understand the story better and somehow this will result in them being able to write better code. But that's stupid for two reasons: 1) Writing a story is hard. Communication is hard. Most people just suck at it. The idea that a human will be automatically good at telling a story is in itself ridiculous. 2) Humans just suck at writing code, no matter how well they understand its purpose. Traditional architecture is intended to formalize the purpose and function of the program and can result in perfectly adequate code, assuming the human actually learns how to do their job properly, which I admit is a pretty big ask for most people in tech today.

If you don't like Confluence and Jira it's probably because you haven't learned how to use them (or you just don't like that they're not free). They are incredibly flexible and extensible and the marketplace is full of tools to expand their functionality, even though the base functionality is already very powerful.


Maybe all those people who use plain text tooling for writing content, do it because... they prefer it over wysiwyg?

Confluence UX is horrible. There are a number of reasons for that, but just one is enough for me to reject it: you can't create or edit content in your own, personally optimized, automatable, text editing IDE. (And if you actually can, guess what? That's another argument against the usability of Confluence, because I was either unable to find or unable to configure such a feature.)

Obviously it's important for an average user to be able to edit content. But there are options that don't lock out the programmers who legitimately prefer plaintext editing.


I don’t disagree but Stripe’s case makes sense.

They are making documentation for developers.

And in fact, having good documentation and developer tooling is a core part of their value proposition.

Making it easy to look at documentation and immediately implement something that works is a core part of the Stripe UX.

So documentation can be considered as part of their product offering.

In that way it makes sense for them for it to be driven by code.


There's a dozen code documentation solutions out there. They could have adopted one that already handled Markdown and simply added extensions to it; instead they reinvented the wheel.


They sell an API product, and documentation is an absolutely core, user-facing part of it. It doesn’t make sense not to own something that differentiates your product’s value proposition.


Should probably mention mdx since it's a much better alternative that has been available for a while


Can you say more about why you think MDX is better?

For what it's worth, we considered MDX, but chose not to use it. Full explanation here: https://markdoc.dev/docs/faq#why-not-mdx


Speaking for myself, I'm tired of learning yet another templating language re-implementing basic features like if, else, and for loops when I could just use an existing language with a few additions. Learning HTML is pretty easy, even for non-engineers.

Doesn't this syntax:

    <callout type="check">...</callout>

look better than:

    {% callout type="check" %}...{% /callout %}

?


I think the point is being able to then use markdown inside of that block, since many people seem to think that "**hello**" does look better than "<b>hello</b>". That being said, it would perhaps have been better to allow for HTML tags inside the markdown that themselves can have markdown inside of them (I'm not sure if this is the default behavior of markdown or not, or whether there are any weird parsing pitfalls in allowing this).

I think that perhaps there is also a contingent of people that have been using templates for ages that have the "{% %}" style, so this is maybe attractive to those folks?

To be clear, I agree with you, but I am just trying to figure out why this syntax would be chosen.

Edit: Perhaps there is also some sort of HTML injection argument against using real tags? That is to say, if you have an interface that allows users to input markdown, then with "{% %}" you can easily filter the allowable template tags, but perhaps it is just more error prone to try to handle "<>" tags that might then themselves get inserted into live HTML. I haven't thought it all the way through but just wondering if there is a non-stylistic argument for it.
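The filtering idea in that edit could be sketched roughly as follows. This is a simplified illustration in plain JavaScript, not Markdoc's own validation: `TAG_PATTERN` and `findDisallowedTags` are hypothetical names, and the pattern assumes attribute values never contain a `%` character.

```javascript
// Simplified pattern for {% name ... %}, {% /name %} and {% name /%}.
// Assumption: attribute values never contain a "%" character.
const TAG_PATTERN = /\{%\s*\/?\s*([A-Za-z][\w-]*)[^%]*%\}/g;

// Return the names of all template tags that are not on the allowlist.
function findDisallowedTags(source, allowedTags) {
  const disallowed = new Set();
  for (const match of source.matchAll(TAG_PATTERN)) {
    if (!allowedTags.has(match[1])) disallowed.add(match[1]);
  }
  return [...disallowed];
}

// Hypothetical user-submitted content with one allowed and one unknown tag.
const userInput = [
  '{% callout type="check" %}',
  "Payment **succeeded**.",
  "{% /callout %}",
  "{% script %}alert(1){% /script %}",
].join("\n");

const rejected = findDisallowedTags(userInput, new Set(["callout"]));
console.log(rejected);
```

Since the template tags never reach the browser as literal markup, anything not on the allowlist can be rejected before rendering; raw `<...>` tags would need a proper HTML sanitizer to get the same guarantee.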


ColdFusion fan? I still have a soft-spot for ColdFusion, even though like most other ColdFusion developers I moved on to both PHP and Java, and then to other pastures entirely. ColdFusion's embrace of Java spelled its demise, becoming lost in the swamp of enterprise complexity.


I didn't see this until after I posted a similar example. Yes, I agree 100%.


I haven't experienced the supposed concern of complexity with MDX. I have a docs site with 1600+ md pages. In practice I've exposed a small number of React components that doc writers can use and that's it. Sure, in theory they could introduce new complexity, but they don't. Further, you can introduce new Markdown primitives with remark that get transformed to React components if you want to hide some of that complexity from writers. For example, we have one that auto-collapses adjacent fenced code blocks of different languages into one React component with a built-in language switcher.
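As a rough sketch of what such a transform might look like (not the actual plugin described above): remark works on mdast trees, where a fenced code block is a node with `type: "code"`, a `lang`, and a `value`. The function below is plain JavaScript with no dependencies; the `codeGroup` node type and the function name are hypothetical, and a real plugin would still need to map that node to a React component.

```javascript
// Group runs of adjacent fenced code blocks that use different languages
// into a single hypothetical "codeGroup" node. Input and output are
// mdast-like trees; "codeGroup" is an invented node type.
function groupAdjacentCodeBlocks(root) {
  const children = [];
  let run = [];

  // Emit the current run: wrap it if it holds more than one code block.
  const flush = () => {
    if (run.length > 1) {
      children.push({ type: "codeGroup", children: run });
    } else {
      children.push(...run);
    }
    run = [];
  };

  for (const node of root.children) {
    const isCode = node.type === "code";
    const repeatsLang = isCode && run.some((n) => n.lang === node.lang);
    // A non-code node or a repeated language ends the current run.
    if (!isCode || repeatsLang) flush();
    if (isCode) run.push(node);
    else children.push(node);
  }
  flush();

  return { ...root, children };
}

const tree = {
  type: "root",
  children: [
    { type: "paragraph", children: [{ type: "text", value: "Install:" }] },
    { type: "code", lang: "bash", value: "npm install example" },
    { type: "code", lang: "ruby", value: "gem install example" },
    { type: "paragraph", children: [{ type: "text", value: "Done." }] },
    { type: "code", lang: "js", value: "run();" },
  ],
};

const grouped = groupAdjacentCodeBlocks(tree);
console.log(grouped.children.map((node) => node.type));
```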


Any pointers to code you could share?


Chase McCoy had a good note explaining it: https://chasem.co/2022/05/markdoc

Basically because MDX mixes JSX and markdown, you need knowledge of JSX/JS (which non-devs might not have), and tooling dedicated to build, parse it and so on. Markdoc is more of a "separation of concerns" approach.


Oh man, are we swinging back the other way again? lol


Which docs are built with MDX that you prefer to the docs at Stripe?


I've tried to implement documentation using Markdoc, but failed to understand what advantages it brings over tools like mkdocs [0] and Material [1].

[0] https://www.mkdocs.org/

[1] https://squidfunk.github.io/mkdocs-material/


I've recently been using Markdown (to my utter chagrin) in order to update my resume and blog, and I'm just astounded by how much recreation of solved problems is going on in that space. I prefer JavaScript/Node so I've played with UnifiedJS (remark/rehype), MarkdownIt, MarkedJS, and others. It's honestly just absurd how over-engineered each project is.

If all you're producing is HTML - which is 99% of what Markdown is used for - having to deal with a raw AST in order to modify a document is like deciding to use Assembly instead of Python. Sure, it could be more efficient if you really want to spend the time, but it's generally a step backwards in every way.

I've personally decided that the only predictable, reliable and maintainable way to deal with Markdown is to extract the frontmatter, then convert the rest into bog-standard CommonMark HTML and then use JSDOM to do any additional manipulation. So instead of fighting with some wonky AST tree and APIs, I can use the DOM and standard web tools and code.
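The first step of that pipeline, splitting off the frontmatter, can be sketched in a few lines of plain JavaScript. This is a minimal illustration that only understands flat `key: value` lines between `---` fences, not full YAML; `splitFrontmatter` is a hypothetical helper name. The CommonMark conversion and the JSDOM manipulation would be handled by libraries and are left out.

```javascript
// Split a Markdown file into frontmatter data and the remaining body.
// Assumption: frontmatter is a block of flat "key: value" lines between
// two "---" lines at the very top of the file (not full YAML).
function splitFrontmatter(source) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(source);
  if (!match) return { data: {}, body: source };

  const data = {};
  for (const line of match[1].split(/\r?\n/)) {
    const separator = line.indexOf(":");
    if (separator === -1) continue; // skip lines without a key
    data[line.slice(0, separator).trim()] = line.slice(separator + 1).trim();
  }
  return { data, body: source.slice(match[0].length) };
}

const post = splitFrontmatter(
  "---\ntitle: Resume\ndate: 2022-09-15\n---\n# Hello\n"
);
console.log(post.data, post.body);
```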

Markdown parsers will let you pass through HTML tags. You can define any tag you want. There is no difference between:

    {% callout type="check" %}
    {% /callout %}

and

    <callout type="check">
    </callout>

And anyone who went through the XSL trend of the early 2000s should know the long-term pain caused by putting logic in your documents.

    <xsl:if test="price = 10">
    </xsl:if>

Recreating this with {% if equals(1, 2) %} is just ignoring decades of lessons already learned.

Honestly, Markdown needs to be killed with prejudice. So much wasted time and effort getting it to do what people need it to do.


> Honestly, Markdown needs to be killed with prejudice. So much wasted time and effort getting it to do what people need it to do.

It seems to work pretty well as a human-editable format for rich text. Is there a different format you'd like to see take its place for that niche?


Asciidoc (see my other comment: https://news.ycombinator.com/item?id=32860262)


I looked and I didn't find anything.

What I would like to see is a new self-contained Web Document standard (none of the various implementations out there qualify) that mimics the core reason Markdown and other plain-text systems like AsciiDoc or LaTeX exist: To separate the writing from the presentation, but with some basic formatting as needed for most documents.

There are various self-contained document formats out there: ePub and mobi files use HTML inside, as does Microsoft's CHM. And there's a hundred zipped XML file formats out there - docx, odt, etc. But they're either write-only, proprietary or are too complicated for this purpose.

What I would want is a simple .wdoc standard file, which is either a plain-text or zip file containing a very strict subset of HTML and CSS which basically mimics the output of Markdown. (It could be called MarkUp, actually). The subset would be limited to just semantic tags and reasonable formatting, to guarantee editable HTML. Nothing dynamic or crazy. Just pure WYSIWYG.

If it was a W3C standard, there could even be a new HTML tag <doc></doc> which wraps raw .wdoc markup in a sandbox, guaranteeing that nothing inside those wrapper tags will display anything but the allowed styles and tags. A zipped .wdoc (with images) could be included with a src attribute: <doc src="...">, with an "editable" attribute that defaults to false, but could be flipped to allow editing. Maybe an "allowed" attribute to limit formatting even further.

Like the video and audio tags, basic editor features could be supported natively, but would also allow custom editor skins like CKEditor, TinyMCE, Trix etc. But again, with standard output. This would be great for online forums like HN or reddit. In standalone apps, like Apple's TextEdit or Microsoft's WordPad, the output would be a cross-platform rich text document that is readable and writable by any browser or standard .wdoc editor.

The idea is to Keep It Simple Stupid, but also provide basic cross-platform WYSIWYG editing where the simple, clean formatting is always displayed exactly like it looks when editing. I use Typora, which is a great little rich text editor that uses WebKit for the interface, and then exports Markdown, which I then process into a web page. It's insane. Let's cut out the useless middle step.

Browser engines have progressed so far since Markdown was created. It's all a matter of standardization at this point. Keep the spec simple and focused on just creating simple documents. If someone wants to use the output as a full-on web page, then it's just a matter of not using the <doc> wrapper and adding full-strength CSS, JavaScript, etc. The CommonMark spec could even be updated so that .wdoc is the standard output of a processed .md text file.

The web has tilted too far towards the dynamic app end of the spectrum, and lost its roots as a document format. I think something like this would be a great way to get back to that.


I like it.

Personally I've always preferred HTML tags to markdown, primarily because I can usually understand simple HTML by looking at it, while the markdown needs a cheatsheet if I haven't been working on it recently.

I think this could probably be done as a web component if the browsers didn't implement it directly.

Plus most GUI tools produce HTML and that could probably be modified to output a more restricted format.


Interesting to see another post on HN frontpage about Stripe's "incompetent" review team, and then see this one.


The one thing from the Stripe docs I like is the embedded API key in all the code snippets, but this seems difficult to accomplish with Markdoc, especially if you're rendering the whole site to static HTML.

For example:

  ```bash
  curl https://api.example.com/stuff \
    -u {% apiKey /%}: \
    -d "foo=bar"
  ```
If the user is logged in, I want the {% apiKey /%} to be replaced with something. Do I use a transform, node, or tag to accomplish it? It seems easier to write:

  ```bash
  curl https://api.example.com/stuff \
    -u MY_API_KEY: \
    -d "foo=bar"
  ```
And then have custom JavaScript that runs on the page that does a search-and-replace for the string MY_API_KEY with the user's API key (obtained async with API, etc.). The documentation on Markdoc needs work.
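The search-and-replace approach described here could look roughly like the sketch below. It is plain JavaScript with the DOM part left as a comment so that it runs anywhere; `injectApiKey` is a hypothetical helper, and how the key is fetched is left out entirely.

```javascript
// Placeholder string that appears verbatim in the rendered code snippets.
const PLACEHOLDER = "MY_API_KEY";

// Replace every occurrence of the placeholder with the user's key.
function injectApiKey(snippetText, apiKey) {
  // Leave the placeholder untouched when no key is available (logged out).
  if (!apiKey) return snippetText;
  return snippetText.split(PLACEHOLDER).join(apiKey);
}

// In a browser this could be applied to each rendered code block, e.g.:
//   for (const el of document.querySelectorAll("pre code")) {
//     el.textContent = injectApiKey(el.textContent, key);
//   }

const snippet = [
  "curl https://api.example.com/stuff \\",
  "  -u MY_API_KEY: \\",
  '  -d "foo=bar"',
].join("\n");

const personalized = injectApiKey(snippet, "test_key_123");
console.log(personalized);
```

One caveat with this approach: assigning `textContent` discards any syntax-highlighting markup inside the code block, so a real page would more likely replace the placeholder in individual text nodes.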


Your first example is essentially how it works on the Stripe docs platform. However, it requires a couple of custom components, including a CodeBlock component that renders code within code fences, and a component which understands how to display the user's API key. These are both very Stripe-specific, so it doesn't make sense to open source them.

If you render the site to static HTML, though, you will have to do something more like a last-minute search-and-replace. Instead of rendering raw HTML, you could render the entire site to either the Markdoc AST or Markdoc renderable tree, which are both serializable. That's the approach that I use on my personal site [0], which implements a system similar to ContentLayer. [1]

[0] https://github.com/nkohari/nate.io/blob/master/build/Content...

[1] https://www.contentlayer.dev/
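A rough sketch of the "serialize the tree, resolve at the last minute" idea, in plain JavaScript. The node shape (`name`, `attributes`, `children`, with strings as text) and the `ApiKey` node name are assumptions made for illustration; the actual components mentioned above are not public.

```javascript
// Hypothetical tree produced at build time and stored as JSON.
// Strings are text; objects are elements or custom components.
const serialized = JSON.stringify({
  name: "pre",
  attributes: {},
  children: [
    "curl https://api.example.com/stuff -u ",
    { name: "ApiKey", attributes: {}, children: [] },
    ":",
  ],
});

// At request time, swap each ApiKey node for the user's key, or for a
// generic placeholder when the user is not logged in.
function resolveApiKey(node, apiKey) {
  if (typeof node === "string") return node;
  if (node.name === "ApiKey") return apiKey ?? "MY_API_KEY";
  return {
    ...node,
    children: node.children.map((child) => resolveApiKey(child, apiKey)),
  };
}

const resolved = resolveApiKey(JSON.parse(serialized), "test_key_123");
console.log(resolved.children.join(""));
```

Because the replacement happens on structured nodes rather than on rendered HTML, there is no risk of accidentally rewriting a matching string elsewhere on the page.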


That's something the Google Cloud docs seem to do really well; their command execution examples have editable components that are applied to all similar examples on the page: https://cloud.google.com/compute/docs/instances/custom-hostn...

I wondered how this works and if it's FOSS.


We’re using markdown + liquid with custom liquid tags.

This looks so similar to that.


Hey, I'm the creator of Markdoc and the author of that blog post.

The key advantage of using Markdoc instead of Markdown with Liquid or another string-based templating system that preprocesses the content is that the tags and other custom syntax are a first-class part of the Markdoc format. The document parses to an AST, and the individual tags can programmatically manipulate the content and document node hierarchy instead of just manipulating or outputting strings of text that are passed on to a Markdown processor.
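A toy illustration of that difference, in plain JavaScript. The node shapes below are invented for the example and are not Markdoc's real AST: the point is only that a tag working on a tree receives its children as structured nodes, so it can inspect and rearrange them without re-parsing text.

```javascript
// Hypothetical parsed {% section %} tag: its children are already
// structured nodes, not a raw string of Markdown.
const section = {
  type: "tag",
  name: "section",
  children: [
    { type: "heading", level: 2, text: "Install" },
    { type: "paragraph", text: "Run the installer." },
    { type: "heading", level: 2, text: "Configure" },
  ],
};

// A tree-aware transform can read the headings inside the tag and
// prepend a table of contents node built from them.
function addTableOfContents(tag) {
  const entries = tag.children
    .filter((node) => node.type === "heading")
    .map((node) => node.text);
  const toc = { type: "list", items: entries };
  return { ...tag, children: [toc, ...tag.children] };
}

const withToc = addTableOfContents(section);
console.log(withToc.children[0]);
```

A string-based preprocessor could produce a similar result, but it would first have to parse the Markdown inside the tag itself to find the headings.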


You may have misunderstood the parent post? Custom Liquid tags use code that can do anything you want: programmatically manipulate content, read databases, invoke APIs, autogenerate examples, run tests, use git, etc.


No, Segphault literally addressed why Markdown with Liquid is not the same as Markdoc on a processing level. “Custom tags processed by [thing that is not Markdown]” != “Markdown superset that makes custom tags first-class entities”. Either your template processor is run before the Markdown processor and outputs Markdown, or it’s run after the Markdown processor and outputs HTML. Neither one is doing what Markdoc is doing. Markdoc parses to an AST; “Markdown + [other thing]” does not. The difference may be immaterial in some use cases, including yours, but it’s still a difference.


Yes, you're right, there are differences. What I'm seeing is that Markdoc custom tags and Liquid custom tags can both do any programmatic effects, ASTs, etc.

Perhaps an example may be useful? Or do you have an example where Markdoc custom tags are especially different/better than Liquid custom tags?

Markdoc custom tag syntax:

    {% bold %}
    Lorem ipsum
    {% /bold %}

Liquid custom tag syntax:

    {% bold %}
    Lorem ipsum
    {% endbold %}

Markdoc custom tag code is like this in JavaScript:

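    import * as React from 'react';
    import Markdoc from '@markdoc/markdoc';

    // Tag schema: tells Markdoc to render {% bold %} with the 'Bold' component.
    export const bold = {
       render: 'Bold',
       // …
    };

    // React component registered under the name 'Bold'.
    function Bold({ children }) {
       return <b>{children}</b>;
    }

    // `source` is the Markdoc document text (assumed to be loaded elsewhere).
    const ast = Markdoc.parse(source);
    const content = Markdoc.transform(ast, { tags: { bold } });

    const rendered = Markdoc.renderers.react(content, React, {
       components: { Bold },
    });

Liquid custom tag code is like this in Ruby: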

    require "liquid"

    class Bold < Liquid::Block

      def initialize(_tag_name, _content, parse_context)
        super
        # Do whatever you want here,
        # such as parsing to your favorite AST,
        # or calling a DB, or RPC, or API, etc.
      end

      def render(context)
        # Do whatever you want here,
        # such as processing your favorite AST,
        # or using the render context vars, etc.
        "<b>" + super.strip + "</b>"
      end

    end

    Liquid::Template.register_tag("bold", Bold)
    print Liquid::Template.parse(File.read("my.txt")).render



