I honestly can't believe all the praise for HTML and web on HN in the face of this awesome critique. I hugely appreciate the love for actual files.
>• PDFs are decentralised. You may have obtained this PDF from a website, or maybe not! Self-contained static files are liberating! They stand alone and are not dependent on being hosted by any particular web server or under any particular domain. You can publish to one host or to a thousand hosts or to none, and it maintains its identity through content-addressing, not through the blessing of a distributor.
This seems to have gotten lost in the offense everyone has taken over the choice to not use 'simple HTML', despite the document's clear reasoning that to do even that would embed the content deep in the 'urban web'. All of these simple-complex propositions about making some subset language or automating document flows are missing the point entirely.
> You can publish to one host or to a thousand hosts or to none, and it maintains its identity through content-addressing, not through the blessing of a distributor.
It kind of seems like you're describing IPFS, except with worse content-addressing guarantees. The vast majority of your users will never check whether a PDF's content actually matches its content address.
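To make the point concrete, here's a minimal sketch of what content-address verification amounts to, assuming (as IPFS roughly does) that the address is just a hash of the bytes. The function names are illustrative, not from any real library:

```python
import hashlib

def content_address(data: bytes) -> str:
    """Derive a content address directly from the bytes (SHA-256 here)."""
    return hashlib.sha256(data).hexdigest()

def verify(data: bytes, claimed_address: str) -> bool:
    """Check that a fetched file actually matches the address it was fetched under."""
    return content_address(data) == claimed_address

pdf_bytes = b"%PDF-1.7 ... (file contents) ..."
addr = content_address(pdf_bytes)
verify(pdf_bytes, addr)        # True: bytes unchanged
verify(pdf_bytes + b"x", addr) # False: any mutation breaks the address
```

The check is trivial to perform, which is exactly the point: the identity guarantee only exists if someone actually runs it, and browsers downloading PDFs never do.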
> All of these simple-complex propositions about making some subset language or automating document flows are missing the point entirely.
Are they? It's really not that hard to build a self-contained HTML file, and, to re-emphasize, signed PDFs and signed HTML files are about equally accessible to most users. Web browsers don't really handle either; if you want those guarantees, you need a protocol/technology with better support right from the start.
Also to be clear, despite the author's argument that PDFs can be self-contained, no browser guarantees that, and there's no way for me to tell whether a PDF is self-contained when I click on it in Firefox, short of downloading it and checking it myself offline or in a viewer that guarantees it won't make network requests.
Nothing online that I'm aware of forces authors to use PDF/A, so when I download a PDF, I don't know what I'm getting. It's not actually the magical, re-hostable world that the author claims.
I'm not sure that people are missing the author's point so much as they're saying the author is making claims about the portability of PDFs that aren't necessarily accurate. Yes, it would be good to have better self-containment guarantees for some web content, but I'm not sure PDFs actually supply any of those guarantees.