The internet is self-cleaning. Material that no one cares to maintain rots away.
It’s more like a city than a library. Houses need to be maintained and used or they eventually become unusable and will be torn down and replaced or simply abandoned.
I think one early vision of the web was that it would become a library of knowledge. That idea was misguided, I think. It’s a chaotic city and it probably always will be. The article makes some good points about how to increase longevity, and while they can certainly help, they probably won’t make a huge difference in most cases.
Underrated insight. I believe it parallels the natural lifecycle of materials inside libraries: content gets retired all the time, while still largely maintaining the appearance of "everything at your fingertips".
I assure you, books and libraries need maintenance too. Old books have to be reprinted if you want to keep using them as actual books (i.e. to read) instead of museum exhibits.
I also switched my whole site to pure HTML and CSS, but it still depends on Bootstrap, which is maybe something I should remove, keeping only the classes I actually use.
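For what it's worth, a minimal sketch of what dropping the Bootstrap dependency could look like: inline only the handful of rules the page actually relies on, so there's no external stylesheet left to rot. The class names and values below are illustrative stand-ins, not real Bootstrap output.

    <!-- hypothetical replacement for the Bootstrap <link>: only the rules actually used -->
    <style>
      .container  { max-width: 960px; margin: 0 auto; padding: 0 15px; }
      .btn        { display: inline-block; padding: 6px 12px; border: 1px solid #ccc; }
      .text-muted { color: #6c757d; }
    </style>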
IMO this feels like a somewhat random brainstorm about how to keep an older vision of the web. For example, why focus on the behind-the-scenes dev process? What % of indie sites die because of their dev build process? Very hard to know, and Jeff doesn't offer any particular insight into this problem. Without knowing that, focusing on things like whether to use git (???) and whether you compress your site... seems confusing.
> And how should you version control that file? Should you use git? Shove them in an 'old/' folder? Well I like the simple approach of naming old files with the date they are retired, like index.20191213.html.
Anyways, this appears to be a pitch for a project:
It's a fair critique. But I think the maintenance and dependencies for a website are the main reason a lot of them die off (basically, the owner giving up on it because the cost to keep it up is higher than the value it provides the owner). Intentionally killing a website seems rare to me otherwise.
> whether you compress your site
It's actually a point against minifying the HTML/CSS content, which makes it unreadable and adds an extra step to your toolchain (another point of failure).
> Anyways, this appears to be a pitch for a project
I did add that link recently as it's related, but I wrote this Designed to Last article 2 years before irchiver. See the previous discussion at https://news.ycombinator.com/item?id=21840140
irchiver author here. I investigated for a long time whether a browser extension could automatically screenshot a page. It turns out it can't, because of the limitations of browser extensions themselves, and from the way WebExtensions are going, it never will be possible.
I've only found a way to do this on Windows right now (since Windows 8.1), due to an undocumented flag that lets you capture images from the browser even with hardware acceleration on (which is the default). I do trust Microsoft to maintain backwards compatibility indefinitely, even for undocumented features.
I agree with the general sentiment of the page, with one exception: It’s better to use WOFF webfonts for the fonts on the page instead of using what were “Web safe” fonts back in 2005 or so.
With the rise of Android phones which, by and large, do not have licenses for the old 1990s “Microsoft Core Fonts” (Verdana/Georgia/Arial/etc.), doing something like “font-family: Georgia, serif” no longer guarantees a consistent look from platform to platform. What I use is a heavily optimized and subsetted web font stack placed in a single 114KB CSS file separate from the web page; I inline the fonts in that file instead of linking to individual font files, to reduce the number of requests needed to render my page. I use WOFF for maximum browser compatibility, although this is becoming a non-issue with Microsoft seriously sunsetting IE11.
This way, my webpage will look the same in any web browser with WOFF support, regardless of what fonts a given phone or OS uses for “Georgia” or whatnot. In addition, with the exceptions of Arial and Georgia, the Microsoft Core fonts are quite dated looking; using Verdana here in 2021 gives the page a “made in the 2000s” look.
For browsers without WOFF support, I also have a fairly long fallback font stack, so that the web page looks reasonable even without a webfont.
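A rough sketch of that setup, in case it helps; “MySerif” and the truncated base64 data are placeholders rather than the actual font:

    <style>
      /* the subsetted WOFF is embedded in the CSS as a base64 data URI,
         so only this one stylesheet has to be fetched */
      @font-face {
        font-family: "MySerif";   /* placeholder name */
        font-style: normal;
        font-weight: 400;
        src: url("data:font/woff;base64,d09GRg...") format("woff"); /* truncated */
      }
      body {
        /* long fallback stack for browsers without WOFF support */
        font-family: "MySerif", Georgia, "Times New Roman", Tinos, "Liberation Serif", serif;
      }
    </style>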
I wish browsers came with some kind of configuration panel for fonts.
All the font control I wish to give websites is the choice between serif and sans-serif. Let web fonts be opt-in rather than opt-out.
Let the browser pick a good one and be done with it; there's no need for every website to transfer an entire font file to my computer. Browsers used to have some kind of control for this, though on mobile you don't really see it anymore.
Designers seem to put a lot of value in fonts but I honestly don't care. Most of the fancy fonts decrease readability and just serve some kind of desire by the designer to have a certain brand. I don't care about your brand because 99% of the time I won't ever come back to your website and even if I do, it won't be because you picked out some bespoke almost-Arial-but-not-really font.
Reading through the discussions about those system font stacks, a couple of observations I have made myself:
* Linux desktop fonts are a mess. Two examples: “Helvetica” on Linux oftentimes points to a Ghostscript font with vertical alignment problems when used in a browser, and the “Roboto” font for Android can have rendering problems on Linux, as the GitHub team discovered.
* Chrome has had significant font rendering issues. GitHub’s designer mentions subtle kerning issues with the San Francisco font in Chrome on MacOS; I’ve seen some pretty serious font rendering issues with Chrome in Windows, where fonts end up being about 10-20% lighter.
* Very little consideration has been given to how the system font stacks look on Android phones, which account for about 40% of the web browsing public.
By hosting the fonts on the web page [1], we eliminate these problems. The Linux and Android font rendering mess is resolved by just providing our own fonts [2]; I have dealt with the Chrome rendering mess by very carefully testing the fonts in Windows on a low-resolution 75dpi display until they looked as good as possible on this “lowest common denominator” display [3], while still looking good in other browsers and on higher resolution displays. macOS, happily, always renders the fonts well, even on older MacBook Airs and external 75dpi displays.
Another thing: All of the fonts are OFL open source fonts, and, as a matter of principle, I prefer that no proprietary fonts are used to display my web page.
[1] Yes, 114KB is huge for late-1990s web designers who had to push everything over a dialup link, but it’s a drop in the bucket these days, even for cell phones with 3GB/month on low-end plans. I went to a lot of effort with Zopfli and very aggressive subsetting to make the font CSS file as small as possible.
[2] There’s an issue with the % glyph on Linux, since Linux insists on using its own hinting, resulting in one of its bowls being taller than the other. Besides that, the font looks great on Linux.
[3] I spent days tweaking ttfautohint settings for the font with Chrome/Windows as my reference, and made the font a little thicker by using Charis SIL for body text instead of the smaller but thinner Bitstream Charter.
Another issue: if we use system font stacks, our web page will look different on macOS vs. Windows vs. Linux vs. Android vs. ChromeOS, etc. This makes testing a lot more time-consuming. With webfonts, as long as we use good fonts which work with the lowest common denominator (75dpi Windows displays using Chrome, in my experience), we don’t need to add a new testing platform each time a new OS starts to gain traction.
I wish we could, with Android everywhere, have Linus Torvalds, Sergey Brin, Bill Gates, and Steve Jobs get together and say “OK, we all need to agree to have both a serif and a sans font on every operating system out there, so that web pages don’t need to download fonts to their users any more”, and then have everyone present say something like “OK, we’ll make Noto Sans and Noto Serif always be available”, or barring that, have Sergey Brin say “OK, we’ll make sure Android phones always have Arimo (sans serif with the same metrics as Arial) and Tinos (serif with the same metrics as Times New Roman) available to use”.
If they could do this, then font stacks could simply be “Font-family: "Noto Sans", sans-serif;” or “Font-family: "Noto Serif", serif;”, or, the not as good but still usable “Font-family: Arial, Arimo, "Liberation Sans", sans-serif;” and “Font-family: Tinos, "Liberation Serif", "Times New Roman", serif;” font stacks. This would solve the pesky “download the fonts” problem.
Another solution would be for Bill Gates to say “I will release the 2002 Core Fonts version of the Georgia typeface under the OFL open source license if Brin will install it on all Android devices moving forward”, or better, “I will release Cambria and Calibri under the OFL open source license if Steve Jobs and Brin agree to install these fonts on all Apple computers, iPhones, and Android phones moving forward”.
Of course, in the real world, it would logistically be easier to get Steve Jobs to show up alive and well at this meeting than to get all attendees to agree on a font which all operating systems will have.
Another place where it might actually be possible to pull this off is to have all browser makers, especially Android browsers, include Arimo and Tinos with the browser moving forward. Actually, it looks like Arimo/Tinos are already on most Android phones: https://help.xara.com/article/440-web-safe-fonts-on-android-...
Edit: Based on testing, Arimo is on my somewhat older Android phone, with the same metrics as Arial on my Windows 10 desktop machine, but, annoyingly enough, while Tinos also looks to be there, it doesn’t have the same metrics as Times New Roman on my Windows 10 desktop machine. So, yeah, “Font-family: Arial, Arimo, "Liberation Sans", sans-serif;” is pretty much the only font stack we can use and be guaranteed to have the same metrics across mostly everything.
I can squeeze in just the regular and italic text of a serif font in 53,512 bytes, so we can get a consistent looking serif across platforms at the cost of 52.26KiB, and then just use Arial/Arimo (the only across-the-board websafe font stack where we can retain font metrics) for sans text.
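In CSS terms that hybrid might look roughly like this (“MySerif” again stands in for whatever subsetted serif is embedded in the stylesheet):

    <style>
      /* serif text comes from the ~52 KiB embedded webfont; sans text falls back
         to metric-compatible system fonts (Arial on Windows/macOS, Arimo on Android) */
      body        { font-family: "MySerif", Georgia, serif; }
      h1, h2, nav { font-family: Arial, Arimo, "Liberation Sans", sans-serif; }
    </style>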
Why is looking the same what you want? Future fonts will probably suit future changes in display technology (e.g. the shift back to serifs as screen density has increased) and/or future shifts in written language (compare blackletter or Fraktur style to today's scripts). If you want your website to be readable in the future, letting future platforms display it with their default font seems far more dependable than trying to control how it looks pixel-by-pixel.
Serif fonts have looked more or less the same since 1465. [1] Blackletter fonts didn’t last very long, because they were not very legible, and were quickly replaced.
Sans Serif (Grotesque) for formal printed material is newer, probably around 200 years old, but is also widely established. Regardless, on my personal website, I use serif lettering for most text, because while it’s slightly less readable at 75dpi, retina displays are common enough (e.g. pretty much any phone in use today has retina level resolution) that serif text makes more sense. [2] Also, the only open source font the great Matthew Carter (Verdana, Georgia) has made is a serif: Bitstream Charter and its derivative Charis SIL.
The English language is much more likely to change to the point where today’s English can only be read by linguists and researchers before typefaces undergo any significant change.
I have been on this whole journey of plain text, longevity, legacy, and similar ideas. I recently moved my 20+ year old blog[1] to plain text (Markdown) and will continue to make it simpler, so that it keeps lasting as long as it can be served from the simplest hosting possible.
HTML in its raw form is very hard to read. I wanted something simple that still allows some formatting[2] -- Markdown does that, as long as I avoid spicing it up with extra formatting.
Right now, it is on GitHub, and if needed, my 13-year-old can edit it directly there and it will update my website.
I hope that in a few years I won't have to rely on GitHub either, but instead have it decentralized, somewhere/everywhere on the Internet, so that it lives on as long as my domain is renewed.
I agree that the original Markdown git repo is what's important. I treat the whole web stack as an implementation detail for rendering the Markdown I wrote, so I've set my blog up to serve just a minimal header and footer that render the Markdown as a web page (using strapdown.js), so it's still fairly readable in raw HTML form and you can read it with telnet in a pinch. IMO that's a lot more valuable than worrying about progressive enhancement or no-script functionality when your page still requires a hugely complex CSS/HTML engine to render it.
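If I remember strapdown's canonical usage correctly, the whole page ends up being roughly this (the theme name and script URL here may not match what the parent actually uses):

    <!DOCTYPE html>
    <html>
    <title>My blog post</title>
    <!-- the raw Markdown sits inside an <xmp> tag, so it stays readable over telnet -->
    <xmp theme="united" style="display:none;">
    # Post title

    Plain Markdown goes here, readable as-is without any rendering.
    </xmp>
    <!-- strapdown.js converts the <xmp> contents to styled HTML in the browser -->
    <script src="https://strapdownjs.com/v/0.2/strapdown.js"></script>
    </html>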
The obvious alternative advice is if you want it to last, keep working at it, keep maintaining it, keep upgrading it. Don't spend a bunch of time on work that is meant to be abandoned, which to me implies that it's not that meaningful. The OP's message has a kernel of general truth in it, but if I interpret it in the context of CS academic research, the much larger problem is that the vast majority of CS projects are published and immediately abandoned.
> But the solution needs to be multi-pronged. How do we make web content that can last and be maintained for at least 10 years?
I have a solution to add to this list to address hosting. It's boring, dependable, and sadly no one will likely ever use it:
Encourage users to download your web pages.
The skeuomorphic label of "bookmark" really doesn't fulfill its metaphor. Books don't change when you insert bookmarks; you can keep them around for decades, fingerprints, underlines and all, revisit them later in life, and no external dependencies beyond keeping a roof over them will affect those properties. The web, hosted on remote servers, is a moving target by nature; "bookmarks" are merely links on a clock waiting to be broken. It's inevitable.
Making lean, downloadable front-end content is not hard (for the vast majority of content), but it is unpopular; convincing authors to do it is hard. Harder still is convincing a society spoiled by the convenience of search engines to embrace such an antiquated idea: downloading content for the value of some guaranteed, localised permanence. Perhaps the best chance we have, outside of more radical "IPFS" ideas, to extend the existing web might be to simply build this into the browser, probably a plug-in at first - something that changes the bookmark into a bookcase.
Browsers can already sort of "save" a web page; they just need to do it in a scalable way that accommodates the fact that most web pages are not built in a self-contained manner, e.g. back it with a content-addressed file system or database. At that point it's feasible to turn this into a simple button without taking up 20MB for every page.
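As a rough illustration of what a "downloadable" page could look like, everything can travel in one file, so a plain "save as" keeps it intact (names and data URIs below are placeholders):

    <!DOCTYPE html>
    <html>
    <head>
      <meta charset="utf-8">
      <title>An article worth keeping</title>
      <!-- styles are inline, so there is no external stylesheet to break later -->
      <style> body { max-width: 40em; margin: 2em auto; font-family: serif; } </style>
    </head>
    <body>
      <h1>An article worth keeping</h1>
      <p>All content, styling, and images travel with the single saved file.</p>
      <!-- images embedded as data URIs survive a local copy with no extra requests -->
      <img alt="diagram of the idea" src="data:image/png;base64,iVBORw0KGgo...">
    </body>
    </html>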
I like the sound of "Don't minimize that HTML". It'd be nice if the HTML of webpages was more transparent.
> Well, you don't save much because your web pages should be gzipped before being sent over the network, so preemptively shrinking your content probably doesn't do much to save bandwidth if anything at all.
Does anybody have empirical evidence on how much minify + gzip saves over just gzip?
One way to keep something alive for a long time is to make many copies of it in different places. What if every time a website linked to something static (like an article, puzzle, or tutorial), it saved a local copy and made a link to that copy available as well? More inbound links would mean more longevity instead of more fragility.
I share the same mindset, as I really miss traveling, but the only way to explore places to go when I'm abroad is Yelp, Google Maps, and my own Excel sheet, all of which are very slow on international SIM cards or in remote areas. I'm thinking about building my own Yelp-like site which is just static content on a CDN; images would by default show the "alt" text with an option to download the image.
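A very plain sketch of the "alt text by default, download on demand" idea (the file name and size are made up):

    <!-- show a text description by default; the photo is only fetched if the
         visitor explicitly follows the link, so nothing loads over a slow SIM -->
    <p>
      Cafe near the harbour, outdoor seating, cash only.
      <a href="photos/cafe-harbour.jpg" download>Download photo (240 KB)</a>
    </p>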
They use Bitcoin to save hashes of the webpages in case they need to prove that they saved a page on a particular date (though I don't see how they could prove that the page they saved is the page that was actually online). That's a pretty sensible use of what are essentially NFTs.
Well, I hope Jeff didn't put too much time into finding the color palette. I don't want to be negative, but this background color just makes me want to run away/close the page.
Maybe these links which I collected over the years can be helpful:
There's no fundamental problem with the color chosen, so those last two probably won't be of much use. Anyway, I have no problems with the background. Curious that you had that visceral a reaction.
The orange color seems quite harmless, and the text is still black on white. Plus you can get the orange borders to go away by resizing the window. So I really don't see the problem.