> When you find a page [...] and you want to make sure it's available to you later, what do you do?
Instead of doing a bad and lossy job of archiving the page myself, I notify† our friendly neighbourhood archivists at the Internet Archive of the page; and they then do the best, most lossless job of preserving the page that they're able, given their cumulative experience.
As a side-benefit, they also then take care of keeping the archive they've made around and available online in perpetuity, with no additional marginal effort on my part. The same can't be said for something in my own "private collection."
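For what it's worth, the "notify" step can be scripted. This is a minimal sketch, assuming the Wayback Machine's Save Page Now endpoint (`https://web.archive.org/save/` followed by the target URL). The function names and the User-Agent string are made up for illustration, and the service may rate-limit or refuse a given request.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumed endpoint: Save Page Now takes the target URL appended to this prefix.
SAVE_ENDPOINT = "https://web.archive.org/save/"


def save_page_now_url(page_url: str) -> str:
    """Build the capture-request URL for page_url (hypothetical helper)."""
    # Keep URL punctuation intact, but escape spaces and other unsafe characters.
    return SAVE_ENDPOINT + quote(page_url, safe=":/?&=%")


def request_capture(page_url: str, timeout: float = 60.0) -> int:
    """Ask the archive to capture page_url and return the HTTP status code."""
    req = Request(
        save_page_now_url(page_url),
        headers={"User-Agent": "archive-notify-sketch/0.1"},  # illustrative value
    )
    with urlopen(req, timeout=timeout) as resp:
        return resp.status
```

Calling `request_capture("https://example.com/")` makes a real network request, so it is worth trying `save_page_now_url` on its own first to check the URL looks right.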
This may not be well-known, but archive.org can and does remove pages / sites from the archive. Authors can request this, and so can site owners (separate from the authors). There may be others who can request it too.
Just an FYI. If there are critical sites you want copies of, I'd recommend making your own copy. I've lost access to important pages / sites twice before taking this to heart.
There is value in having a personally curated, offline collection of documents. You can search, annotate or otherwise manipulate it to your heart's content, all without having to be connected.
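A rough sketch of what such a personal collection could look like, assuming a plain directory of timestamped HTML snapshots. The layout and the names `store_copy`, `search` and `fetch_and_store` are hypothetical, and this only keeps the raw HTML (no images, CSS or scripts), so it is exactly the kind of lossy copy mentioned upthread.

```python
import hashlib
import time
from pathlib import Path
from urllib.request import urlopen


def store_copy(archive_dir, url, body, when=None):
    """Write one snapshot of body (bytes) for url and return its path."""
    # One folder per URL, named by a short hash so any URL is a safe filename.
    folder = Path(archive_dir) / hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    folder.mkdir(parents=True, exist_ok=True)
    (folder / "url.txt").write_text(url + "\n", encoding="utf-8")
    # UTC timestamp in the filename; when=None means "now".
    stamp = time.strftime("%Y%m%dT%H%M%SZ", time.gmtime(when))
    path = folder / f"{stamp}.html"
    path.write_bytes(body)
    return path


def search(archive_dir, term):
    """Return sorted snapshot paths whose text contains term (case-insensitive)."""
    needle = term.lower()
    hits = []
    for path in Path(archive_dir).glob("*/*.html"):
        text = path.read_bytes().decode("utf-8", errors="replace")
        if needle in text.lower():
            hits.append(path)
    return sorted(hits)


def fetch_and_store(archive_dir, url, timeout=30.0):
    """Download url and store the response body as a new snapshot."""
    with urlopen(url, timeout=timeout) as resp:
        return store_copy(archive_dir, url, resp.read())
```

Everything here works offline once the snapshots exist, which is the point: `search("~/archive", "robots.txt")` is just a scan over local files.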
Of course the Internet Archive serves other purposes for which it is (currently) irreplaceable.
Hopefully it really is around for a very long time, but the world is unpredictable and things change. It's great to enhance the Internet Archive, but you can bet I'm keeping my local copy too. Just in case.
That's suboptimal as well. The site could come out with a new robots.txt file which is just
<code>User-agent: *
Disallow: /</code>
and everything already indexed by the Internet Archive is now inaccessible to you.
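To see what that two-line file means to a crawler that honours it, here is a small sketch using Python's standard `urllib.robotparser`. The helper name `is_blocked` and the agent string `ia_archiver` are assumptions for illustration; because the rule targets `*`, any agent name gives the same answer.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the comment above, one directive per line.
BLOCK_ALL = ["User-agent: *", "Disallow: /"]


def is_blocked(robots_lines, url, agent="ia_archiver"):
    """Return True if robots_lines forbid agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse in-memory lines, no network access
    return not parser.can_fetch(agent, url)


print(is_blocked(BLOCK_ALL, "https://example.com/some/old/page"))
print(is_blocked([], "https://example.com/some/old/page"))
```

The first call reports the page as blocked and the second (an empty robots.txt) as allowed. Whether an archive applies such a rule to captures it already holds is a policy decision on its side, not something the file format itself dictates.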
I don't think I've ever had such a thing that only appeared as a web page, without being emailed to me. To me, the email is the primary-source document in that arrangement.
† http://blog.archive.org/2017/01/25/see-something-save-someth...