
> The Internet Archive removes an individual's control over when the information remains public.

And that's a good thing in the vast majority of cases. Unless we're talking about sensitive information that was published without the consent of the person in question, all public information should remain public forever.




In my experience, it is a small minority of cases. Most of the content of the IA is not in the public interest, now or in the future. It is crap. It is noise. It is the contents of the Internet at a point in time. Actual information is the wheat among the chaff, which is why you need search engines to find it. We know this from the Usenet archives that are intermittently available: almost completely useless, apart from people having a giggle at how the Internet used to be with a quick browse and a search for naughty words. There are a few gems in the mountain of noise, but it is in such dire need of curation that people hardly know it exists, and it is barely justifiable for libraries to keep it alive.


Agreed: bulk collection gets dominated by crap, and each piece individually has little value.

But there are some absolutely essential, priceless diamonds hidden in the crap. And they can't be identified at the time of collection: only with the future development of other events & knowledge do they become retroactively evident. So you've got to collect & preserve as much as you practically can, or else great things are lost forever.

Further, even the mounds/magnitudes of crap can turn out to be important for understanding the past. Ads that annoyed readers at the time help communicate how people, & businesses, & technology were really operating – not just the self-serving stories people craft later. The most-fumbling and awkward early uses of a new medium – hypertext, or RealAudio, or Shockwave Flash, or whatever – reveal enduring lessons about the evolution of technology & culture, including roads-not-taken that could still hold promise.

This shouldn't surprise us. Much of what we know of past civilizations comes from archeologists studying trash dumps that, via dumb luck, were well-preserved.

So if you tell me, "the Wayback Machine is a giant unedited trash heap of the internet", my response is: "Yes! That's the point! You get it!"



