Hacker News

With respect, I fail to see how a public website is a privacy matter.



Information on a public website is public until it is taken down or the information is changed. The Internet Archive removes an individual's control over how long the information remains public. This is privacy. We might be caught naked, and we can't unsee what has been seen, but it is a basic human instinct to draw the curtains and contain further damage. Perfectly innocent individuals suffer because the IA rules are designed around edge cases where public figures try to hide misdeeds.


If you print a magazine, you also don't get to recall all copies if you change your mind about something. Giving individuals this kind of control over others' ability to freely share information is dangerous because it is easily abused to hide information that is in the public interest, and that is not an edge case at all. Making a decision to publish something on the public web is hardly analogous to being caught naked, even if you may come to regret either.

If anything, the IA should be more reluctant to remove information without a court decision.


> The Internet Archive removes an individual's control over when the information remains public.

And that's a good thing in the vast majority of cases. Unless we're talking about sensitive information that was published without the consent of the person in question, all public information should remain public forever.


In my experience, it is a small minority of cases. Most of the content of the IA is not in the public interest, now or in the future. It is crap. It is noise. It is the contents of the Internet at a point in time. Actual information is the wheat among the chaff, which is why you need search engines to find it. We know this from the Usenet archives that are intermittently available: they are almost completely useless, apart from people having a giggle at how the Internet used to be with a quick browse and a search for naughty words. There are a few gems in the mountain of noise, but it is in such dire need of curation that people hardly know it exists, and it is barely justifiable for libraries to keep it alive.


Agreed, bulk collection gets dominated by crap, which individually has little value.

But there are some absolutely essential, priceless diamonds hidden in the crap. And they can't be found or known at the time of collection: only with the future development of other events & knowledge do they become retroactively evident. So you've got to collect & preserve as much as you practically can, or else great things are lost forever.

Further, even the mounds of crap can turn out to be important for understanding the past. Ads that annoyed readers at the time help communicate how people, & businesses, & technology were really operating – not just the self-serving stories people craft later. The most fumbling and awkward early uses of a new medium – hypertext, or RealAudio, or Shockwave Flash, or whatever – reveal enduring lessons about the evolution of technology & culture, including roads-not-taken that could still hold promise.

This shouldn't surprise us. Much of what we know of past civilizations comes from archeologists studying trash dumps that, via dumb luck, were well-preserved.

So if you tell me, "the Wayback Machine is a giant unedited trash heap of the internet", my response is: "Yes! That's the point! You get it!"


Some people discover much too late that there are some things they wish they could take back, often when trying to get a better job or to escape an abuser. Given the ramping up of attacks (legal and otherwise) on queer people, this is going to be a huge issue over the next decade or so.



