Medium tries to prevent people reading deleted articles on the Wayback Machine?

dorian-graph · on May 1, 2018

Over the last year I've come to trust effectively nothing on the internet. I've had so many Spotify playlists I was following disappear, whole artists, websites, articles, etc.

I'm slowly moving to offline-first versions of all the information I care about. Edit: This change also lends itself to the 'slow web' (or just slow $whatever) movement, which I'm a fan of.

ryanSrich · on May 1, 2018

One of the first things I do with any article that I've enjoyed is download an HTML copy and create a PDF. Too many times I've bookmarked something only to come back a year later and it's gone.

pasbesoin · on May 1, 2018

I used the Firefox extension Scrapbook, until the change to WebExtensions killed it.

I feel changes like this are incrementally making the Web "theirs" and not "ours".

Separately, someone replied on here to me, a few months ago, that archive.org's policy for respecting -- or not -- robots.txt was in the process of changing.

I don't think that putting up a robots.txt policy should be able to retroactively remove from archive what was previously public. All the more so when the domain in question has changed hands.

But I expect this nonsense to continue. So, I only trust local copies.

Unfortunately, for me, killing the Scrapbook extension made them less convenient to collect.

disillusioned1 · on May 2, 2018

Do you know about Web ScrapBook, a WebExtensions successor to ScrapBook X?

https://github.com/danny0838/webscrapbook

pasbesoin · on May 2, 2018

Thank you. I will look at this.

I looked for alternatives, around the time Firefox stable transitioned. I recall that ScrapbookX was going to try to transition. There was language about writing to browser local storage, but that appeared to me to be too constrained for my use, as well as not yet being in place.

Since then, I've taken more cursory looks but not found a suitable extension. I recently learned about Wallabag, but it did not appear to have the same amount of facility; nonetheless, I want to set it up and give it a go.

I've been a little distracted, this winter. So, I've not problem solved like I should.

dorgo · on May 1, 2018

What are the intentions of mozilla? Is it "let's take away control from users", or is it "we don't care about what our users want"? What a fail..

Semirhage · on May 1, 2018

Me too, and I don’t rely on cloud or streaming services. If I really like a song or article or book, I buy it in some form and then download (if necessary through alternative channels) a copy. I bought a bunch of audio books from Audible years ago, and taking the time to convert them to MP3 has made my life much easier.

Streaming, clouds and DRM protection are fine for discovery, but if you love something, archive it. Storage is so cheap after all!

eneveu · on May 2, 2018

I use [Pinboard](https://pinboard.in/) to bookmark interesting pages. It archives a copy of the page for private use, which I can access later if the page is deleted.

ivankolev · on May 1, 2018

I started using Firefox for Android partly because of its excellent Save page as PDF feature.

ssarodia · on May 1, 2018

Just fyi, on Chrome on Android you can Share -> Print -> Save as PDF.

jakeogh · on May 2, 2018

Me too. Practically every video I have seen since ~2012 went through youtube-dl, along with 100x more that I didn't watch. The memory hole is real, I plan to post my kit sometime, there's a few pieces on github (until?) already. I suspect the majority of the material I deliberately archived will be gone on YT, it's somewhere around 20% and it's going to hit 50. A big chunk is historically significant.

There is a serious push to criminalize memory. The GDPR is the latest attack. It's a huge problem for power centers that people can so effectively look up the past themselves independent of what was deleted. Archive.org is a gold canary and they know it. They must partner with other people willing to accept copies, it's too easy and valuable to attack them otherwise.

btw (unrelated to video): https://github.com/bup/bup is awesome.

dorian-graph · on May 2, 2018

Youtube is another big one too! youtube-dl is a godsend.

> There is a serious push to criminalize memory.

I hadn't thought about it that way, but it seems so true. And thanks for sharing bup!

amelius · on May 1, 2018

The problem is, it's getting harder every day to find DRM-free content. For audio, CDs are getting out of fashion, and piracy channels lack the seeders because everybody is on Spotify.

twostoned · on May 1, 2018

Yes. The debate is gone. Young people have a much different view of 'ownership' (I struggled for a better word) than older people. For example, I remember the copyright, file sharing, music piracy arguments and debates from the 90s (Metallica, Napster! Hah) and 00s. But when I talk about this stuff now with people in their early 20s there seems to be less awareness. DRM & 'Stream everything' are the way it is, as if it's some kind of inevitability. The concept of actually owning, or possessing, something (even if it's a byte stream on a physical hard drive in your house) seems to be disappearing. It's interesting to watch.

I think the most interesting part is the lack of discussion.

lenocinor · on May 1, 2018

As someone who used to have a lot of CDs and MP3s and basically got rid of all of them for Spotify, I can cite a number of reasons why I switched:

1. Convenience (I never download or upload anything, and my playlists work and are automatically updated on the devices I care about)

2. Breadth of music (it doesn't have everything I want but it has a surprising amount of breadth in things I'd never care enough to amass deep collections in)

3. Easily accessible playlists from other people (I really appreciate the "This Is <band name>" playlists especially from Spotify)

4. Seeing what my friends are listening to all the time (I get a lot of new music this way)

Yeah, stuff goes away on the service. Yeah, certain less-popular genres are patchy and incompletely represented (and are we ever going to get Tool?). Yeah, the personal library limits are a bummer (although as someone who never uses this feature, I don't care myself). Yeah, the UI is terrible for certain things (classical music is especially bad, and I really hate that single-song repeat gets turned off in so many ways). Yeah, some of their clients are worse than others (why is the PS4 client's sound quality so bad and not changeable?) Yeah, there's no lossless versions of anything (I think).

And yet, for all that, Spotify has transformed my music listening, and I've been listening to a huge array of music for almost 25 years now. I listen to so many more new and interesting artists and songs on Spotify than I ever would have otherwise. I'll never go back, personally.

srean · on May 1, 2018

I have never used Spotify, so a genuine question: would Spotify play the Dark Side of the Moon from start to finish ?

EDIT: Thanks for the answers folks, and yes, playing it out of order, or with interruptions would ruin its sonic beauty.

Doxterpepper · on May 1, 2018

If you have Spotify premium then yes. I've had premium for some time now so I don't know if it's changed but without premium shuffle is always toggled. So without premium you could listen to dark side of the moon, but it'd be out of order.

arnvidr · on May 2, 2018

Shuffle? I've never encountered forced shuffle on my free account. Is this some kind of 'the mobile version is worse' thing?

cjmoran · on May 2, 2018

Yup, for some reason you can select individual songs on the desktop client even using a free account. Not the case on mobile.

8_hours_ago · on May 1, 2018

Yes. I have done this with Spotify on many car trips :)

clarifying edit: This is for the paid version of Spotify. I've never used the free version but I believe that it plays ads in between songs.

chipotle_coyote · on May 1, 2018

In a science fiction novel I wrote, the main character has trouble understanding the concept of data that has a physical location; to her, it's something that's just there. While I don't know if the real world will be that extreme, I don't think it's at all impossible. I think over the long term -- and maybe not even all that long -- it's probably inevitable.

It does require some mind-shifts on all sides, including those of content creators/providers, though. I don't know that I need to "own" any of my media in an everything-available-all-the-time world, but that requires, well, everything to be available all the time. If content availability comes and goes like the tide based on contracts and deals that I'm not a party to, it makes me a lot more skeptical of the implicit everything-everywhere promise.

coldacid · on May 1, 2018

>While I don't know if the real world will be that extreme, I don't think it's at all impossible. I think over the long term -- and maybe not even all that long -- it's probably inevitable.

I think that we're already there. This "cloud" generation seems to think that everything that exists (or at least is worthwhile) just sits on that magical Internet to be streamed to them whenever they want (and pay for it).

solarkraft · on May 2, 2018

> In a science fiction novel I wrote

care to link it?

chipotle_coyote · on May 2, 2018

Sure. The novel is Kismet, which is at the top of this page:

https://coyotetracks.org/for-sale/

And, a direct link to Amazon:

https://www.amazon.com/Kismet-Watts-Martin-ebook/dp/B01MY02O...

Arzh · on May 1, 2018

That's because 99% of the time the streaming service for music is better than trying to build your own library. It also seems to have a lot less attribution than video stream right now so people don't have to pay attention to where they can stream a specific song, they can use just about any app and get what they are looking for.

amelius · on May 1, 2018

> That's because 99% of the time the streaming service for music is better than trying to build your own library.

What good is "better" if songs disappear every now and then?

Arzh · on May 1, 2018

That's the 1% of the time that it's not better. Using a streaming service doesn't stop you from buying albums or songs that you really want to keep around for a long time.

blux · on May 1, 2018

One major drawback to me is the recurring cost. My feeling is that building an offline library that you truly own is much cheaper than using some streaming service with monthly recurring costs that inflate over time.

rbrcurtis · on May 2, 2018

Only if you never listen to new music. CDs cost around $10, which is your Spotify sub cost per month. Imagine only listening to one new album every month.

Arzh · on May 1, 2018

After ten years maybe, I have too much of an eclectic musical taste to get away with that.

Can_Not · on May 1, 2018

The streaming service is superior for discovery, but inferior for long term use. It's not a good library when an artist can remotely disable a song. I can always add new discoveries from Spotify to my personal archive. I have the best of both worlds.

Retric · on May 1, 2018

Depends on your tastes. If you only really care about say 100 ish or fewer CDs' worth of music then you can easily hit that and save money vs a streaming service. So for someone like me they are a complete waste of money.

Remember, 70 years * 10$ / month ~= 8,400$ for music over a lifetime. By comparison you can easily buy say every piece of music by Bob Marley and you're done, no need to ever do so again.

Sure, if you really care about music then a service is great.

blux · on May 1, 2018

Even then, your figure of $8400 is very optimistic. Assuming 1% inflation per year people will pay $20 per month in 70 years.
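
For anyone who wants to check the numbers in this subthread, here is a rough Python sketch. The $10 monthly price, the 70-year span and the 1% annual increase come from the comments above; the function name and the assumption that the price steps up exactly once per year are mine, not anything the commenters specified.

```python
def lifetime_streaming_cost(monthly_price, years, annual_increase=0.0):
    """Sum subscription payments, raising the price once per year (assumed model)."""
    total = 0.0
    price = monthly_price
    for _ in range(years):
        total += price * 12            # twelve payments at this year's price
        price *= 1 + annual_increase   # price for the following year
    return total, price                # price after `years` yearly increases


flat_total, _ = lifetime_streaming_cost(10, 70)
inflated_total, final_price = lifetime_streaming_cost(10, 70, 0.01)
print(f"flat price: ${flat_total:,.0f}")
print(f"1% yearly increase: ${inflated_total:,.0f} total, ${final_price:.2f}/month at the end")
```

Under these assumptions the flat total matches the $8,400 figure and the final monthly price lands at roughly $20, so both comments are doing consistent arithmetic; they differ only in whether the total is expressed in today's dollars.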

bhelkey · on May 2, 2018

I believe that the parent comment is calculating the cost in 2018 dollars.

kaiken1987 · on May 1, 2018

I see it kind of akin to how so few people carry cash on them. Credit cards and streaming are great for the day to day things and make things much easier. It's important to have a backup for the things that you care about. If the power goes out, how are you paying for lunch? If your cloud photo site shuts down or has a data failure you just lost those baby pictures.

dorian-graph · on May 1, 2018

Yup, it's becoming a sad state to be able to own things and not be dependent upon the whims of others.

I went from buying all my music (physical and digital copies), and sometimes had to remove DRM from the digital copies, to stopping that and just using Spotify.

I'm slowly planning to stop paying for Spotify, but it would be very expensive to buy all the music I listen to. I think this will lead to at least 2 things (that used to be true, for me). 1) I'll listen to more local/new music that I can buy from BandCamp, etc. and 2) I'll place higher value on the music I do listen to, because I'm not as worried about glutting myself on a million new bands through Spotify. I'm okay with these 2 things.

amelius · on May 1, 2018

I used to save tracks by recording them through the "analog hole", using the following commandline in Linux:

    parec | sox -t raw -r 44100 -Lb 16 -c 2 -e signed-integer - -t wav raw.wav

It still requires a lot of manual work, such as cutting the tracks, and making sure there are no "skips" (which somehow can happen). Skips could theoretically be removed using a consensus algorithm (using multiple recordings).

I just wish someone would develop a fully automated workflow for converting playlists to audio files.
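
The track-cutting step could be partly automated by looking for long runs of near-silence. Below is a minimal Python sketch of that idea, not a tested workflow: it assumes the recording has already been loaded as a flat sequence of signed 16-bit sample values from a single channel, and the function name, the amplitude threshold of 500 and the 2-second minimum gap are all illustrative guesses.

```python
def split_on_silence(samples, rate, threshold=500, min_gap_seconds=2.0):
    """Return (start, end) sample index pairs for the non-silent stretches.

    A gap counts as a track boundary only if at least `min_gap_seconds` of
    consecutive samples stay at or below `threshold` in absolute value.
    """
    min_gap = int(rate * min_gap_seconds)
    tracks = []
    start = None      # index where the current track began, if any
    quiet_run = 0     # length of the current run of quiet samples
    for i, s in enumerate(samples):
        if abs(s) > threshold:
            if start is None:
                start = i
            quiet_run = 0
        elif start is not None:
            quiet_run += 1
            if quiet_run >= min_gap:
                # Close the track at the first quiet sample of this run.
                tracks.append((start, i - quiet_run + 1))
                start = None
                quiet_run = 0
    if start is not None:
        # Recording ended mid-track; trim any trailing quiet samples.
        tracks.append((start, len(samples) - quiet_run))
    return tracks


# Toy input at 10 samples/second: loud, a 2.5 s gap, loud, short quiet tail.
demo = [1000] * 50 + [0] * 25 + [1000] * 30 + [0] * 5
print(split_on_silence(demo, rate=10))
```

Tracks that run into each other without a quiet gap would not be separated this way, and it does nothing about skips, so it only covers part of the manual work described above.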

hiram112 · on May 1, 2018

I'm not familiar with the 'parec' command. Are you able to filter out other audio streams (e.g. Slack alerts or accidentally opening a page with an auto-play video)? I was under the impression that one of the big advantages of Pulse Audio was its ability to separate multiple streams.

Interestingly enough, I had similar scripts about 15 years ago pre-Napster. The earliest mp3 sharing sites tended to push full-albums instead of breaking things up by track. I had a lot of fun using some of the earlier mp3 tools to break up and tag tracks. I still have a lot of those mp3s on various HDs, and I know it because my splits weren't perfect for certain tracks that don't have 2 second gaps.

amelius · on May 1, 2018

In theory PulseAudio should be able to separate the streams, or at least turn off audio for selected applications. However, this was never really a problem anyway because I usually ran the command at night :)

Yes, audio processing in the early days was fun, though it's easier now because of better tools and especially bigger hard drives and faster CPUs :)

gumami · on May 1, 2018

What do you mean? Google and Amazon both sell DRM free music. (Apple might too, I've just been out of that ecosystem for so long that I don't really remember)

MrMember · on May 1, 2018

I feel like music is one of the few pieces of media that's very easy to buy DRM free. Movies, TV shows, books, and most other things are heavily encumbered after you "buy" them.

JackCh · on May 2, 2018

It's a pretty shitty state of affairs, but for ebooks what I've done is purchase an old model kindle and buy ebooks for it. I then crack that DRM using Calibre (easy to do with the old model kindles, you only need to enter the serial number). This has worked to archive all the books I've purchased so far but there may come a time when Amazon will only deliver ebooks to kindles with stronger DRM.

I don't feel bad about doing this because I'm still paying for the books and I'm not distributing the backups I make. I'm not clear on whether personal backups are a legal exception or not, but I don't really care.

MrMember · on May 2, 2018

I do the same with audiobooks from Audible, though the cracking process is a bit more complicated. I have no moral qualms about ensuring I will always have a copy of something I bought even if I leave the platform I bought it on or that platform goes under.

scruple · on May 1, 2018

I buy almost all of my music through Bandcamp these days. The nice thing there is that I can buy digital, vinyls, and sometimes even CDs or cassette tapes.

fiddlerwoaroof · on May 1, 2018

iTunes is now drm-free except for streaming

paulie_a · on May 1, 2018

I just use Usenet. I tend to purchase media, or use a streaming service but I no longer pay attention to copyright. I have legal and non legal options. Whatever is easiest is all I care about. Drm/copyright has tilted so far I simply don't care anymore.

biggc · on May 1, 2018

Unless something's changed, you can still buy DRM-free high bitrate tracks from iTunes.

Zathman · on May 2, 2018

I don't typically visit Wikipedia, but when I do, I read something and then try to visit the source(s) for that something. More often than not, the source URLs are dead.

jsty · on May 1, 2018

Archive.is doesn't seem to be affected by the redirect strategy.

article: http://archive.is/gPcBW

makomk · on May 1, 2018

Archive.is goes to a lot of work in order to make archiving pages on widely-used sites actually work, from what I can tell.

MarkusAllen · on May 1, 2018

Archive.is is awesome. It just works.

codetrotter · on May 1, 2018

Archive.is is awesome but I’ve seen multiple really big sites archive poorly. Some of them because the version of Chrome that archive.is is using is getting quite old.

They are running Chrome/41.0, which was released in the beginning of 2015.

https://archive.is/3PxoF

> It is very tricky to run, it depends on an exact version of Chrome, which binary also must be patched in order to reduce security (to allow saving content of frames, etc).

https://blog.archive.is/post/45984102073/can-the-archived-pa...

fiatjaf · on May 1, 2018

Who pays for it?

rambojazz · on May 1, 2018

I've always asked the same question myself... I don't think they are a nonprofit like the IA. Their FAQ says "It is privately funded" and wrt ads "I cannot make a promise that it will not". Years ago there was an archiving service that displayed ads, but unfortunately I don't remember which one it was... I vaguely remember it could have been archive.is, but I'm not sure.

dahart · on May 1, 2018

> So it looks like Medium has embedded a method to frustrate the casual user of Wayback Machine from seeing articles that their authors have removed from the original site.

It strikes me as less likely that Medium is doing something intentional to prevent reading deleted articles, and more likely that the author of this post is making assumptions.

Besides, archive.org has a policy of respecting copyright. All you have to do is ask them to not re-publish, and they will. No need to engineer wacky redirects that don’t work anyway.

http://archive.org/about/faqs.php#20

dontchooseanick · on May 1, 2018

You might like that :

root@localhost:~# links -dump 'https://web.archive.org/web/20160826003417/https://medium.co... | nc seashells.io 1337

Results at https://pastebin.com/SMGBscz2

scandox · on May 1, 2018

I used w3m -dump https://web.archive.org/web/20160826003417/https://medium.co...

Produces a fairly nice, readable version also.

dvfjsdhgfv · on May 1, 2018

Actually it's not necessary to use text-mode browsers as the trick used here is JS-related, so it's enough to switch JS off.

Actually most nuisances on the web today are JS-related so I have a button for quickly disabling it. It works like a charm, also for this case.
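
Since the trick relies on JavaScript, any client that never executes scripts still sees the archived text. Here is a rough Python sketch of that idea using only the standard library: it parses HTML and keeps the text found outside `<script>` and `<style>` elements. The class name, helper function and sample markup are made up for illustration, and this does far less layout work than `w3m -dump` or `links -dump`.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped element.
        if self.skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


# Hypothetical page: a script that would redirect, plus the actual content.
sample = (
    "<html><head><script>location.href = '/elsewhere';</script></head>"
    "<body><h1>Archived post</h1><p>Body text survives.</p></body></html>"
)
print(html_to_text(sample))
```

The script is never run, only discarded, which is the same reason switching JS off in a normal browser works here.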

imglorp · on May 1, 2018

Hey, seashells is neat. I can't get a non-http port out of corporate nannywall though.

baby · on May 1, 2018

When I delete an article from my blog, it's because I don't want anyone to be able to read it anymore. Be it shame, inaccuracy, change of mood, etc... I think there is something fundamentally wrong with wanting to have EVERYTHING backed up at all time against the creators' will.

ekianjo · on May 1, 2018

> When I delete an article from my blog

It's not yours anymore once it's public. It's like saying "when I do something crazy in public I want to be able to make everyone forget about it later on". That's not how things are supposed to work.

> I don't want anyone to be able to read it anymore

People can still have local copies so you have absolutely NO CONTROL.

vinceguidry · on May 1, 2018

Publishing used to be a really heavy-weight process, because printing and selling access carried a great deal of expense and ceremony. Trying to keep your book out of the local library to enhance sales revenue was a no-no.

Interesting how the landscape and conversation have shifted. Sadly, it doesn't appear possible to even have a definition of publishing that both maintains the right of the public to access information, and allows for individual privacy.

Options like disabling crawling are insufficient, essentially web servers have to read authors' minds to divine their intent in order to not screw up. Don't crawl certain kinds of content and you might be accused of discrimination, providing services to one group and not others.

Ownership is a weird thing.

a3_nm · on May 1, 2018

> People can still have local copies

Yes, but under current copyright laws they are not allowed to distribute them without your consent.

djsumdog · on May 1, 2018

Depending on the license. If you have a CC-BY-SA or public domain at the bottom, then you're allowing people a legal right to keep a copy.

In practice, the nature of the web allows everyone to keep copies of everything if they want. But if someone republishes it, then you need to use the DMCA process (for American websites) to take that content down if it violates copyright.

Going back to print, what if you print something you don't want out there anymore? Well if people have already bought your book or magazine, they have bought a right to that physical copy. They can even sell the book or magazine to anyone else (granted that the content isn't illegal).

danso · on May 1, 2018

But they can't make copies of their bought copy and distribute those, which is akin to what the Internet Archive does.

ekianjo · on May 1, 2018

That's not the point. The OP mentioned he wanted to make sure nobody can read it anymore. Private copies prevent that scenario.

zmk_ · on May 1, 2018

Except they do in Europe under the 'right to be forgotten'.

lostmsu · on May 1, 2018

That's what they think. Fortunately for the Internet, there are lots of unaffected servers outside it.

StanislavPetrov · on May 1, 2018

And unfortunately for Joy Reid.

ekianjo · on May 1, 2018

> Except they do in Europe under the 'right to be forgotten'.

It's not because they say it exists that it makes it valid in practice.

coldacid · on May 1, 2018

I could say that "black is white" and yet it won't be so. The EU could say it, and afterwards black would still be black and white would still be white.

CPLX · on May 1, 2018

> It's like saying "when I do something crazy in public I want to be able to make everyone forget about it later on". That's not how things are supposed to work.

Says who?

ekianjo · on May 1, 2018

How would you go about erasing actual memories?

pwinnski · on May 1, 2018

In real life, things done when a person is young are often forgotten with time, and the way the brain works is that time lessens the importance of a memory, so even if it's not completely forgotten, the memory is fuzzy and considered inconsequential.

On the internet, everything is as fresh as this morning, forever. No longer do we have "do you remember when so-and-so used to say that stupid stuff, boy have they changed!" Now it's "look, here's a link to so-and-so saying stupid stuff, now we know what they really think!"

The idea that once something is in the public it should remain there firmly embedded, forever, makes sense on the surface, but definitely seems to break down when examined closely, in my opinion.

CPLX · on May 1, 2018

The passage of time is the typical method.

ekianjo · on May 1, 2018

What if someone records their memories on paper?

djsumdog · on May 1, 2018

I really like this post:

http://archive.li/fi5Xn

It's been deleted by its author and archive sites are the only places where I can find copies. I've saved a copy for myself just in case. If you use an article like this as a source, it'd be nice if there were a copy somewhere.

This is one of my favourite independent movies:

https://www.imdb.com/title/tt1527628

I bought a DRM-free copy from the writer/director back when they offered it on their website. The main website is still there, but the whole purchase/download system is broken and those domains seem to have expired/been purchased by someone else. I almost lost my copy of this movie, but luckily I found it on one of my off-site backups.

People talk about how much content there is being created, but there's an incredible amount of content that's being lost forever. Even if it's still out there, search monoculture (today we have Google/Bing/DDG where once we had Lycos, Hotbot, AltaVista, etc. etc.) can effectively keep content from being accessible. There might be something nice about the ephemeral nature of that content, but there's also something sad there as well.

To go back to your point, if someone publishes an article, it is nice to be able to see it again in the future. If they don't want it backed up, there are procedures like DMCA (if the author didn't publish the content to the public domain and the archiver is based in the US).

As a side note, we've already seen on here that the Right to be Forgotten is more about censorship than anything else.

Eventually we'll all go extinct, our sun will burn out, and everything that ever was and is will be lost. So preservation efforts really only go so far, and this brings up some more deep philosophical ideals about the ephemeral nature of what we produce more than anything else.

Bartweiss · on May 1, 2018

> if someone publishes an article, it is nice to be able to see it again in the future.

I think there's an important point here about the difference between access and attribution. People talking about the right to be forgotten are generally opposed to attribution - someone like the top-level poster wants to be able to un-claim a blog post. But people talking about archiving are split between attribution and access - wanting to simply be able to see content, regardless of where it came from.

Two of my favorite bloggers have deleted large swathes of their work, both for reasons I think are inapplicable to me. In one case, they got a job in medicine and removed lots of content that might be unoffensive generally, but could upset a hospital HR department. In the other case, I believe she was worried about the impact her work might have on suicidal people.

In each case, the author wanted to stop having a comprehensive, owned body of their writing, while I simply wanted access to the text. I could give a damn if they accept ownership of that writing - it had interesting ideas and I simply want to be able to read it again.

This isn't a distinction I see made often; work is either in its source location or archived in an attributed way. But there are some cases where I'd be quite happy to get un-attributed access to the actual content someone created.

nbeleski · on May 1, 2018

This is an interesting article; if you liked it, I would heavily recommend taking a look at Nick Bostrom's Superintelligence. Even though the book has its own share of problems, it is an interesting approach to Artificial General Intelligence.

On the other points, I would believe this is related to the fact I personally (and I take most here) have spent way too much time on the internet over the past decade(s). I find it amusing the number of times I remember something I've seen or read in the past and how hard it can be to find it again. Or the number of broken links among old blog posts and articles.

Lastly, our lives are way too goddamn short. Thinking that far into the future is hardly productive in my opinion, even if you consider it can be very enlightening.

undefined1 · on May 1, 2018

> It's been deleted by its author

That's what AI wants you to think. ;)

tempodox · on May 1, 2018

Thank you for the article link. If anything, it has only become more relevant by now. Do you know why the original has been deleted?

gnode · on May 1, 2018

As I mentioned in a top-level comment, you can use robots.txt to disable Internet Archive's (Wayback Machine) archiving of your site. If you dislike your site being archived, I suggest you do this. Their intention is not to be a monster of the Internet that embarrasses creators against their will.
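
To show how such a rule reads to a crawler, here is a small Python sketch using the standard library's `urllib.robotparser`. The `ia_archiver` user-agent token is my assumption about what the Wayback Machine has historically honored, and as noted earlier in the thread their robots.txt policy was reportedly changing, so treat this as an illustration of the mechanism rather than a guarantee. `example.com` and `SomeOtherBot` are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks one crawler from the whole site.
robots_txt = """\
User-agent: ia_archiver
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

url = "https://example.com/some-post"
archiver_allowed = parser.can_fetch("ia_archiver", url)
other_allowed = parser.can_fetch("SomeOtherBot", url)
print("ia_archiver allowed:", archiver_allowed)
print("SomeOtherBot allowed:", other_allowed)
```

A rule like this only expresses a request; whether a given archive honors it, and whether it applies to pages captured before the rule existed, is up to that archive's policy.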

fiddlerwoaroof · on May 1, 2018

This is a major issue with the internet archive because it allows subsequent owners of a domain to blackhole the previous content.

gnode · on May 1, 2018

One solution to this might be to maintain separate archives when a domain changes ownership, based on WHOIS information. Previous owners could still request deletion via email as is currently possible.

swebs · on May 1, 2018

Yes, I think that's pretty obvious. This is an issue of the public's interest vs the author's interest. Of course anyone would like to rewrite history to suit themselves. It does a disservice to everyone else however.

"We were always at war with East Asia" and all that.

bufferoverflow · on May 1, 2018

When you published your article though, you wanted people to read it. You can't blame the people around you for changing your mind. And "nothing gets deleted on the internet".

baby · on May 1, 2018

I won't blame people for storing a private copy, but I will blame people for sharing such copy if it is against my will, especially if they're claiming it is their right and thus my integrity has to suffer.

monoch · on May 1, 2018

You are breaking the social contract of copyright when trying to use it as a tool of censorship. The whole underlying rationale for modern copyright was spelled out in the first country to adopt it:

>To promote the progress of science and useful arts, by securing for limited times to authors and inventors the exclusive right to their respective writings and discoveries.

We, the people, trade our natural right of copying and selling anything we see for the benefit of increased dissemination of art and science.

Your integrity is kept perfectly pristine if your work only stays in your drawer. The second you publish your work it is no longer yours.

sgwilliams · on May 1, 2018

What social contract? Technically, archive.org is in violation of copyright. It does not matter if robots.txt allows it or not. Pointing people to "content" like a search engine is fine. Displaying "content" on another website is not.

Even worse is nyud.net, which essentially steals credit.

pavel_lishin · on May 1, 2018

> and thus my integrity has to suffer.

What sort of integrity is it, that can be adjusted by erasing your past actions, to hide the mistakes you've made and only present the successes?

OrganicMSG · on May 2, 2018

That was put well enough to make me wince on reading. Ever thought of going into politics?

pavel_lishin · on May 2, 2018

Nah, I think Google has documented my integrity well enough to make that a non-starter :)

OrganicMSG · on May 6, 2018

I just looked on the news. Apparently none of that matters.

OrganicMSG · on May 1, 2018

It is the right of all librarians.

loceng · on May 1, 2018

What about a situation where say someone who has a perfect memory, reads it, and regurgitates it word for word or even close enough to get the same meaning across?

The problem I see once you've published something publicly - whether you've deleted it or not - is what happens if and when someone references it later (especially if it's older), in part or in whole: do they reach out to ask whether you have any comments or updates since writing it, so they can include those as well?

Perhaps your knowledge wasn't as evolved, therefore your understanding wasn't to conclusion - which is what the process of learning is all about. Perhaps you were going through a difficult time and you revealed things you're embarrassed about, and you have a fear of ridicule or other. That's where understanding and compassion then hopefully kicks in with the reader. If you can understand these processes yourself, and be able to forgive yourself and therefore others - or forgive others and therefore yourself - then great. If worry is strong and remains strong, then perhaps the solution is developing self-awareness to understand the nitty-gritty and nuances of emotion, to help understand the worry, where it comes from - and developing the tools and skills to help them settle.

Edit to add: You'd be surprised at what people deal with on a day-to-day basis or what they had to deal with in the past. Suffering, especially emotional suffering including discomfort and fear, is a strong teacher - it's something you have to develop to be open to - otherwise the ego mind will learn and want to logically control a situation, vs. managing it - which comes with developing self-awareness, therefore better self-control, and better self-regulation.

Where I'm at in my life, on my path, I currently fluctuate between having difficulty coping with chronic pain, which I have to manage and which affects my executive function and decision making, and being very suicidal. Most infuriating is that I have had success with healing some of the pain with stem cell injections; however, because of an unreasonable/irrational decision that was made, I've been blocked from getting more from the doctor who first did them, and that has cascaded into making it challenging to find another doctor to continue them. I've come to the conclusion that our health-"care" system is very broken, which I could go into the nuances of from my experiences; however, I won't in this comment.

Yes, there will be assholes who will judge me for sharing that and think negatively for that, and they are people who haven't developed those skills and understanding or compassion - at least not yet, or perhaps never, however we can forgive them for that - their genetics and life path, the environment they were born into wasn't up to them.

logfromblammo · on May 1, 2018

I'm not sure you understand that putting something on an http server is publication. It is legally no different than standing on a street corner handing out flyers.

You are suggesting that the author of a flyer should be able to demand that everyone who took one burn it immediately, and have the force of law to ensure it happens.

That is not how copyright works. Copyright is intended to encourage the creation of new works, by granting for a limited time the exclusive right for an author to reproduce and copy their own works. Once those copies exist, and pass out of the author's possession, copyright does not grant any further control of them, other than to forbid those copies to be used to produce additional copies.

This, of course, raises the question of whether serving a digital document on an http archive server is violating the author's copyrights. A library may keep a copy of a pamphlet and allow patrons to view it without violating copyright. But web servers work by stamping out a perfect copy of the document and sending it out over the network. There is no physical embodiment of the document. If the archive were to display the document on a monitor, and then serve a video from a camera, pointed at that monitor, that would be analogous to viewing the physical copy of the pamphlet, but then that pointless fiction could be dispensed with by removing the monitor and camera and transmitting the digital rendering. So we have to fall back on the intent of the copyright act.

Clearly, the copyright act is intended to expand the amount of available works, by granting a temporarily profitable monopoly. Does the prevention of archiving further this purpose? Hell no. Archiving is essential for those works to eventually enter the public domain. When the original creator of a work has abandoned their attempt to monetize their efforts, to the point where they are now trying to destroy their work, it should escheat to the public domain immediately. If you didn't want it out there, the only remedy would have been to never publish it. You cannot erase prior publications by abusing copyrights. The law should not protect book-burners.

chipotle_coyote · on May 1, 2018

> It is legally no different than standing on a street corner handing out flyers.

It's not legally different, but it's still different, which is something that I think people on both sides of this debate sometimes selectively forget. Before the web, those "street flyers" were pretty unlikely to go viral and be seen by millions of people. They were pretty unlikely to get, well, much farther than that street corner. And there certainly wasn't a widely-known and shared infrastructure dedicated to capturing copies of the flyer and preserving them indefinitely.

I don't know that there should be a "right to forget," but in the pre-digital era, things had limited circulation. They went out of print. You couldn't control what happened to copies after they were printed, no, but nobody could say, "You know what, I don't want this thing to go out of print, so I'm going to put it back into print whether the author likes it or not." To use your flyer example: I can't legally demand people who have my flyer burn it, but I can legally demand that people don't make copies of my flyer and hand it out on street corners of their choice for the next thirty years.

The closest thing the web has to the concept of "out of print" is, well, taking things offline. And a lot of things that go offline undoubtedly should be preserved. I use the Internet Archive all the time. But at the same time, I'm not convinced that the answer to someone saying, "Hey, this thing I put online 20 years ago and took down 10 years ago is something I'd really like to keep out of print" must always and forever be, "well, you should never put anything online that you'll ever reconsider at any point in your entire life, you fool."

logfromblammo · on May 1, 2018

I think the "out of print" argument is a cop-out. That's allowing the economic constraints of physical copy production to override the ideals behind the law.

Things went out of print because the unit cost of producing one extra copy was much higher than producing 10000 extra copies. As demand for copies tends to taper off over time, you eventually reach a point where you simply cannot produce just one extra copy at a cost lower than the price the next customer would be willing to pay for it.

No such pressure exists for digital reproduction. Every additional copy costs the same low, low amount. The author then has no reasonable argument for refusing to make an additional copy.
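The cost argument above can be put into numbers. This is only a rough sketch: the setup and per-copy figures are made-up assumptions for illustration, not real printing or hosting prices, and `unit_cost` is a hypothetical helper.

```python
def unit_cost(setup_cost: float, marginal_cost: float, copies: int) -> float:
    """Average cost per copy when a fixed setup cost is spread over a run."""
    if copies <= 0:
        raise ValueError("copies must be positive")
    return setup_cost / copies + marginal_cost


# Hypothetical figures, chosen only to illustrate the shape of the argument.
PRINT_SETUP = 5000.0     # assumed one-off cost of setting up a print run
PRINT_MARGINAL = 2.0     # assumed paper and ink cost per printed copy
DIGITAL_SETUP = 0.0      # assumed: the file already exists on a server
DIGITAL_MARGINAL = 0.01  # assumed bandwidth cost per served copy

if __name__ == "__main__":
    for copies in (1, 10_000):
        print(
            f"{copies:>6} copies: "
            f"print {unit_cost(PRINT_SETUP, PRINT_MARGINAL, copies):.2f} each, "
            f"digital {unit_cost(DIGITAL_SETUP, DIGITAL_MARGINAL, copies):.2f} each"
        )
```

Under these assumptions a single extra printed copy carries the whole setup cost, while a digital copy costs the same whether it is the first or the ten-thousandth.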

And yes, there was infrastructure for capturing and preserving copies of print flyers. It wasn't all-encompassing, and didn't catch everything, but there are many museums of ephemera now that have extensive collections of published material that was of limited circulation (and limited literary value). For those items that were expected to get thrown away or used as toilet paper, there was always the possibility that someone might have saved it, and it could still be around in some form 200 years later.

It is entirely reasonable for an author to demand that no one else make and distribute copies of their work. But in my opinion, if you can find the author, and make them a reasonable offer for a new copy of their copyrighted work, and they refuse to make one and sell it to you (or to license the right to make your own) then they have essentially abrogated their copyright. You would then be morally (but not legally) justified in copying that work from another source.

When something is published, the genie is out of the bottle. No law can stuff it back in. And copyright was intended to protect the livelihoods of creators, not to give them the ability to more easily destroy what they have wrought. Thus, whenever there is any confusion or ambiguity, I always personally interpret a copyright situation with the test "is there any way this might lessen the creator's ability to sell (or otherwise monetize) one more copy of this work?"

If the creator is no longer attempting to make money from a work, screw their copyrights. We granted them that limited monopoly to make enough money so that the effort of creation would be worthwhile to them. If they don't care to sell, I don't care to protect their ability to sell exclusively.

seiferteric · on May 1, 2018

Why not publish a retraction and list the reasons why you no longer think the post is correct instead of trying to hide from it? Maybe someone will learn something that way...

baby · on May 1, 2018

Why not just erase it.

OrganicMSG · on May 1, 2018

Anything you have ever published is no longer entirely yours and the act of doing so is also a relinquishment of control. Only secrets can ever really be regarded as possessions. Archivists should not require permission to store that which has been put out into the world.

_ugfj · on May 1, 2018

> Anything you have ever published is no longer entirely yours and the act of doing so is also a relinquishment of control.

This is an incredibly bold and huge presumption which absolutely does not mesh with the Berne Convention to begin with.

emddudley · on May 1, 2018

It meshes perfectly fine. You don't give up all control but you do give up a lot of control.

When you publish and sell a book, you no longer control who that book is later given or sold to. You retain only rights over the ability to make copies, and even then, you can't block people from making copies for personal use.

OrganicMSG · on May 1, 2018

It meshes well with reality though.

waydowntogo · on May 1, 2018

The main problem with any service is privacy. There's no privacy on the internet.

If you have a blog, you have to reckon with the fact that an article will be on the internet forever.

My advice is this - write an article today but publish it tomorrow (you'll have time to think about the article).

coldacid · on May 1, 2018

And I'd add to that advice: Never be afraid to add corrections and show them.

paulie_a · on May 1, 2018

Suck it up. Once you go out of your way to make something public...it will remain public

baby · on May 1, 2018

Not if you delete it.

paulie_a · on May 2, 2018

If I downloaded it...it's not getting deleted regardless of the author's wishes. If you put it on the internet it's not going away ever

sarabande · on May 1, 2018

Sounds like you need the GDPR's "Right to be Forgotten" clause to be in effect.

drchaos · on May 1, 2018

The GDPR only cares about personally identifying information. So you might be able to ask me to remove your name from the archived blog post, but not the post itself (as long as it does not contain other PII). This is very important, as otherwise someone could vandalize Wikipedia or Github by requiring them to delete all his edits/commits.

zeth___ · on May 1, 2018

A simple solution: don't publish things you don't want people to read.

This is no different to burning books that the authors no longer support the views in.

bachmeier · on May 1, 2018

That's actually pretty different. If you purchase a book, you pay for a copy with the expectation of both parties that you own that copy.

Putting someone else's work on a different website and serving it to others would be similar to printing copies of someone else's books. If the author changes his views, it wouldn't be acceptable for someone else to print new copies of the book. Making a copy of a website for your own personal use would be much closer. I don't think anyone disagrees that you should be able to do that.

zeth___ · on May 1, 2018

>If the author changes his views, it wouldn't be acceptable for someone else to print new copies of the book.

When the author burns their own book and forbids new ones from being made is the only time that it's acceptable to print new copies.

Need I remind you of the purpose of copyright?

>To promote the progress of science and useful arts, by securing for limited times to authors and inventors the exclusive right to their respective writings and discoveries.

foldr · on May 1, 2018

>When the author burns their own book and forbids new ones from being made is the only time that it's acceptable to print new copies.

Do you mean acceptable from a legal point of view? Because I seriously doubt that it is.

OrganicMSG · on May 1, 2018

Without that view, we wouldn't have most of the works by Kafka.

foldr · on May 1, 2018

No, the only issue raised by Kafka's request to burn his papers is whether or not the copyright passed to someone else on his death. All works eventually become public domain, regardless of the author's wishes, since people die and copyright has a time limit.

bachmeier · on May 1, 2018

I don't understand your point. "securing for limited times to authors and inventors the exclusive right" means others cannot print copies of your books.

aeorgnoieang · on May 1, 2018

> for limited times

The author's exclusive rights are only intended to be temporary. Once they're no longer secured, the work can be used by anyone for any purpose.

OrganicMSG · on May 1, 2018

>If the author changes his views, it wouldn't be acceptable for someone else to print new copies of the book.

That depends entirely on the publishing contract's rights reversion stipulations. An author deciding that they do not want their book to continue being printed, purely because they have changed their views, will often find that they have very little legal right to do so.

pmlnr · on May 1, 2018

I both agree and disagree.

Yes, definitely think before publishing, even if it's meant to be "private", say a self-destructing private message, because anything can be saved somehow.

On the other side, we change. Things that were relevant to me, things I genuinely thought to be true 16-18 years ago (I've had a blog for a while now), I see a bit differently today, and I might like my current, toned-down views to be read, not the ancient ones from a disappointed teenager.

baby · on May 1, 2018

So you're incentivizing people not to write. That's bad, we should encourage people to write. It's OK to make mistakes.

I see this problem as an even bigger problem with kids. Kids put everything they do online, and they will probably be ashamed of a lot of these things later in life.

Comparing that to Facebook: I guess I shouldn't upload pictures on Facebook because it's against my right to want to see one of my old pictures disappear later?

Freak_NL · on May 1, 2018

People should be encouraged to create, but with the knowledge that anything published may be retained by others, and that it can have consequences. No technological measure will prevent people from making personal backups or gaining access to data published under the presumption of secrecy or time-limited availability — even if all the layers of DRM work, the analogue hole will always be there.

Rather, invest in teaching kids how to safely publish under pseudonyms or anonymously if they wish to publish their angst-ridden teenage vampire poetry. You can always abandon your connection to that work that way — even if the work lives on for all the world to see.

> Comparing that to Facebook: […]

You should indeed never upload anything that you might wish to expunge at a later date. You have the right to see an old picture gone from Facebook, but you don't have the means to enforce removing it from your cousin's private backup on their own computer.

parenthephobia · on May 1, 2018

> It's OK to make mistakes.

A sentiment that is harder to sell if people erase all evidence of their mistakes.

_ex6b · on May 1, 2018

More like incentivizing people to accept responsibility for their actions.

khedoros1 · on May 1, 2018

> I guess I shouldn't upload pictures on Facebook because it's against my right to want to see one of my old pictures disappear later?

If you think that you will ever want real control over the pictures, then yes, you should avoid posting them to Facebook. I'd think that that's fairly obvious.

zeth___ · on May 1, 2018

If no one can read a book what good is it?

You are purposefully muddying the issue of access. If you upload your pictures on a password protected ftp server no one will know or care when you delete them. If you upload them on a publicly accessible website, despite what outdated laws on copyright say, you won't be able to withdraw them and that's a feature.

Facebook is a monstrosity that pretends to be a password protected server but leaks your data by design. Do not try to equate someone unpublishing a work after they wrote, uploaded and publicized it with someone removing a drunken status update that should have never been public in the first place.

csydas · on May 1, 2018

> If no one can read a book what good is it?

Plenty of good. As a writer myself, sometimes just having these thoughts put out into form is a very gratifying process. When the inspiration hits you, nothing feels worse than not being able to express these thoughts and feelings in a way that feels appropriate.

Likewise, sometimes people write dumb things in a fit of passion. They write something that is a blemish on their otherwise fine history or that no longer reflects what they believe anymore.

While I'm on the side of the Internet Archive here, I can definitely appreciate that it's not an open and shut case. Sometimes the yearning for information to be free is at odds with our want for privacy. Tools like Medium, Twitter, Facebook, and other social media are like the gun in the house to someone suicidal (to use a very bad analogy) - an easy and convenient tool that allows for a very bad spur of the moment decision to be made.

I know I've done stupid things on games and on message boards in the past, and I'm only so lucky that this data likely isn't available anymore. Some of it was me being a dumb kid. Some of it was me just being an angry kid, but I am 100% grateful that this information is only remembered by myself and a handful of others. A very different past me had a very stringent set of beliefs which I now have come to accept were very bad beliefs, and I did bad things in general as a result of them.

Is it really fair that I have the benefit that time forgot all these dumb things I've done simply because I was born before a time when Twitter/Facebook were commonplace? Before data permanence was really possible on a global scale? I'm open to the idea of some review on archiving data like this; I want the Internet Archive to be able to archive this stuff, but it would be really nice if there was a way to vault it for a reasonable period of time as well by request. Otherwise, you end up in a position where no one wants to write or produce or do anything in a fit of passion as a result of knowing that everything is permanently preserved.

I don't have an answer aside from "vaulting" the data, and I don't think that's a good answer. But I also don't think it's black and white like you're trying to make it.

iamnothere · on May 1, 2018

I agree with the sentiment, but I'm not sure what can be done about it. Anyone can archive things privately. If you take away public archives, this means that only obsessive lunatics and the wealthy will be able to wield archived material as an instrument of power.

Our culture definitely needs to evolve a little and become more forgiving of youthful ignorance. That's really the only long-term solution.

Perhaps this is also a good use case for services like Hermit[1], which allows limited sharing among friends and other writers. The notion of "trying out" an idea on a platform with enormous public reach seems foolish at best. Comics test their new material in hole-in-the-wall dive bars for a good reason!

[1] https://gethermit.com

csydas · on May 1, 2018

I think you're right, that ultimately society needs to mature and realize that everyone did something stupid in their past and at some point has thought/said/done stupid things. We should hold people accountable when they refuse to change, but right now we're missing the rehabilitation aspect of all this and just focusing on the punishment.

Again, I really don't have an answer. I would err on the side of caution and say IA should continue to back stuff up and it's wrong to set up the time-bombs that also affect IA. But I do feel we need a way to accommodate some privacy still without silencing people outright for just plain dumb opinions and ideas.

baby · on May 1, 2018

I wrote a book and never published it, the only person who's read it is me.

baby · on May 1, 2018

of course, but you shouldn't be stopped by people who want to backup your content at all cost against your will.

LiveOverflow · on May 1, 2018

I think a difference can be made about the individual "right to be forgotten" and the transparency/accountability of a public company.

baby · on May 1, 2018

people like you and I can write on Medium.

andai · on May 1, 2018

> deleting something from the internet

azeotropic · on May 1, 2018

Writing is not the same as publishing.

PurpleBoxDragon · on May 1, 2018

>So you're incentivizing people not to write. That's bad, we should encourage people to write.

The publicizing of things said by then Presidential candidate Donald Trump leading up to the 2016 election would incentivize people to not talk in private. Should we have banned any details about the incident from being spread if the one who said it didn't agree with knowledge of what was said spreading?

If being able to undo what you wrote so that it won't be held against you is a good thing, why wouldn't being able to undo what you said so that it won't be held against you also be a good thing?

fasj82 · on May 1, 2018

>It's OK to make mistakes.

We should punish mistakes. Especially stupid mistakes.

thogenhaven · on May 1, 2018

I wonder how wayback machine will work after GDPR? I can't imagine they can just show content that the authors deleted from primary sources?

josteink · on May 1, 2018

GDPR is not relevant to this. It’s not about enforcing copyright on people’s work.

It’s about ensuring that companies only store and process privacy-sensitive information about people which they are given consent to store and only used for the purposes the consent was given.

There is nothing privacy related wrt the author in a public article published worldwide for everyone to read. Clearly outside the domain of GDPR.

It’s not hard people, just common sense. Just treating user-data with respect. Let’s not fool ourselves into thinking it’s harder than it actually is.

walshemj · on May 1, 2018

However, from experience, GDPR will also be abused by jobsworths to avoid doing something, either through laziness or for more suspect reasons.

Just like H&S and the Data Protection act are abused today.

detaro · on May 1, 2018

I assume they'll continue as now: take down copies on request. It's not like they haven't had to deal with people who didn't want their content archived before.

jakeogh · on May 1, 2018

I doubt it. GDPR is the obvious next step in the war on GPC (memory specifically). archive.org exists to fix that. If they fall because some other country has no 1st, they fail. Other people have copies.

Who exits next?

marksomnian · on May 1, 2018

rainbowmverse · on May 1, 2018

I assume they mean General Purpose Computing.

rambojazz · on May 1, 2018

I haven't read GDPR in detail, but isn't GDPR concerned only with personal/private data? The Wayback Machine only archives public pages as far as I can tell...

scandox · on May 1, 2018

Yes, but the article itself is user-provided content to Medium that the author has a right to ask to be deleted (under GDPR), presumably? So perhaps it will simply be a matter of the Wayback Machine having to have a policy to delete things if requested?

lbriner · on May 1, 2018

No! GDPR is about personal data, which is well defined in the regulations and does not include blog posts. The right to delete data (or "be forgotten") is nothing to do with GDPR. If the original post contained personal data, it is a different issue but if that was put out into the public domain, it is a hard problem to solve.

damontal · on May 1, 2018

What if the blog post contains personal data?

avereveard · on May 1, 2018

Most pages have an author section already.

jakeogh · on May 1, 2018

So add some "personal data" to the end of anything you might want to demand someone forget later.

icebraining · on May 1, 2018

Or you could send a good ol' DMCA takedown request.

I'm not sure where this idea that nothing could be forced off the web before the GDPR came from.

LaGrange · on May 1, 2018

No, if you intentionally made that data public then it's done. GDPR doesn't, say, force you to remove political views of Theresa May from newspapers, despite that being covered by personal data, because Theresa May made those views public.

jakeogh · on May 6, 2018

So if I was subject to the GDPR, and published my nginx logs in real time, I could stop worrying about scrubbing "personal" data from them on request?

mjn · on May 1, 2018

The Wayback Machine has always had a policy to delete things if requested, so there's no real change there. The most common way site owners do that is by changing robots.txt. In line with the Oakland Archive Policy [1], the Internet Archive respects robots.txt retroactively, so a site owner can get archived versions deleted just by excluding them in the robots file. Besides that, they respond to DMCA takedowns, one-off removal requests [2], etc.

[1] http://www2.sims.berkeley.edu/research/conferences/aps/remov...

[2] http://archive.org/about/faqs.php#2
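For anyone curious what "excluding them in the robots file" looks like mechanically, here is a minimal sketch using Python's standard `urllib.robotparser`. It only shows how a robots.txt rule is evaluated for a given crawler name; it is not the Internet Archive's actual code. The `ia_archiver` user-agent string and the `may_show` helper are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the archive's crawler identifies itself as "ia_archiver".
ARCHIVER_AGENT = "ia_archiver"

# A robots.txt that excludes only that crawler from the whole site.
BLOCK_ARCHIVER = """\
User-agent: ia_archiver
Disallow: /
"""


def may_show(robots_txt: str, user_agent: str, url: str) -> bool:
    """Hypothetical helper: does this robots.txt allow user_agent to fetch url?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/old-post"
    print("rule present:", may_show(BLOCK_ARCHIVER, ARCHIVER_AGENT, page))
    print("rule removed:", may_show("", ARCHIVER_AGENT, page))
    print("other crawler:", may_show(BLOCK_ARCHIVER, "SomeOtherBot", page))
```

The check is evaluated against whatever robots.txt the site serves at the time, which is why a retroactive policy lets the current file govern how old snapshots are treated.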

adventured · on May 1, 2018

Changing robots.txt does not delete content from their archives. If you remove the robots.txt file, the content becomes viewable again.

There's no scenario where they can respond to the vast scale of GDPR violations that their archive likely represents, when it comes to manually removing content. There are only three possibilities: avoid the EU as much as possible, dump the archives and start over with an entirely different approach, or shut down. Besides that, these laws are going to get a lot more strict and difficult to comply with, not less strict, over time. This is merely the beginning of aggressive regulation of the Internet. Regulation of the Internet will only move one direction from here, in the direction of increasing burden and ever greater regulation. It's hard to imagine Archive.org's archives surviving what's coming.

icebraining · on May 1, 2018

> There's no scenario where they can respond to the vast scale of GDPR violations that their archive likely represents, when it comes to manually removing content.

"GDPR violations". What's that, exactly? As far as I know, you only have to remove personal data upon request, not preemptively. So I don't see how they are "violations".

Will a lot of people make these requests? Possibly, but where's the evidence of that? People have been able to use copyright takedown requests (e.g. under the DMCA) forever, yet the Archive is still around.

tankenmate · on May 1, 2018

Actually the recommended data handling says you should specifically state the purpose for needing the data, and that it should be reasonably limited to that need; i.e. if you don't need it any more you should pro-actively delete it.[0]

[0]https://ico.org.uk/media/for-organisations/documents/1475/de... Pages 4-6

drchaos · on May 1, 2018

They do have a legitimate interest (in the sense of article 6(1) of the GDPR), namely providing an internet archive.

tankenmate · on May 2, 2018

I would agree in the case of the wayback machine they have a very strong case under article 6(1).

rusk · on May 1, 2018

Could conflict with “right to be forgotten” however

merinowool · on May 1, 2018

If a public page can be linked back to you, then it can be considered personal information and therefore be subject to GDPR.

josteink · on May 1, 2018

No. That’s not how it works.

Read the law before posting wildly misleading comments like this.

If you explicitly make something public, you can’t later come and claim that this information is actually crucial to your privacy. If so, you yourself were the one who violated that privacy, not the company later archiving/caching/processing your public article.

GDPR is all about decency and common sense wrt. user data and privacy.

No need to spread FUD about something that simple. SV proved tech companies can’t be trusted to act ethically, so here comes the regulation. Deal.

adventured · on May 1, 2018

Given the immense scale of Archive.org, there must be a truly incredible number of sites & pages with personal data & content in the pages. Millions upon millions of pages, due to the repeat archiving.

Comments with usernames. Comments with IP addresses (sometimes old comment systems would allow you to comment without registering but they'd show all or part of your IP address). Comments with personal information in the messages. Comments with email addresses. Blog posts with all sorts of personal details from the author. Personal user account pages, such as the kind you see on sites like Ask.fm or similar, with vast amounts of user information and personal details that can't be deleted. And on it goes. Archive.org is storing all of that and does not allow it to be deleted. Further, it would be nearly impossible to figure out what content is compliant and what is not within the archives. It's a giant GDPR violation system. Their only sane bet is to stay away from the EU, jurisdiction-wise, as much as possible, or shut down.

teamhappy · on May 1, 2018

Why would GDPR apply to the internet archive? It's a US based nonprofit. As far as I can tell they don't do anything that even remotely hints at them providing services to EU residents (like offering their site in European languages, having the €-symbol somewhere on their donations page or any of the other more subtle things mentioned in GDPR).

SmellyGeekBoy · on May 1, 2018

Is English not a European language?

teamhappy · on May 1, 2018

Doesn't matter. Using the English language doesn't imply that you're offering goods or services to individuals in the EU.

H4CK3RM4N · on May 1, 2018

No more than Spanish, and you wouldn't be making this argument wrt a Mexican company.

zeth___ · on May 1, 2018

Not since England left the EU.

robin_reala · on May 1, 2018

Britain is still in the EU and GDPR will apply there.

_nalply · on May 1, 2018

And Ireland?

zeth___ · on May 1, 2018

Ireland's language is actually Gaelic.

It's the only country in the EU that does not require its language in translations because even in Ireland nearly no one speaks it.

JimDabell · on May 1, 2018

Ireland has two official languages defined in the constitution – Irish and English.

kgwgk · on May 1, 2018

I think he refers to the fact that each EU country notifies one official language and only the UK picked English. But I guess they have found, or will find, a way to maintain things as they are.

http://eltelawjournal.hu/what-language-for-europe/

merinowool · on May 1, 2018

Because the data they have might have been produced by EU citizens.

teamhappy · on May 1, 2018

Who cares? Me posting my (very German) name and address on somebody else's website (blog comment, forum or whatever) doesn't magically make that person have to comply with GDPR.

According to [1] the law applies to:

1.) a company or entity which processes personal data as part of the activities of one of its branches established in the EU, regardless of where the data is processed; or

2.) a company established outside the EU offering goods/services (paid or for free) or monitoring the behaviour of individuals in the EU.

The internet archive doesn't offer goods or services in the EU (if you want to know how that's defined you have to read the actual law I'm afraid) and they're certainly not "monitoring the behaviour of individuals in the EU".

[1]: https://ec.europa.eu/info/law/law-topic/data-protection/refo...

jakeogh · on May 1, 2018

So what? The EU does not get to make laws for other people. That's why we have countries.

fredoliveira · on May 1, 2018

You are missing the point. The EU makes laws that govern, and at least try to protect, its citizens. If a document on the archive is created by a European citizen, then it is under EU law. That's why every company in the world right now that deals with European citizens is working on supporting GDPR. That also applies here.

donohoe · on May 1, 2018

Not quite. The EU might want that but it gets into jurisdiction.

The EU cannot enforce its law on entities that are entirely US based. It can only enforce it on non-EU sites if that site has some sort of business that’s within the EU (like offices or employees).

DanBC · on May 1, 2018

The EU can say that some businesses are so non-compliant with GDPR that they're not able to be used by EU companies. It seems weird to choose to limit your market just because you don't want to protect user data.