The Internet Isn’t Forever

John_KZ · on Feb 23, 2018

Also a lot of surviving websites are no longer so accessible, e.g. YouTube and Reddit comments, Panoramio pictures (which Google kept but you can't see, even if you took them). Even capturing a large number of tweets is pretty hard right now.

Companies are closing down all data they can grab. Youtube used to have a list of the top 1000 videos ordered by view count and other parameters. It allowed you to explore and filter their data. To choose what to see. To index their content. This is no longer possible. They want to keep everything they know about you private, and they want to control what you see and "discover".

This is pretty far away from the open web we had 10 years ago. Part of the blame is Javascript, but most of the blame is on users who don't care. Smartphones ruined the internet in my opinion. Sure, there's more content than ever, but it's not freely available. In fact you're not even allowed to know what's available.

baud147258 · on Feb 23, 2018

This week I tried to read an (unfinished) fantasy novel on a forum, where it had been posted for a few years. I had already read a few chapters and it was very well written. Except the author decided last month to remove all his content from the place (10000+ forum posts), which included this unfinished novel and two others. I tried to find backups (on the Internet Archive), but only the first page of each thread was saved. I only found a few chapters using Google cache. And I don't think the mods will agree to restore the content. So yeah, I can relate to the idea that the internet isn't forever.
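
For anyone wanting to check which pages of a thread were captured, a rough sketch using the Wayback Machine availability endpoint (`https://archive.org/wayback/available`) might look like the following. Every page of a thread has its own URL, so each one has to be looked up separately. The function names are made up for illustration, and the response shape is assumed to be the documented `archived_snapshots` / `closest` layout.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Wayback Machine availability endpoint.
API = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the query URL for one page; timestamp is an optional YYYYMMDD hint."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return API + "?" + urlencode(params)


def closest_snapshot(response):
    """Return the snapshot URL from a decoded JSON response, or None if nothing was saved."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def lookup(page_url, timestamp=None):
    """Ask the archive about a single page (performs a network request)."""
    with urlopen(availability_url(page_url, timestamp)) as resp:
        return closest_snapshot(json.load(resp))
```

Calling `lookup()` once per thread page (with whatever URL pattern the forum uses) would show which pages have a capture at all; a `None` result means that page was never saved.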

baud147258 · on Feb 26, 2018

And in addition to that, many image-based Let's Plays on the same site disappeared with the changes at ImageShack and Photobucket.

StellarTabi · on Feb 23, 2018

I got burned by this yesterday. I was looking for an old photo that was only on myspace, but apparently they had a huge content purge and it was long past the migration phase :(

agumonkey · on Feb 23, 2018

This era is so weird.

On one hand internet promised information highway, pure digital storage.

Instead it's social network noise highway and same bitrot.

gumby · on Feb 23, 2018

The tech may have changed but the people have not.

Thus they put the same amount of effort into the same things (ephemera vs permanence).

agumonkey · on Feb 23, 2018

True, it's just amplified to a rare scale.

It just exemplifies how marketing is pervasive no matter the field.

WheelsAtLarge · on Feb 24, 2018

Very true, the internet has a memory problem. If you try to research something that happened, let's say 10 years ago, by using just the Internet you'll have lots of trouble. If it wasn't printed in one of the major newspapers you're out of luck. Not only that but you'll need to go through a truckload of data if you want to expand on what the newspapers have to say. I suspect that eventually, no one will be able to look back at the net's past since cryptography is becoming a must for everything net. If you're reading this and you encrypt your data, think about the data 10 years ago that you encrypted. Can you get to it? Maybe. Think about someone else's. I bet you can't. What happened 10 years ago is blurry now. If that's the case now, then what will happen 100 years from now?

ggm · on Feb 23, 2018

SRI-NIC wiped all the WHOIS NIC handles when they handed the job on to somebody else. Network Solutions I think. So, my 1985 NIC handle got wiped by the dotcom boom sometime between 1992 and 1998.

Stuff been dyin' on the internet since forever.

awgneo · on Feb 23, 2018

I am so glad the article mentions IPFS and the great potential it has to solve this very issue.

titzer · on Feb 23, 2018

Diffusion of responsibility is no responsibility at all.

We need to have a well-funded, well-run, serious effort at preserving digital history.

majewsky · on Feb 23, 2018

https://archive.org/donate/

vog · on Feb 23, 2018

Archive.org is a great project that one can't recommend highly enough.

It's not "just" collecting huge amounts of data, it is a living archive. For example, when they publish old Amiga software, they publish it ready to use. And I don't just mean polished disk image files. I mean ready to use right in the browser. Click on an ancient game, and play it immediately in your browser!

That's exactly the kind of attitude towards archiving that we need. Like one of the more modern museums where you are allowed to touch (and interact with!) things.

TazeTSchnitzel · on Feb 23, 2018

I don't think IPFS and the like can solve the archival problem. They rely on the willingness of lots of people to donate lots of storage space indefinitely.

alwillis · on Feb 23, 2018

I don’t get why people think IPFS is only a volunteer thing. The same way bitcoin quickly evolved from people mining on laptops to professionally run datacenters, so too will providing resources for IPFS.

Filecoin will allow people to monetize making content on IPFS available, with the ability to prove the content meets certain availability guarantees via smart contracts.

Making content available via IPFS will become the new way to mine cryptocurrency.

It’s all in the white paper: https://filecoin.io/filecoin.pdf

zaarn · on Feb 23, 2018

IPFS will still be a volunteer thing. Filecoin merely allows you to pay volunteers for specific pieces of data.

But there are other protocols incoming too, like Swarm for Ethereum, Storj, etc. Some of them already available just like IPFS.

I doubt IPFS will automatically win this because it was the first to make a glorified BitTorrent client available via HTTP.

alwillis · on Feb 23, 2018

> But there are other protocols incoming too, like Swarm for Ethereum, Storj, etc. Some of them already available just like IPFS.

Not an apples-to-apples comparison—IPFS is the only one that’s an offline first, peer-to-peer, distributed versioned file system. Swarm is interesting but it’s for small storage for smart contracts; it’s not a general purpose, low-cost storage option for storing terabytes of data. You’re not going to take a snapshot of Wikipedia on it, for example: https://ipfs.io/blog/24-uncensorable-wikipedia/

> I doubt IPFS will automatically win this because it was the first to make a glorified BitTorrent client available via HTTP.

This statement doesn’t make sense—IPFS is designed to replace HTTP, not run on top of it. While it shares some similarities to BitTorrent like using a DHT for content addressing, it’s really a different thing.
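
To make "content addressing" concrete, here is a toy sketch in which the address of a blob is simply the SHA-256 hash of its bytes. This is a deliberate simplification and not how IPFS is actually implemented (real IPFS uses multihash-based CIDs, chunking, and a distributed hash table across peers); the `ContentStore` class and its in-memory dict are invented purely for illustration.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Derive the address from the content itself (here: a plain SHA-256 hex digest)."""
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    """Toy single-node store; a real network would spread blocks across many peers."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # Identical bytes always map to the same address, so duplicates collapse.
        addr = content_address(data)
        self._blocks[addr] = data
        return addr

    def get(self, addr: str) -> bytes:
        data = self._blocks[addr]
        # Anyone fetching a block can verify it against the address they asked for.
        if content_address(data) != addr:
            raise ValueError("block does not match its address")
        return data


store = ContentStore()
addr = store.put(b"hello, archive")
assert store.get(addr) == b"hello, archive"
assert store.put(b"hello, archive") == addr       # same content, same address
assert store.put(b"hello, archive!") != addr      # any change yields a new address
```

The point of the sketch is only that the address names the content rather than a server, which is the difference from a location-based URL.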

They didn’t raise $257 million from their ICO and VCs to pay volunteers using consumer-grade equipment; they’re clearly looking to disrupt the cloud storage market: https://www.coindesk.com/257-million-filecoin-breaks-time-re...

zaarn · on Feb 26, 2018

>Swarm is interesting but it’s for small storage for smart contracts; it’s not a general purpose, low-cost storage option for storing terabytes of data.

Swarm is intended to do exactly that.

IPFS, atm, does not store terabytes of data. If I were to just dump in my data, it would be unusable within the hour as there is no incentive for nodes to keep those terabytes active and around.

> You’re not going to take a snapshot of Wikipedia on it, for example: https://ipfs.io/blog/24-uncensorable-wikipedia/

Swarm already hosts static websites, snapshots of wikipedia are feasible.

>IPFS is designed to replace HTTP, not run on top of it. While it shares some similarities to BitTorrent like using a DHT for content addressing, it’s really a different thing.

IPFS does not address dynamic content properly; IPNS is way too slow to allow websites on the scale of Google to operate sensibly. I doubt IPNS could handle a decently sized subreddit in terms of activity.

IPFS is unlikely to replace HTTP since both protocols address different problems. However, as it is usable today, IPFS is little more than a cache that can store some data for a bit until nobody is interested in it.

>They didn’t raise $257 million from their ICO and VCs to pay volunteers using consumer-grade equipment; they’re clearly looking to disrupt the cloud storage market: https://www.coindesk.com/257-million-filecoin-breaks-time-re....

Last I recall, Filecoin is not IPFS, it merely works on top of IPFS. You may revise your argument and replace every occurrence of "IPFS" with "Filecoin", in which case it would still compete with Swarm, Storj, etc.

vog · on Feb 23, 2018

> I doubt IPFS will automatically win this because it was the first to make a glorified BitTorrent client available via HTTP.

This may be true.

On the other hand, I wouldn't be surprised if IPFS would win for exactly that reason.

amelius · on Feb 23, 2018

Computing and storage capabilities can grow (practically) without limits.

Human information consumption, however, has limits.

Therefore, I don't see the problem.

agumonkey · on Feb 23, 2018

I wonder if archive.org analyses the data to see what's "useless" or not

_lhlo · on Feb 23, 2018

Yeah, duh.

I don't think the internet was ever supposed to be forever. Sometimes it looks like it's forever because people copy content from one public place to the other but this behavior is not inevitable.