It also sounds like a cool caching technique: since people already have the resource cached on their local system, why not allow them to distribute it?
HTTP headers could tell clients when to flush the cache (just like they do now) and provide the last known md5/sha1/whatever digest to make sure the page isn't tampered with (say the digest is checked when the download completes, and the download is retried if it doesn't match: that shouldn't happen often anyway). It obviously won't work for pages that serve auth-related content, but it would be great for assets.
I guess a problem could be that page loads would be slower (depending on the ability to parallelize and to contact geographically close peers, I suppose), but it would mean a much lighter load on servers.
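As a rough sketch of that verification step (the header name and the fetch functions here are made up for illustration, not an existing spec):

```python
import hashlib

# Hypothetical flow: the origin serves only headers (including a digest of
# the asset); the body comes from peers, with a fallback to the origin if
# peers keep returning data that doesn't match the advertised digest.
def fetch_asset(url, origin_headers, fetch_from_peer, fetch_from_origin, max_retries=2):
    expected = origin_headers["X-Content-Digest"]    # made-up header name
    for _ in range(max_retries):
        data = fetch_from_peer(url)                  # p2p transfer, however it works
        if hashlib.sha256(data).hexdigest() == expected:
            return data                              # peer copy matches the origin's digest
    return fetch_from_origin(url)                    # last resort: fetch from the origin itself
```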
It'd be a huge downgrade in privacy. As much as we don't like it, when I'm watching cat videos only Google knows; with bittorrent, everybody knows what you're watching. As much as I love bittorrent (I really do), this is an aspect for which I don't see easy solutions.
I realize this wouldn't work either: tor routes traffic through exit nodes, so it would basically cancel out the advantages of p2p distribution.
The privacy concern makes me realize something else: there's no incentive for users here. They lose privacy; what do they gain?
It doesn't completely rule out the idea of using p2p to distribute content as a low-level implementation, but since we (the ones who manage servers) are the only ones who benefit from it, we first have to find a way for it to have no impact on users.
The biggest privacy problem comes from how p2p currently works: we have a list of IPs associated with each resource. How can we obfuscate this without going through proxies?
Not "tor". "tor-like". Onion routing. Something that would make sure only your closest neighbor knows what you request, and won't know if it's for you or your closest neighbor.
I don't know. It's already hard enough to incentivize people to share content they already have (in current bittorrent); I believe it would be even harder to incentivize them to download content they're not primarily interested in just for a neighbor's sake.
It works if we don't do naive p2p but rather friend-to-friend; this is what Retroshare does: your downloads can go through your friends, so that only they know what you download. But that requires a lot more steps than traditional bittorrent, so I'm not sure it could work in general.
Such networks where you "download a bunch of stuff you're not interested in for the sake of" the network already exist.
Perfect Dark (a Japanese P2P system) is a direct implementation of that concept, where you automatically "maintain ratio" as with a private torrent tracker by your client just grabbing a bunch of (opaque, encrypted) stuff from the network and then serving it.
A more friendly example, I think—and probably closer to what the parent poster is picturing—is Freenet, which is literally an onion-routed P2P caching CDN/DHT. Peers select you as someone to query for the existence of a chunk; and rather than just referring them to where you think it is (as in Kademlia DHTs), you go get it yourself (by re-querying the DHT in Kademlia's "one step closer" fashion), cache it, and serve it. So a query for a chunk results in each onion-routing step—log2(network size) people—caching the chunk, and then taking responsibility for that chunk themselves, as if they had it all along.
For traffic that consists of many small, relatively rare files (which is most HTTP traffic), you would have to do some proactive caching anyway. I want JQuery 1.2.3. I ask your computer. Your computer doesn't have it, either because it's a rarely used version or because you cleared your cache. Instead of returning some error code, your computer asks another node for it, then caches the file, then returns it to my computer. This kind of thing will be necessary to ensure high availability, and incidentally it would make it hard to see who the original requestor is. It should actually improve privacy. This scheme could also be used to easily detect cheating nodes that don't want to return any content. (They usually wouldn't have the excuse of not having the content you requested, so if a node consistently refuses to share, it can eventually be blacklisted.)
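A minimal sketch of that fetch-or-relay behaviour (which is also what the Freenet comment above describes), with `ask_another_node` standing in for however the node locates the next peer; everything here is hypothetical:

```python
# Hypothetical caching node: serve from the local cache if possible, otherwise
# fetch from another node, cache the result, and return it. The requester
# can't tell whether we already had the file or fetched it on their behalf.
cache = {}  # file id (e.g. "jquery-1.2.3") -> bytes

def handle_request(file_id, ask_another_node):
    if file_id not in cache:
        cache[file_id] = ask_another_node(file_id)  # recurse into the network instead of erroring out
    return cache[file_id]
```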
> I want JQuery 1.2.3. I ask your computer [...] returns it to my computer
Good! Now I know you have JQuery 1.2.3, which is vulnerable to exploit XYZ, which I can now use to target you. This is one reason why apt-p2p and things like that can't be deployed at scale: it's way too easy to find out which versions of which packages are installed on your machine.
Indeed, disclosing what you have only to local peers is a cool idea to mitigate the history leak.
This makes me think that another feature could be to not disclose all available peers, but to randomly select only some of them. Someone wanting to look a user up would then have to download the same list a potentially large number of times to check for an IP, instead of pulling it once per resource and knowing for a fact.
Neither idea is exactly a privacy shield, but both are steps toward mitigating the problem.
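A toy sketch of that partial-disclosure idea, assuming some tracker-like component that knows the full swarm for each resource (all names here are hypothetical):

```python
import random

# Hypothetical tracker behaviour: never hand out the full peer list for a
# resource, only a random sample, so an observer has to query many times and
# still only ever sees samples when checking whether a given IP is in the swarm.
def peers_for(resource_id, swarm_by_resource, sample_size=20):
    swarm = swarm_by_resource.get(resource_id, [])
    if len(swarm) <= sample_size:
        return list(swarm)
    return random.sample(swarm, sample_size)
```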
>It also sounds like a cool caching technique: since people already have the resource cached on their local system, why not allow them to distribute it?
Sounds like a cool way to deter people from using it if you ask me.
I'm not from the USA; am I correct in thinking Comcast is a mobile ISP? If so, this is a non-issue: mobile apps can already decide not to do big updates when not on wifi, and browsers can do the same.
That is incorrect. Comcast is the (second?[0]) largest cable ISP in the USA. About 10 years ago, they were calling people saying that they were using too much data, but that was always nebulous and varied greatly. Then they defined a 250 GB cap in 2008, and later, 300 GB. They only recently expanded it to 1TB. Go over it for three months, and Comcast will get very angry.
I'm not sure if the cap is in effect for places with fiber internet (like Verizon FIOS). It wouldn't surprise me, since they've admitted caps are unnecessary.[1]
[0] Charter + Time Warner Cable (after merge) is larger, I think.
I see, thanks. This sounds quite terrible, actually: if the internet is to have any historical impact, the last thing you want is for people to worry about "not using too much data". I hope this won't last.
Many big American ISPs are too greedy. Even worse is that most people have no clue how much a gigabyte of data is.
About the time Chairman Wheeler was aiming for net neutrality, he was also looking at data caps and if they were bad. Within a month, Comcast increased theirs to 1 TB. Coincidence?
There's going to be a lot of talk about how 'competitive' ISPs are in America over the next few years now that everything's Republican. But most people have two options: an expensive cable ISP that's fast when it wants to be, and slow DSL that's also expensive and not getting better.
Unfortunately Comcast is one of the two companies (along with Verizon) that have monopolized the ISP market in the US. Together they own most of the market, and have agreements to stay out of each others' territories. This affects something like 40% of home Internet customers in the US :(.
Even when I download torrented movies non-stop, I don't reach this in one month. I doubt it would be a problem, especially since the load would be balanced across all users.
Try a different quality? The best quality on Amazon Video is 7 GB per hour, which works out to about 150 hours a month. Sure, that's a lot of video, but spread across a household of, say, 4 people with their own screens and viewing habits, it's under 1.5 hours per day per person.
Streaming Netflix alone can eat your cap quickly, given the statistics from the Netflix page [1]. I actually experienced this recently (I don't run torrents from home, and during the month of December I didn't transfer anything from the remote box). In December I used 1022 GB out of 1024 GB, and over 900 GB of that was Netflix.
Using: 1 terabyte = 8,388,608 megabits
Ultra HD: 25 Mbps ~= 93.2 hours
HD: 5 Mbps ~= 466 hours (about 19.4 days)
SD: would go much further, but hey, I bought an HD TV for a reason; it wouldn't be fair to skimp on quality just because they don't want to offer more generous caps.
Side note: it's $10 per 50 GB once you go over the cap with Comcast, which is a pretty huge overage charge if a user wasn't even aware of the cap. Thankfully they cap it at $200 over your bill.
Side note 2: this number will only increase, so really that 1 TB cap already needs a lift to 2 TB just to start meeting the requirements of the modern web.
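For reference, a quick check of the figures above (this treats a megabit as 2^20 bits, which is what the 8,388,608 number implies):

```python
CAP_MEGABITS = 8_388_608                   # 1 TB = 2**40 bytes = 2**23 megabits

def hours_of_streaming(megabits_per_second):
    seconds = CAP_MEGABITS / megabits_per_second
    return seconds / 3600

print(round(hours_of_streaming(25), 1))    # Ultra HD at 25 Mbps -> 93.2 hours
print(round(hours_of_streaming(5), 1))     # HD at 5 Mbps -> 466.0 hours (~19.4 days)
```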
Back when I was on cheap and cheerful capped internet, if I went over the cap (a paltry 100 GB, although this was the best part of a decade ago now...), I'd pay an extra £6 on my bill and that was it, regardless of how much I went over by.
Ironically, the price to remove the data cap as part of your package was more expensive than the over-cap charge.
For me the choice is cable or dsl... and dsl is pretty painfully slow, and the carrier is much less responsive to issues. Fortunately my cable provider doesn't charge much more for a business connection ($140 vs $90, compared to comcast charging 3x as much). Cox tends to respond to technical issues within hours (onsite) and not days, and I don't have any cap issues. Anything gray (tv torrents) I now use a seedbox or vpn for.
In short, their approach is: instead of connecting point to point and addressing hosts, what if we could address the data we want?
This makes routers aware of what data they are forwarding, and that knowledge lets them cache and reuse the same data packets when multiple people request the same data.
Things like multicasting or multi-path forwarding become simpler. Interestingly, the more people are viewing the same video, for example, the better; essentially a CDN is no longer needed.
Most of the ways we currently use the Internet are actually easier to do this way.
What's harder (though not impossible) is point-to-point communication, such as SSH.
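A very rough sketch of the router-side idea, loosely inspired by NDN-style forwarding but not any real implementation; requests are keyed by content name instead of destination address:

```python
# Hypothetical name-addressed router: because it sees *which data* is being
# requested, it can answer repeated requests from its own content store
# instead of forwarding every one of them upstream.
class NamedDataRouter:
    def __init__(self, forward_upstream):
        self.content_store = {}              # content name -> data packet
        self.forward_upstream = forward_upstream

    def on_interest(self, name):
        if name in self.content_store:       # e.g. the second viewer of the same video
            return self.content_store[name]
        data = self.forward_upstream(name)   # ask the next router / the producer
        self.content_store[name] = data      # cache for the next requester
        return data
```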
Actually, what could be done without "rebooting" the whole internet is to use caching relays, which would operate just like DNS servers do, but caching content instead of IPs (maybe that's what they're suggesting; it was unclear from the general description).
EDIT: but I wonder if I'm really ready to deal, as a webdev, with content propagation based on TTLs, like we do with DNS zones :D
EDIT 2: actually, this is a non-issue. TTLs exist because when you're caching something as short as an IP string, you don't want to have to issue a request to the parent node every time. But when caching assets, it wouldn't be much of a problem to issue a HEAD HTTP request to the asset's owner to ask whether the cached content should be invalidated.
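As a sketch, such a relay could reuse standard conditional requests (ETag / If-None-Match) for that revalidation instead of a TTL. The `requests` library calls are real; the relay itself is hypothetical:

```python
import requests

# Hypothetical caching relay: keep assets cached indefinitely, but revalidate
# each hit against the origin with a cheap conditional HEAD request.
cache = {}  # url -> (etag, body)

def get_asset(url):
    if url in cache:
        etag, body = cache[url]
        head = requests.head(url, headers={"If-None-Match": etag})
        if head.status_code == 304:          # origin says our copy is still fresh
            return body
    resp = requests.get(url)                 # (re)fetch and update the cache
    resp.raise_for_status()
    cache[url] = (resp.headers.get("ETag", ""), resp.content)
    return resp.content
```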