Cloudflare is experiencing failures in its connections to hosts

dang · on Aug 30, 2020

Most comments moved to https://news.ycombinator.com/item?id=24322861.

jgrahamc · on Aug 30, 2020

This isn't a Cloudflare-specific issue. Level 3/CenturyLink are in trouble. Affecting other providers (see, for example, Fastly's Status page: https://status.fastly.com/).

@dang or another mod, would be better to link to https://puck.nether.net/pipermail/outages/2020-August/013187... as this isn't a Cloudflare issue.

1f60c · on Aug 30, 2020

The Level3 outage thread is here: https://news.ycombinator.com/item?id=24322861

J_tt · on Aug 30, 2020

Looks like it's back:

"The IP NOC with the assistance of the Operations Engineering team confirmed a routing issue to be preventing BGP sessions from establishing correctly. A configuration adjustment was deployed at a high level, and sessions began to re-establish with stability. As the change propagates through the affected devices, service affecting alarms continue to clear"

reimertz · on Aug 30, 2020

Title needs to be updated. Issues are not limited to Cloudflare; they also affect CenturyLink and Level3.

_qulr · on Aug 30, 2020

Brilliant idea to recentralize the decentralized internet.

Edit: With more information coming in, I don't think the current issue is specific to Cloudflare. FWIW I'm having zero issues in mid-US.

interrupt_ · on Aug 30, 2020

Mandatory comment on Cloudflare discussions, right?

cblconfederate · on Aug 30, 2020

Shouldn't it be?

easton · on Aug 30, 2020

It’s a point, but it’s repeated ad nauseam in every thread. It’s the nature of having CDNs or cloud services in the first place. If you want to outsource your uptime, you sacrifice certain freedoms you had in exchange for hypothetically lower costs for operations.

cblconfederate · on Aug 30, 2020

A lot of people feel that the outsourcing was unnecessary for a lot of low-traffic sites, and was mainly the result of marketing pushing Cloudflare to everyone. It doesn't make sense that some tiny blogs go down when there's a Cloudflare outage. And it creates side effects too: e.g. Cloudflare's anti-spam checks make it nearly impossible to create a functioning link fetcher/previewer unless you're a big enough site to ask for a manual exception.

shim__ · on Aug 30, 2020

Unless you plan to post links to your tiny blog on reddit

coldtea · on Aug 30, 2020

Well, for most smaller websites, using something like Cloudflare makes them more decentralized, not less, and more easily (e.g. compared with relying on their own server as a single point of failure).

waheoo · on Aug 30, 2020

I don't really know why we need to discuss this now but the point has nothing to do with this.

Small sites don't really need that decentralization. Sure, it's nice that it's easy, but this isn't the problem.

The problem is that in aggregate you end up with an internet with a single point of failure.

And even if it isn't a big problem as it is now, who's to say that one day it won't be a huge problem?

coldtea · on Aug 31, 2020

>The problem is that in aggregate you end up with an internet with a single point of failure.

And the other problem, which I'm pointing at, is that without using something like Cloudflare, you end up giving yourself more points of failure (more pressure, DDoS, lack of easy load balancing, more costs and devops to implement those yourself, etc).

And each site doesn't care if 10000 others go down together -- if anything that's good, if their competitors go down for a while. They care for their own status...

>And even if it isn't a big problem as it is now, who's to say that one day it won't be a huge problem?

I'd say periodic mass failures should inform our usage and dependence patterns of the internet so that we're not 100% dependent on it 24/7, in which case sites going down together never becomes "a huge problem".

In other words, one way to never have it be a huge problem is to make the internet perfectly decentralized (which is impossible anyway: first because the sites people care about follow a power law distribution, e.g. Google, MS, Amazon, stores, app stores, etc., so if Google goes down there's a disruption to billions of people, even if millions of lesser sites that far fewer people care about are up; and second because critical infrastructure is shared, e.g. undersea cables, etc.).

The other way to never have it be a huge problem is to learn and adapt to situations when sites might be down, and build resilient alternative ways of operation (analogue, if need be).

zhte415 · on Aug 30, 2020

1000000 smaller websites on CloudFlare sounds like a whole lot more centralised than 1000 smaller independent solutions.

k__ · on Aug 30, 2020

Aren't their services decentralized?

McDyver · on Aug 30, 2020

I think the parent means centralizing in one single company

eskaytwo · on Aug 30, 2020

But if the issue is with transit (as it appears to be) and Cloudflare peer with multiple transit providers - how is it worse? Of course in other circumstances a single dominant player may be an issue - but it doesn’t look to be the issue here - and if anything their level of peering would allow faster recovery than other smaller content hosts (used in the loosest sense) which may have less peering.

eV6ahne6bei · on Aug 30, 2020

Centralizing connectivity over cloudflare makes the impact worse.

Furthermore, centralizing services over a handful of cloud providers only encourages reliance on a few large carriers.

waheoo · on Aug 30, 2020

But seemingly contrarian views get upvotes.

johannes1234321 · on Aug 30, 2020

What is better: If only I go down and my competitors are still up, or if everybody goes down? ;)

cblconfederate · on Aug 30, 2020

If your competitors go down and you're still up.