Zoom.us is down

maest · on June 21, 2022

https://status.zoom.us is green, even though service is down/severely degraded.

juice_bus · on June 21, 2022

What use are status pages when you need to get VP+ clearance to acknowledge an outage? We see this time and time again with major services.

It is a shame that we have to find out about outages on HN.

toomuchtodo · on June 21, 2022

> What use are status pages

Marketing and sales. Not /s

We are in dire need of something crowdsourced, or where someone like DataDog or other telemetry systems offer you the ability to share non sensitive metrics publicly for various cloud or SaaS systems that they publish.

Edit: y’all are amazing with these monitoring tools!

CoastalCoder · on June 21, 2022

Does downdetector.com meet that need?

At least for Steam, I've found it pretty useful.

toomuchtodo · on June 21, 2022

For public facing endpoints or front ends, yes. For complex systems where you’d need sensors inside (AWS, GCP, anything IaaS or PaaS, etc), no.

Somewhat but not entirely similar to BGP looking glass systems.

geocrasher · on June 21, 2022

You mean like https://mcbroken.com/ ?

WarOnPrivacy · on June 21, 2022

Oooooo. OpenStreetMap has clearly visible county lines. I'm in lust.

i67vw3 · on June 21, 2022

Downdetector was itself down today for half an hour when the Cloudflare incident happened.

lamontcg · on June 21, 2022

> Marketing and sales. Not /s

When I got hired at Amazon in 2001 we had a "gonefishin" page that was a static page that would be served in the event of an outage (this was before status pages, but it was kind of the same thing -- public acknowledgement of a major incident). The standard protocol was within minutes of a sev 1 to make a decision to display the GF page once it was confirmed that the whole site was down and then work to fix the issue.

By the time I left in 2006 that was no longer policy: reporters had set up monitoring for that page to detect outages and report on service availability, so they just let the site crash and return 500s or whatever the failure mode was. In other words, optimize for making the job of external agencies reporting on their availability harder instead of easier.

mbesto · on June 21, 2022

> We are in dire need of something crowdsourced, or where someone like DataDog or other telemetry systems offer you the ability to share non sensitive metrics publicly for various cloud or SaaS systems that they publish.

This is literally what I'm building right now. See reply above: https://news.ycombinator.com/item?id=31825239

Shoot me an email if anyone is interested in getting beta access.

austinpena · on June 21, 2022

Built something like that at https://taloflow.ai/is-aws-down

Looks like there have been some errors too

cmccart · on June 21, 2022

New Relic has had this for a few years now for a couple hundred of the most requested domains: https://docs.newrelic.com/docs/query-your-data/explore-query...

Disclaimer - I work at New Relic but not on this.

jmartens · on June 21, 2022

That's exactly what we are building at https://metrist.io/

jabroni_salad · on June 21, 2022

I use uptimerobot to monitor a lot of endpoints that I depend upon but don't really control. Been burned by first party status pages way too many times.

philote · on June 21, 2022

I guess one could argue it isn't an outage since it only seems to have affected a subset of users. I got on a zoom call when this issue started and we had 3 of the 4 participants. Only one couldn't connect due to the issue.

But I do agree they should be able to monitor things better and show some sort of update on their status page as soon as possible.

Wowfunhappy · on June 21, 2022

I feel like this is almost worse. It would be awful if you were the only person who couldn't connect to a high-stakes meeting. At least if it happens to everyone, it's obvious that the problem is on Zoom's end.

CaptainZapp · on June 21, 2022

No problem for us.

We use Skype for Business, which is so flaky at times that the default assumption if somebody is not joining is that the system conked on her.

danachow · on June 21, 2022

If you stake your life’s happiness on pleasing morons (and the morons in this case are those that pretty much don’t immediately assume technical problems out of your control) - you’re pretty much guaranteed a bad time.

Wowfunhappy · on June 21, 2022

A couple of months ago, I finally landed a first-round job interview at a place where I've wanted to work for several years. The interview was conducted over Zoom.

What would have happened if Zoom had worked fine on their end, but I was randomly unable to connect? Perhaps it would have been fine—they would have been understanding, and we would have rescheduled for another day. Perhaps if they hadn't been understanding, I shouldn't have wanted to work for them anyway.

But, I don't know. I wanted to work for them, and I was competing with other candidates who presumably interviewed on different days. Hiring processes are inherently imperfect, and lots of things can be consciously or unconsciously treated as a red flag.

(And yes, lots of other things could have happened on the day of the interview. But I still find this scenario particularly scary to think about.)

danachow · on June 21, 2022

> Hiring processes are inherently imperfect, and lots of things can be consciously or unconsciously treated as a red flag.

Exactly, so it’s weird to worry about a Zoom problem in particular. If anything it’s a little better now, since most people are conditioned to think of technical problems as less likely to be the affected person’s fault (that’s why I referred to the alternative as “morons”). Even if you left yourself plenty of time and did everything right, and public transit fucked you over, it was never a good look.

mkl95 · on June 21, 2022

Whether you consider it an outage or not seems to be a political / PR thing these days. I used to work on a SaaS that relied on a handful of big customers to make payroll. If their favourite stuff stopped working, hell ensued. On the other hand Atlassian pretended nothing was going on for a while recently, because they could afford to lose 400 customers.

reaperducer · on June 21, 2022

> I guess one could argue it isn't an outage since it only seems to have affected a subset of users

If an electric company serving a million people leaves 100 of them in the dark, it's still an "outage."

Why give a free pass to Zoom? Because it's a tech company, and we've been trained to accept failures as the cost of admission?

interestica · on June 21, 2022

So for you it was only a 25% outage?

no_wizard · on June 21, 2022

This has made me realize why companies like pingdom have a business. I've always wondered, in the sense that I couldn't quite understand why you'd pay for someone just to ping things and alert you of outages (this was early in my career)

But over the last 4 years specifically, I not only understand it, I can't imagine not having a service like it.

Disclaimer: I don't work for pingdom and my current company doesn't use their services, I have in the past, they're pretty good, but I'm just using them as an example here

jmartens · on June 21, 2022

If you like pingdom, you'll love what we are building at Metrist https://metrist.io/

mbesto · on June 21, 2022

I'm actually building a product to solve this. If anyone is interested in beta testing, we should be rolling this out in 2~3 weeks. Shoot me an email: mbesto @ gmail service

gjsman-1000 · on June 21, 2022

Is it so hard? At my company, I set up a status page linked to a pinging service which automatically pings various endpoints every 5 minutes (as well as our 3rd-party dependencies) and automatically flags any problems if a ping does not respond.

The pinging service and status page, at our scale, are free, and our status page is actually useful and automated for 90% of our stuff.
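
For illustration, the kind of check described above might look roughly like the sketch below. It is not the commenter's actual setup: the endpoint names and URLs are hypothetical, and only Python's standard library is assumed.

```python
import urllib.error
import urllib.request

# Hypothetical endpoints; replace with your own services and 3rd-party dependencies.
ENDPOINTS = {
    "website": "https://www.example.com/",
    "api": "https://api.example.com/health",
}


def probe(url, timeout=5.0):
    """Return True if the URL answers with a 2xx/3xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # HTTP 4xx/5xx, DNS failures, timeouts and malformed URLs all count as down.
        return False


def summarize(endpoints, probe_fn=probe):
    """Map each endpoint name to 'up' or 'down' using probe_fn."""
    return {name: ("up" if probe_fn(url) else "down") for name, url in endpoints.items()}
```

Calling `summarize(ENDPOINTS)` from a scheduler every 5 minutes and publishing the resulting dictionary would give a basic automated status page. As the replies point out, this only shows that an endpoint answers, not that the service behind it works.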

cortesoft · on June 21, 2022

That works for a website, but doesn’t work for all outage types with a service like zoom. If the website/api is responding, but you can’t create new calls, for example, the outage wouldn’t be detected.

phaer · on June 21, 2022

I think the problem might be less on the technical and more on the business-side of things.

Status pages that raise customers confidence in your service are good from a marketing perspective.

Automatically publishing uptime data without human review might be bad from a marketing perspective, if you don't trust the engineering department to actually deliver or if your service depends on too many external dependencies.

treeman79 · on June 21, 2022

I’ve come across a lot of websites that respond to health checks just fine, because the page health check doesn’t hit the database or other services.
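
A minimal sketch of the difference, assuming each dependency can be exercised by a small callable that raises on failure (the function names and the shape of the result are made up for illustration):

```python
def shallow_health():
    """Always reports healthy: only proves the web process can answer a request."""
    return {"status": "ok"}


def deep_health(checks):
    """Run one check per dependency and report 'degraded' if any of them fails.

    `checks` maps a dependency name to a zero-argument callable that raises
    an exception when the dependency is unavailable.
    """
    results = {}
    for name, check in checks.items():
        try:
            check()
            results[name] = "ok"
        except Exception as exc:  # any failure marks the dependency as failed
            results[name] = f"failed: {exc}"
    status = "ok" if all(value == "ok" for value in results.values()) else "degraded"
    return {"status": status, "dependencies": results}
```

A check here would typically run something cheap, such as a `SELECT 1` against the database. An external monitor pointed at the shallow version keeps seeing "ok" while the database is down, which is the situation described above.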

_fjb4 · on June 21, 2022

I do this at home using LMNS, very useful to detect service failures as well as latency spikes. I ping my upstream ISP router, Google, Cloudflare, my DNS provider, and several others.

Note that as another commenter said this only tells if the server is up, not if the service is working properly or not. In this case it won't work since the Zoom website is loading fine but the meetings don't work properly.

mason55 · on June 21, 2022

The update lag has nothing to do with technical abilities. It's about things like marketing and contractual SLAs.

If acknowledging downtime causes you to violate an SLA and pay a bunch of penalties then you don't want some automated script to trigger it.

gjsman-1000 · on June 21, 2022

> violate an SLA and pay a bunch of penalties

The SLA would not depend on what the status page reported, but the actual downtime. If the script malfunctioned, you wouldn't need to pay out because it wasn't actual downtime contractually - and if it was actual downtime, I guess it makes it harder to squeeze around but that's only if someone is carefully taking legally admissible evidence from the first minute the status page reports (which screenshots alone don't always meet).

kull · on June 21, 2022

Service status pages are so useless.

lxchase · on June 21, 2022

At this point, I just default to downdetector

WarOnPrivacy · on June 21, 2022

It's for nostalgia - the one bit of early web that's still unchanged.

boplicity · on June 21, 2022

We're hosting a live event in 3 hours -- expecting 1,000 people to show up. Hopefully things will be working by then!

ta988 · on June 21, 2022

You have 3 hours to set up an alternative. That's doable if you have the emails.

wantguns · on June 21, 2022

https://meet.element.io/randommeetname

whoomp12342 · on June 21, 2022

but a lot of embarrassing noise if it would work just fine in 3 hours. tough call. I suggest using an IRC channel for the event instead.

freemint · on June 21, 2022

Please what? Who is your audience that more than 20% of them could reliably run an IRC client and join the right server?

JellyBeanThief · on June 21, 2022

I believe that was a joke.

gberger · on June 21, 2022

It's even more embarrassing if it ends up not working. Better safe than sorry.

whoomp12342 · on June 21, 2022

wooooooooooooooooooooooosh

danuker · on June 21, 2022

You can host a video stream (but make it low-def; 1000 people x 1 Mbps = 1 Gbps)

https://obsproject.com/

Alternatively use a centralized service (YouTube, Twitch).

Or, there's also the P2P Jami (but this requires the participants to install something):

https://jami.net/
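
The bandwidth estimate at the top of this comment can be reproduced with a short calculation. This is a rough sketch that assumes every viewer pulls one stream directly from the host at a constant bitrate, and uses decimal units (1 GB = 1000 MB); the function names are illustrative.

```python
def egress_mbps(viewers, bitrate_mbps):
    """Total outbound bandwidth in megabits per second for a self-hosted stream."""
    return viewers * bitrate_mbps


def transfer_gb(viewers, bitrate_mbps, minutes):
    """Total data sent over the event, in gigabytes (decimal units)."""
    megabits = egress_mbps(viewers, bitrate_mbps) * minutes * 60
    return megabits / 8 / 1000  # megabits -> megabytes -> gigabytes


print(egress_mbps(1000, 1.0))      # 1000.0 Mbps, i.e. about 1 Gbps
print(transfer_gb(1000, 1.0, 60))  # 450.0 GB for a one-hour event
```

Under these assumptions, the uplink rather than the encoding is the limiting factor for self-hosting, which is why the centralized services are listed as the alternative.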

kypro · on June 21, 2022

What's the quick alternative out of interest? And is it free?

xigoi · on June 21, 2022

I'd say Jitsi, though I'm not sure how it handles 1000 people.

bzxcvbn · on June 21, 2022

It doesn't.

freemint · on June 21, 2022

It can. If the Jitsi deployment has proper videobridges set up, it can.

DoctorOW · on June 21, 2022

My day job is doing the same size event over Zoom at the same time...

What a weird coincidence.

AlecSchueler · on June 21, 2022

Hooray for centralisation of infrastructure!

jeromegv · on June 21, 2022

Not the best example, as there are likely a dozen services someone could switch to. People choose Zoom mostly because they like it, not for lack of choice.

xigoi · on June 21, 2022

It's the host that chooses the platform, participants don't get to choose.

whoomp12342 · on June 21, 2022

well there should be more competitors I actually like.

geraldwhen · on June 21, 2022

Businesses aren’t willing to pay, and thus they don’t exist.

noirbot · on June 21, 2022

They aren't? Every company I've worked at pays through the nose for one of GTM/Zoom/BlueJeans/Lifesize/Chime. Zoom is definitely the least bad of them in my experience, not that I like it at all. Lack of customers willing to pay doesn't seem to have stopped there from being a bounty of terrible alternatives.

practice9 · on June 21, 2022

I thought WebRTC solved some of these problems, but seems like peer discovery would still need to be centralised in some way. Are there any protocols that would solve this? (excluding email or blockchain both of which would require additional extensions or standards to create a solution with smooth UX)

avrionov · on June 21, 2022

Unfortunately WebRTC provides only part of the features needed to build a web conferencing app. The main issue is that p2p connections don't work for more than 5 attendees in the same session. After that it is better to have an MCU or SFU.

Other features like recording are much more reliable if they are implemented server side.
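
A rough way to see why a full p2p mesh stops scaling is to count streams per participant. The sketch below assumes each client sends one separate copy of its media to every peer in a mesh, and exactly one copy to the server with an SFU; it ignores simulcast and other optimizations, and the cutoff of 5 is the commenter's rule of thumb, not something the code derives.

```python
def mesh_load(n):
    """Connection and stream counts for a full p2p mesh of n participants."""
    return {
        "peer_connections": n * (n - 1) // 2,  # one connection per pair
        "uploads_per_client": n - 1,           # a separate copy for every peer
        "downloads_per_client": n - 1,
    }


def sfu_load(n):
    """Counts when every participant connects to a single forwarding server."""
    return {
        "peer_connections": n,         # one connection per client to the SFU
        "uploads_per_client": 1,       # the server fans the stream out
        "downloads_per_client": n - 1,
    }


for n in (3, 5, 10):
    print(n, mesh_load(n), sfu_load(n))
```

In the mesh, upload bandwidth per client grows with the number of participants, and home uplinks are usually the scarce resource. With an SFU the upload cost stays constant and the fan-out moves to the server.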

whoomp12342 · on June 21, 2022

Isn't it funny how we love decentralization in the cloud, but that really just consolidates our resources with 3 main companies?

Like, yes, I get it, multi-region makes you safer, but given vendor lock-in, if AWS fucks up across the board with a bug, we all fuck up across the board.

wdb · on June 21, 2022

Going to be a productive day :)

brink · on June 21, 2022

Probably not; teams will spend more time trying to use an alternative

tiffanyh · on June 21, 2022

I've been on nonstop zoom calls all morning (last 5 hours), no issues.

xwowsersx · on June 21, 2022

Sorry to hear. We're here for you.

bastardoperator · on June 21, 2022

Same, not seeing any issues.

bombcar · on June 21, 2022

Hmm going to https://us02web.zoom.us/join and clicking a join link directly worked fine 30 minutes ago.

lndevops · on June 21, 2022

Status pages lie to us/are generally inaccurate for a bunch of reasons. Here's just a few of them: https://metrist.io/blog/why-status-pages-are-lying-to-you-an...

lachtan · on June 21, 2022

Same for us. We're hosting 3 events right now, and none of them seems to be working, nor can we log in to their website.

morpheuskafka · on June 21, 2022

At least one Zoom Enterprise splash page is showing the generic nginx error message, which initially made me think it was just our tenant since the other subdomains I tried would load the login screen. But sounds like that is not the case.

flatiron · on June 21, 2022

I'm on a Zoom right now, works fine....

bombcar · on June 21, 2022

Don't close the bridge! I've had them continue to work as long as you didn't close it when there's an authentication, etc error.

duxup · on June 21, 2022

>Don't close the bridge!

Unless you don't want there to be a meeting(s)... just say'n.

bombcar · on June 21, 2022

Your lack of dedication to Sparkle Meetings has been noted.

dangus · on June 21, 2022

Close the bridge, then when it doesn't work you have a valid excuse not to work.

vaylian · on June 21, 2022

or a valid excuse to actually get work done instead of wasting time in meetings.

dangus · on June 21, 2022

In the employer employee relationship, the employer is responsible for providing functioning tools.

Maintaining productivity when those tools cease to function encourages the employer to underinvest in tools and burden the employee.

boplicity · on June 21, 2022

Maybe it's just the website, login, links, etc that are broken? I can't log in and can't open Zoom links.

300bps · on June 21, 2022

I'm on my fourth Zoom for the past 3 hours and have not had a single problem at all today.

dancemethis · on June 21, 2022

Hopefully forever, one less privacy-hostile proprietary thingie around.

slenk · on June 21, 2022

Because the other corporate solutions are better? I will take Zoom over Teams/Google

ChrisMarshallNY · on June 21, 2022

A lot of folks I know, like BlueJeans[0].

I have not used it, myself, so I can't report on its effectiveness.

[0] https://www.bluejeans.com

dijonman2 · on June 21, 2022

I think Google is a better option, especially if you have corp Gmail. I am not a fan of O365/Teams, so I do agree with you there. My grievances are just rooted in UX.