Zoom.us is down (zoom.us)
146 points by rkwasny on June 21, 2022 | 92 comments



https://status.zoom.us is green, even though service is down/severely degraded.


What use are status pages when you need to get VP+ clearance to acknowledge an outage? We see this time and time again with major services.

It is a shame that we have to find out about outages on HN.


> What use are status pages

Marketing and sales. Not /s

We are in dire need of something crowdsourced, or where someone like DataDog or other telemetry systems offer you the ability to share non sensitive metrics publicly for various cloud or SaaS systems that they publish.

Edit: y’all are amazing with these monitoring tools!


Does downdetector.com meet that need?

At least for Steam, I've found it pretty useful.


For public facing endpoints or front ends, yes. For complex systems where you’d need sensors inside (AWS, GCP, anything IaaS or PaaS, etc), no.

Somewhat but not entirely similar to BGP looking glass systems.


You mean like https://mcbroken.com/ ?


Oooooo. OpenStreetMap has clearly visible county lines. I'm in lust.


Downdetector was itself down today for half an hour when the cloudflare incident happened.


> Marketing and sales. Not /s

When I got hired at Amazon in 2001 we had a "gonefishin" page that was a static page that would be served in the event of an outage (this was before status pages, but it was kind of the same thing -- public acknowledgement of a major incident). The standard protocol was within minutes of a sev 1 to make a decision to display the GF page once it was confirmed that the whole site was down and then work to fix the issue.

By the time I left in 2006 that was no longer policy, since reporters had set up monitoring for that page to detect outages and report on service availability, so they just let it crash and return 500s or whatever the failure mode was. The goal had become making the job of external agencies reporting on their availability harder instead of easier.


> We are in dire need of something crowdsourced, or where someone like DataDog or other telemetry systems offer you the ability to share non sensitive metrics publicly for various cloud or SaaS systems that they publish.

This is literally what I'm building right now. See reply above: https://news.ycombinator.com/item?id=31825239

Shoot me an email if anyone is interested in getting beta access.


Built something like that at https://taloflow.ai/is-aws-down

Looks like there have been some errors too


New Relic has had this for a few years now for a couple hundred of the most requested domains: https://docs.newrelic.com/docs/query-your-data/explore-query...

Disclaimer - I work at New Relic but not on this.


That's exactly what we are building at https://metrist.io/


I use uptimerobot to monitor a lot of endpoints that I depend upon but don't really control. Been burned by first party status pages way too many times.


I guess one could argue it isn't an outage since it only seems to have affected a subset of users. I got on a zoom call when this issue started and we had 3 of the 4 participants. Only one couldn't connect due to the issue.

But I do agree they should be able to monitor things better and show some sort of update on their status page as soon as possible.


I feel like this is almost worse. It would be awful if you were the only person who couldn't connect to a high-stakes meeting. At least if it happens to everyone, it's obvious that the problem is on Zoom's end.


No problem for us.

We use Skype for Business, which is so flaky at times that the default assumption if somebody is not joining is that the system conked on her.


If you stake your life’s happiness on pleasing morons (and the morons in this case are those that pretty much don’t immediately assume technical problems out of your control) - you’re pretty much guaranteed a bad time.


A couple of months ago, I finally landed a first-round job interview at a place where I've wanted to work for several years. The interview was conducted over Zoom.

What would have happened if Zoom had worked fine on their end, but I was randomly unable to connect? Perhaps it would have been fine—they would have been understanding, and we would have rescheduled for another day. Perhaps if they hadn't been understanding, I shouldn't have wanted to work for them anyway.

But, I don't know. I wanted to work for them, and I was competing with other candidates who presumably interviewed on different days. Hiring processes are inherently imperfect, and lots of things can be consciously or unconsciously treated as a red flag.

(And yes, lots of other things could have happened on the day of the interview. But I still find this scenario particularly scary to think about.)


> Hiring processes are inherently imperfect, and lots of things can be consciously or unconsciously treated as a red flag.

Exactly, so it’s weird to worry about a Zoom problem in particular. If anything it’s a little better now since most people are conditioned to think of technical problems as less likely the affected person's fault (that’s why I referred to the alternative as “morons”) - even if you left yourself plenty of time and did everything right and public transit fucked you over it was never a good look.


Whether you consider it an outage or not seems to be a political / PR thing these days. I used to work on a SaaS that relied on a handful of big customers to make payroll. If their favourite stuff stopped working, hell ensued. On the other hand Atlassian pretended nothing was going on for a while recently, because they could afford to lose 400 customers.


> I guess one could argue it isn't an outage since it only seems to have affected a subset of users

If an electric company serving a million people leaves 100 of them in the dark, it's still an "outage."

Why give a free pass to Zoom? Because it's a tech company, and we've been trained to accept failures as the cost of admission?


So for you it was only a 25% outage?


This has made me realize why companies like pingdom have a business. I've always wondered, in the sense that I couldn't quite understand why you'd pay for someone just to ping things and alert you of outages (this was early in my career)

But over the last 4 years specifically I not only understand it I can't imagine not having a service like it.

Disclaimer: I don't work for pingdom and my current company doesn't use their services, I have in the past, they're pretty good, but I'm just using them as an example here


If you like pingdom, you'll love what we are building at Metrist https://metrist.io/


I'm actually building a product to solve this. If anyone is interested in beta testing, we should be rolling this out in 2~3 weeks. Shoot me an email: mbesto @ gmail service


Is it so hard? At my company, I set up a status page linked to a pinging service which automatically pings various endpoints every 5 minutes (as well as our 3rd-party dependencies) and automatically flags any problems if a ping does not respond.

The pinging service and status page, at our scale, is free and our status page is actually useful and automated for 90% of our stuff.
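
For anyone curious what that kind of poller looks like, here is a minimal sketch in Python. It is not the setup described above; the function names, the example URLs in the usage note, and the "2xx/3xx means up" rule are all assumptions made for illustration. A scheduler (cron, for example) would call it every 5 minutes.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx/5xx).
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def check_endpoints(urls, fetch=http_status):
    """Map each URL to "up" or "down" (assumed rule: 2xx/3xx counts as up)."""
    results = {}
    for url in urls:
        code = fetch(url)
        is_up = code is not None and 200 <= code < 400
        results[url] = "up" if is_up else "down"
    return results
```

Calling `check_endpoints(["https://example.com/health"])` returns a dict that a status page could render directly. The `fetch` parameter is there so the logic can be exercised without touching the network.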


That works for a website, but doesn’t work for all outage types with a service like zoom. If the website/api is responding, but you can’t create new calls, for example, the outage wouldn’t be detected.


I think the problem might be less on the technical and more on the business-side of things.

Status pages that raise customers confidence in your service are good from a marketing perspective.

Automatically publishing uptime data without human review might be bad from a marketing perspective, if you don't trust the engineering department to actually deliver or if your service depends on too many external dependencies.


I’ve come across a lot of websites that respond to health checks just fine. Because the page health check doesn’t hit the database or other services.
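
A rough sketch of the difference, in Python. The names `shallow_health`, `deep_health`, and `make_db_probe` are made up for this example, and SQLite stands in for whatever database a real service would use.

```python
import sqlite3


def shallow_health():
    """Only proves the web process is answering; touches no dependencies."""
    return {"status": "ok"}


def deep_health(probes):
    """Run every dependency probe and report which ones failed."""
    failures = {}
    for name, probe in probes.items():
        try:
            probe()
        except Exception as exc:  # any probe error marks that dependency failed
            failures[name] = str(exc)
    status = "ok" if not failures else "degraded"
    return {"status": status, "failures": failures}


def make_db_probe(conn):
    """Build a probe that runs a trivial query on the given connection."""
    def probe():
        conn.execute("SELECT 1").fetchone()
    return probe
```

With a working connection `deep_health({"db": make_db_probe(conn)})` reports `ok`, while `shallow_health()` keeps reporting `ok` even after the database is gone, which is the failure mode described above.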


I do this at home using LMNS, very useful to detect service failures as well as latency spikes. I ping my upstream ISP router, Google, Cloudflare, my DNS provider, and several others.

Note that as another commenter said this only tells if the server is up, not if the service is working properly or not. In this case it won't work since the Zoom website is loading fine but the meetings don't work properly.


The update lag has nothing to do with technical abilities. It's about things like marketing and contractual SLAs.

If acknowledging downtime causes you to violate an SLA and pay a bunch of penalties then you don't want some automated script to trigger it.


> violate an SLA and pay a bunch of penalties

The SLA would not depend on what the status page reported, but the actual downtime. If the script malfunctioned, you wouldn't need to pay out because it wasn't actual downtime contractually - and if it was actual downtime, I guess it makes it harder to squeeze around but that's only if someone is carefully taking legally admissible evidence from the first minute the status page reports (which screenshots alone don't always meet).


Service status pages are so useless.


At this point, I just default to downdetector


It's for nostalgia - the one bit of early web that's still unchanged.


We're hosting a live event in 3 hours -- expecting 1,000 people to show up. Hopefully things will be working by then!


You have 3 hours to set up an alternative. That's doable if you have the emails.



but a lot of embarrassing noise if it would work just fine in 3 hours. tough call. I suggest using an IRC channel for the event instead.


Please what? Who is your audience that more than 20% of them could reliably run an IRC client and join the right server?


I believe that was a joke.


It's even more embarrassing if it ends up not working. Better safe than sorry.


wooooooooooooooooooooooosh


You can host a video stream (but make it low-def; 1000 people x 1 Mbps = 1 Gbps)

https://obsproject.com/

Alternatively use a centralized service (YouTube, Twitch).

Or, there's also the P2P Jami (but this requires the participants to install something):

https://jami.net/
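
The bandwidth estimate above (1,000 viewers at 1 Mbps each) can be checked with a few lines of Python. This is back-of-the-envelope only: it assumes a single origin serving every viewer directly, a constant bitrate, and decimal units, and the function names are invented for the example.

```python
def egress_mbps(viewers, bitrate_mbps):
    """Total outbound bandwidth if one origin serves every viewer directly."""
    return viewers * bitrate_mbps


def gigabytes_per_hour(total_mbps):
    """Convert a sustained rate in megabits/s to gigabytes transferred per hour."""
    megabits = total_mbps * 3600   # megabits sent in one hour
    megabytes = megabits / 8       # 8 bits per byte
    return megabytes / 1000        # decimal gigabytes


total = egress_mbps(1000, 1.0)     # 1000.0 Mbps, i.e. 1 Gbps
hourly = gigabytes_per_hour(total) # 450.0 GB per hour
```

So a self-hosted stream at that size needs roughly a 1 Gbps uplink and moves about 450 GB per hour, which is why the centralized services are the easier option.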


What's the quick alternative out of interest? And is it free?


I'd say Jitsi, though I'm not sure how it handles 1000 people.


It doesn't.


It can. If the Jitsi deployment has proper videobridges set up, it can.


My day job is doing the same size event over Zoom at the same time...

What a weird coincidence.


Hooray for centralisation of infrastructure!


Not the best example as there's likely a dozen services someone could switch to. People choose Zoom mostly because they like it, not for lack of choice.


It's the host that chooses the platform, participants don't get to choose.


well there should be more competitors I actually like.


Businesses aren’t willing to pay, and thus they don’t exist.


They aren't? Every company I've worked at pays through the nose for one of GTM/Zoom/BlueJeans/Lifesize/Chime. Zoom is definitely the least bad of them in my experience, not that I like it at all. Lack of customers willing to pay doesn't seem to have stopped there from being a bounty of terrible alternatives.


I thought WebRTC solved some of these problems, but seems like peer discovery would still need to be centralised in some way. Are there any protocols that would solve this? (excluding email or blockchain both of which would require additional extensions or standards to create a solution with smooth UX)


Unfortunately WebRTC provides only part of the features needed to build a web conferencing app. The main issue is that p2p connections don't work for more than 5 attendees in the same session. After that it is better to have an MCU or SFU.

Other features like recording are much more reliable if they are implemented server side.
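
To put numbers on why a full p2p mesh stops scaling, here is a small Python sketch that counts streams per participant. It assumes one media stream per participant and that an SFU forwards every other participant's stream to each peer; the function names are made up for the example, and the "about 5" threshold is the estimate given above, not something this code derives.

```python
def mesh_streams(n):
    """Full mesh: each peer sends its stream to every other peer directly."""
    if n < 1:
        raise ValueError("need at least one participant")
    return {
        "uplinks_per_peer": n - 1,
        "downlinks_per_peer": n - 1,
        "total_connections": n * (n - 1) // 2,
    }


def sfu_streams(n):
    """SFU: each peer uploads once; the server fans the stream out."""
    if n < 1:
        raise ValueError("need at least one participant")
    return {
        "uplinks_per_peer": 1,
        "downlinks_per_peer": n - 1,
        "total_connections": n,  # one connection per peer to the server
    }
```

At 5 participants the mesh already needs 4 uploads per peer and 10 connections in total, while the SFU keeps every peer at a single upload. Upload bandwidth is usually the scarcer resource on home connections, so that is where the mesh breaks first.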


isn't it funny how we love decentralization in the cloud but that really just consolidates our resources to 3 main companies?

like, yes I get it, multi-region makes you safer, but given vendor lock-in, if aws fucks up across the board with a bug, we all fuck up across the board.


Going to be a productive day :)


Probably not; teams will spend more time trying to use an alternative


I've been on nonstop zoom calls all morning (last 5 hours), no issues.


Sorry to hear. We're here for you.


Same, not seeing any issues.


Hmm going to https://us02web.zoom.us/join and clicking a join link directly worked fine 30 minutes ago.


Status pages lie to us/are generally inaccurate for a bunch of reasons. Here's just a few of them: https://metrist.io/blog/why-status-pages-are-lying-to-you-an...


Same for us, we're hosting 3 events right now, none of them seems to be working, nor can we log in to their website.


At least one Zoom Enterprise splash page is showing the generic nginx error message, which initially made me think it was just our tenant since the other subdomains I tried would load the login screen. But sounds like that is not the case.


im on a zoom right now, works fine....


Don't close the bridge! I've had them continue to work as long as you didn't close it when there's an authentication, etc error.


>Don't close the bridge!

Unless you don't want there to be a meeting(s)... just say'n.


Your lack of dedication to Sparkle Meetings has been noted.


Close the bridge, then when it doesn't work you have a valid excuse not to work.


or a valid excuse to actually get work done instead of wasting time in meetings.


In the employer employee relationship, the employer is responsible for providing functioning tools.

Maintaining productivity when those tools cease to function encourages the employer to underinvest in tools and burden the employee.


Maybe it's just the website, login, links, etc that are broken? I can't log in and can't open Zoom links.


I'm on my fourth Zoom for the past 3 hours and have not had a single problem at all today.


Hopefully forever, one less privacy-hostile proprietary thingie around.


Because the other corporate solutions are better? I will take Zoom over Teams/Google


A lot of folks I know like BlueJeans[0].

I have not used it, myself, so I can't report on its effectiveness.

[0] https://www.bluejeans.com


I think Google is a better option especially if you have corp gmail. I am not a fan of o365/teams so I do agree with you there. Just my grievances are rooted in UX


Ah yes, the only three video meeting software in the world.


Jitsi and BigBlueButton are better.


Zoom is an enterprise solution, privacy is not really a problem, they don't sell ads


OP is talking about China and the CCP, I think


I think Zoom now has a separate brand for China.

I was in a Zoom call with a big Chinese company some months ago, and they used something that looked the same as zoom but had a Chinese name.

Edit:

I think it was this one:

https://www.cnbc.com/amp/2020/08/03/zoom-to-halt-direct-sale...

https://www.zhumu.com/



Zoom's Phone Service is working for us as of just now ...


Able to log in again as of 10:40am Eastern.


What a day...


so what? shit goes down once in a while. sometimes you have a cloudy day.


I can assure you I am not a cat. And I didn't crash Zoom.

Reference: https://en.wikipedia.org/wiki/Zoom_Cat_Lawyer



