LetsEncrypt Certificate Issuance Halted

ddtaylor · on Nov 25, 2021

I'm fine with the outage. LetsEncrypt overall has been great, and they should take their time fixing whatever is wrong.

can16358p · on Nov 25, 2021

Yup. Great free service. Also, in practice this would only disrupt new registrations, as there is more than enough of a time window for renewals anyway.

sebiw · on Nov 25, 2021

This is why most Let's Encrypt clients start renewal some x days before certificate expiration. 'Sall good. ;-)

SahAssar · on Nov 25, 2021

I think by default it is 30 days (at least for acme.sh), so you have renewal every two months and a month to fix any issues.
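To make the arithmetic concrete, here is a minimal sketch. It assumes a 90-day certificate lifetime (which is what "renewal every two months and a month to fix any issues" implies) and a 30-day renewal window. The constants and the function name `renewal_date` are illustrative only and are not taken from acme.sh.

```python
from datetime import datetime, timedelta

CERT_LIFETIME_DAYS = 90  # assumption: 90-day certificate lifetime
RENEW_BEFORE_DAYS = 30   # assumption: client renews 30 days before expiry


def renewal_date(issued_at: datetime) -> datetime:
    """Return the first date on which a client with this window would try to renew."""
    expires_at = issued_at + timedelta(days=CERT_LIFETIME_DAYS)
    return expires_at - timedelta(days=RENEW_BEFORE_DAYS)


issued = datetime(2021, 11, 25)
renew = renewal_date(issued)

# Days between issuance and the first renewal attempt (the "two months"),
# and the slack left to fix problems before expiry (the "one month").
days_until_renewal = (renew - issued).days
slack_days = CERT_LIFETIME_DAYS - days_until_renewal
print(days_until_renewal, slack_days)
```

With these assumed numbers, a CA outage only matters for renewals if it lasts longer than the slack.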

PikachuEXE · on Nov 25, 2021

And I generally do it every month in case there is some unexpected issue.

1 more month for me to fix stuff (or wait for fix).

melgafin · on Nov 25, 2021

acmed uses 3 weeks (28 days), seems reasonable as well.

zekica · on Nov 25, 2021

Sorry to nitpick, but 3 weeks are 21 days.

withinboredom · on Nov 25, 2021

I guess that depends on how you count. If starting from 0, it is 3, or if you don’t count the last week as a full week, it’s also 3.

yunohn · on Nov 25, 2021

I’m not sure where you’re from; I’ve lived in various continents - nobody starts counting things like “weeks” from 0.

cyberge99 · on Nov 25, 2021

Birthdays are counted from zero. When you “turn” one, you’ve completed that year.

asalahli · on Nov 25, 2021

That's because "birthdays" we celebrate are technically birthday anniversaries. Hence starting from 1 year after the actual birth day

faeyanpiraat · on Nov 25, 2021

5/7 explanation

throwaway472927 · on Nov 25, 2021

If you start weeks at 0, you should do the same for days, 3 weeks = 27 days

pxndxx · on Nov 25, 2021

A week is 7 days. Three weeks is 3x7 = 21 days.

withinboredom · on Nov 25, 2021

Also, calendars aren’t exactly the same everywhere. If you say “let’s work out sat and sun every other week”, in the US this results in a calendar where you have every other weekend off. On an EU calendar, this results in working out every week. Or something like that, it was pretty entertaining when my friend pointed out how subtly shifting the calendar to start on a different day results in a very different work out plan.

But back to counting. In the EU, if you go to the first floor, you’d call that the second floor in the US. It has one floor in the EU, but two in the US. I have no idea if this is the same thing or just a mistake. But there are some interesting assumptions in this thread!

dubcanada · on Nov 25, 2021

3 weeks is pretty well known as 21 days.

Just because elevators and other things start at 0 doesn't mean that makes sense anywhere else; it's fairly common for ground level to be 0. There is no instance where it makes sense to say 0 weeks = 7 days, 1 week = 14 days, etc. It does however make sense to say I live on the ground level (aka floor 0), I live on the 1st floor (aka second floor).

zinekeller · on Nov 25, 2021

I'll be honest, this is still better than some more 'professional' CA issuers, which sometimes just stop for a whole day. I hope that day is spent on audits, and not because their update regime doesn't support on-the-fly updates (or virtually on-the-fly ones, by having two or more signing machines).

geocrasher · on Nov 25, 2021

Lest anyone think that such issues only happen to free providers, check out Sectigo's status page:

https://sectigo.status.io/pages/history/5938a0dbef3e6af26b00...

For context, Sectigo also provides freebies for cPanel customers.

9dev · on Nov 25, 2021

They have, like, an incident every two days? That's utterly disturbing, does anyone actually pay them?

technion · on Nov 25, 2021

For political reasons we buy a tonne of their certificates. It's not uncommon that I'll paste a CSR into the interface and just get a quality error like "Unhandled Exception", which basically tells me that they fixed it. Because back when it was Comodo I used to see full page stack traces.

Xylakant · on Nov 25, 2021

The blue ones are planned maintenance and reading the contents indicates that it's mostly changes and fixes for the customer-facing UI. It may be a bit excessive to post all of these, but they certainly don't indicate any sort of problematic issue with their software.

LeoPanthera · on Nov 25, 2021

Already restarted, was unavailable for 29 minutes. At the time of writing, performance is degraded.

wyrm · on Nov 25, 2021

... for less than half an hour.

cheeze · on Nov 25, 2021

Spoke too soon? Looks like they halted again.

eloeffler · on Nov 25, 2021

Looks like they continued again

abracadaniel · on Nov 25, 2021

It still shows as degraded performance overall, so will probably be intermittent for a while. Pretty cool that they’re providing that level of transparency actually.

cpach · on Nov 25, 2021

AFAICT users of Caddy would not have been affected, since Caddy can fall back from one CA to another. Pretty clever!

https://caddyserver.com/docs/automatic-https#overview
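The general idea of falling back from one CA to another can be sketched roughly as below. This is not Caddy's implementation, just an illustration of the pattern: the `Issuer` type, `issue_with_fallback`, and the two stand-in issuer functions are all hypothetical, and nothing here talks to a real CA.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type: a function that tries to obtain a certificate from one CA
# and returns it as text, or raises an exception if the CA is unavailable.
Issuer = Callable[[str], str]


def issue_with_fallback(domain: str, issuers: Sequence[Tuple[str, Issuer]]) -> Tuple[str, str]:
    """Try each (name, issuer) pair in order; return (ca_name, certificate) from the first success."""
    errors: List[str] = []
    for name, issuer in issuers:
        try:
            return name, issuer(domain)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all CAs failed: " + "; ".join(errors))


# Stand-in issuers for illustration only.
def primary_ca(domain: str) -> str:
    raise ConnectionError("issuance halted")


def secondary_ca(domain: str) -> str:
    return f"certificate for {domain}"


ca, cert = issue_with_fallback(
    "example.com",
    [("primary", primary_ca), ("secondary", secondary_ca)],
)
print(ca)
```

The order of the list expresses priority, so the preferred CA is always tried first and the others are only contacted when it fails.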

ipiz0618 · on Nov 25, 2021

Wow the title got me worried. Luckily it's an outage not a shutdown.

bennyp101 · on Nov 25, 2021

I guess this really only affects those wanting to get new certificates for new (sub)domains.

For renewals, this is not a problem unless it's down for an extended period of time, and even then there would be time to switch providers. You should be using scheduled updates, and even if not, the email notifications come in at 20 and 10 days before expiry, so there is plenty of time to go and get it renewed.

pulse7 · on Nov 25, 2021

I like Let's Encrypt's free certificates! But I don't like centralization where failure in a centralized service may render millions of websites inaccessible... It is somehow against the spirit of the "inter-net" where many independent networks and computers are connected and work even if some fail...

tlamponi · on Nov 25, 2021

ACME is a standardized protocol (RFC8555) and there are more providers than Let's Encrypt, and you can switch transparently. That combined with the standard procedure of renewing a few weeks before the expiration date lets one handle even a total failure rather nicely.

Some other ACME providers I know of:

- ZeroSSL.com

- BuyPass.com

- SSL.com

(most of those provide free certs in some form, but some with limitations and may then ask for money if you want more features).

https://datatracker.ietf.org/doc/html/rfc8555

kouteiheika · on Nov 25, 2021

> and you can switch transparently

> ZeroSSL.com

I kinda wish people would stop recommending them. This might have changed, but last time I tried ZeroSSL (~a year ago) it was not RFC 8555 compliant (specifically section 7.3.1), and you were basically supposed to use their own proprietary API to deal with the issue. So you can't always switch transparently.

If you need an alternative use Buypass. Also free, and they're actually RFC 8555 compliant.

tlamponi · on Nov 25, 2021

FWIW: I'm not recommending anything, just listing a few providers that claim to be ACME conformant. To be specific, I took the list from acme.sh:

https://github.com/acmesh-official/acme.sh#supported-ca

But it seems that acme.sh got bought by ZeroSSL, which would explain why it's their default now.

Out of honest interest, where did they fail to honor "7.3.1 Finding an Account URL Given a Key"?

kouteiheika · on Nov 25, 2021

> Out of honest interest, where did they fail to honor "7.3.1 Finding an Account URL Given a Key"?

Well... it doesn't work. Let me quote the RFC:

> If the server receives a newAccount request signed with a key for which it already has an account registered with the provided account key, then it MUST return a response with status code 200 (OK) and provide the URL of that account in the Location header field.

With ZeroSSL you could only call `newAccount` once; any subsequent call will fail, while according to the RFC it should return the URL of the account. So you have to either a) use their proprietary API to recover the URL (I sent them a bug report for this and that's what they basically told me), or b) save the URL along with the account key (which you don't have to do for any other ACME provider).
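For readers who have not looked at that part of the RFC, the behaviour described in the quote can be sketched as a toy server-side handler. This is an in-memory illustration of the RFC 8555 rules only, not any CA's real code; the `accounts` store, the thumbprint strings, and the `ca.example` URLs are made up.

```python
from typing import Dict, Optional, Tuple

# Toy in-memory account store: account key thumbprint -> account URL.
accounts: Dict[str, str] = {}


def new_account(key_thumbprint: str, only_return_existing: bool = False) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, Location header value) for a newAccount request."""
    if key_thumbprint in accounts:
        # RFC 8555 section 7.3.1: a known key gets 200 (OK) plus the account URL.
        return 200, accounts[key_thumbprint]
    if only_return_existing:
        # No account for this key and the client asked not to create one:
        # the RFC specifies an accountDoesNotExist error with status 400.
        return 400, None
    url = f"https://ca.example/acme/acct/{len(accounts) + 1}"  # made-up URL scheme
    accounts[key_thumbprint] = url
    return 201, url  # section 7.3: a newly created account gets 201 (Created)


first = new_account("thumbprint-A")   # creates the account
second = new_account("thumbprint-A")  # same key again: should return the same URL
print(first, second)
```

A compliant server lets the client recover its account URL from the key alone, which is why other ACME providers do not require storing the URL separately.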

iostream23 · on Nov 25, 2021

Buypass saved the day last month. We were suddenly not working in older browsers due to the expired root cert, and just one server flag on the certbot invocation and now we are good until 2040. Yay ACME protocol and yay Buypass

gsich · on Nov 25, 2021

What's the issue with that? I use ZeroSSL and never had to use their non-ACME API.

pronoiac · on Nov 25, 2021

Switching to ZeroSSL helped me when some clients failed to handle the root cert switchover Let's Encrypt did at, what, the end of September? The ZeroSSL root goes waaay back: https://help.zerossl.com/hc/en-us/articles/360058294074-Zero...

jraph · on Nov 25, 2021

> you can switch transparently

Unless you use some sort of certificate (authority) pinning.

tialaramex · on Nov 25, 2021

However, your pinning strategy, at whatever level, should have planned fallbacks. That can mean if you pin keys you have a second pinned key that exists only on a HSM in somebody's safe ready for emergencies, or in this case it means picking an extra CA and pinning them too, ready for such scenarios. If your pinning doesn't account for such things you sacrificed availability for security which is likely a bad choice.
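As a rough illustration of pinning more than one CA, the check below accepts a connection if any key in the presented chain matches any pin in the set. It is only a sketch: the helper names are hypothetical, and the byte strings are placeholders standing in for real DER-encoded public keys.

```python
import base64
import hashlib
from typing import Iterable, Set


def spki_pin(spki_der: bytes) -> str:
    """Base64 of the SHA-256 digest of a (DER-encoded) public key blob."""
    return base64.b64encode(hashlib.sha256(spki_der).digest()).decode("ascii")


def chain_matches_pins(chain_spkis: Iterable[bytes], pins: Set[str]) -> bool:
    """Accept if any key in the presented chain is in the pin set."""
    return any(spki_pin(spki) in pins for spki in chain_spkis)


# Placeholder byte strings, not real keys.
primary_ca_key = b"primary CA public key"
backup_ca_key = b"backup CA public key"
unknown_ca_key = b"some other CA public key"

# Pin both the CA in use and a second CA kept ready for emergencies.
pins = {spki_pin(primary_ca_key), spki_pin(backup_ca_key)}

accepted_backup = chain_matches_pins([backup_ca_key], pins)
accepted_unknown = chain_matches_pins([unknown_ca_key], pins)
print(accepted_backup, accepted_unknown)
```

With the backup pin already deployed, switching to the second CA during an outage does not lock out clients.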

jayflux · on Nov 25, 2021

Certs renew like a month before expiry if using the bot, so it would need to be down a long time before sites became inaccessible. (It was down for 20 mins)

There are also other services that offer this sort of thing, they’re just lesser known.

kenniskrag · on Nov 25, 2021

That's why we renew our certs more than one day in advance.

ajsnigrutin · on Nov 25, 2021

Some people renew them after someone calls them that the page is inaccessible...

...not naming names, but I can see one above the bathroom sink.

(yes yes, I know, I know...)

cblconfederate · on Nov 25, 2021

I assume most people have a cron job to do that. The thing is, if it fails for a number of consecutive times then you won't be able to renew it for a period of time IIRC.

kenniskrag · on Nov 25, 2021

The cron should run always.

They have a rate limit for the amount of renewed certs. This doesn't apply if you don't reach their servers or don't get a cert.

But if you have a problem with a CA just switch to another which supports your bot (e.g. acme compliant CAs).

tialaramex · on Nov 25, 2021

Right. If your scripts can screw up after the CA has issued a certificate, the CA's costs are already locked in (and thus Let's Encrypt's rate limits apply here), so you should make an effort to be able to recover the key and certificate if you fail, rather than starting over.

The private key only you have, so that's the thing you most need to avoid throwing away over and over due to a bug. If you lose the certificate, that's a public document, you can just get another copy manually if necessary.

iso1631 · on Nov 25, 2021

If you renew 30 days ahead of schedule, your monitoring presumably goes red at least 2 weeks before expiry so you have plenty of notice to fix it.
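A monitoring check along those lines might look roughly like this. The thresholds and status names are assumptions chosen to match the numbers above, not the defaults of any particular monitoring tool.

```python
from datetime import datetime, timedelta

WARN_DAYS = 30      # assumption: renewal should already have happened by this point
CRITICAL_DAYS = 14  # assumption: "goes red" two weeks before expiry


def cert_status(not_after: datetime, now: datetime) -> str:
    """Classify a certificate by how much time is left before it expires."""
    remaining = not_after - now
    if remaining <= timedelta(0):
        return "expired"
    if remaining <= timedelta(days=CRITICAL_DAYS):
        return "critical"
    if remaining <= timedelta(days=WARN_DAYS):
        return "warning"
    return "ok"


now = datetime(2021, 11, 25)
print(cert_status(now + timedelta(days=60), now))
print(cert_status(now + timedelta(days=10), now))
```

If renewal normally happens at 30 days, a certificate in the "warning" band already means a renewal attempt has failed or been missed.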

zinekeller · on Nov 25, 2021

I mean, it still doesn't fully solve the problem, but I've set mine up so that it connects to both Let's Encrypt and ZeroSSL's certificate chain (with LE getting priority), and I'm considering adding BuyPass into the mix. I know that this isn't the true solution (some have proposed a DNS-based system of sending public certificates, which unfortunately can be intercepted if your zone cannot use DNSSEC because your TLD manager didn't bother to support it).

mschuster91 · on Nov 25, 2021

> But I don't like centralization where failure in a centralized service may render millions of websites inaccessible

Clients are encouraged to renew their certificates a couple of days prior to expiration, precisely to make sure that in the case of a disruption there is still some buffer in time to prevent expired certs being served.

marcan_42 · on Nov 25, 2021

Standard practice is to renew 30 days before expiry. This gives you plenty of time to deal with issues.

foepys · on Nov 25, 2021

No problem, since ACME is an open protocol, you can use multiple providers at the same time.

I didn't use them but apparently ZeroSSL and SSL.com issue free certs as well.

the_mitsuhiko · on Nov 25, 2021

If only there was a way to use a different CA on renewal!

sam0x17 · on Nov 25, 2021

Things are under a lot of strain today. I noticed AWS lambda went down earlier today for 4 of my clients using completely unrelated stacks in different regions, but AWS status page was all green.

b0afc375b5 · on Nov 25, 2021

Here's a series of comments discussing why status pages are unreliable:

https://news.ycombinator.com/item?id=25213817

Edit:

My main takeaway is that it goes against the human instinct of self-preservation (losing one's job, opening the company to lawsuits, status pages being used against you by competitors, etc.)

daniel-s · on Nov 25, 2021

The halt happened twice, but only lasted ~25 mins each time. It was back running before the arrival of most of the people that will end up getting to this post.

ThalesX · on Nov 25, 2021

I'll be honest, this title worried me a lot more than the Facebook is down one.

chrisMyzel · on Nov 25, 2021

Anybody who's affected by this clearly is too late in renewing =)