*Identity provider’s signing keys are probably the most powerful secrets in the ...

crote · on July 22, 2023

If these keys are so important, why are they not treated with similar security measures as CA keys?

Why is it just a handful of keys shared by dozens, hundreds, or likely even thousands of servers? Where is the HSM-backed root key, per-DC intermediate, and per-server transient signing key? Where is the certificate revocation list?

Is it even possible to secure a bearer token from an attack like this? Should we go back to a nonce which we present to the originating service to securely retrieve the payload?

jrockway · on July 22, 2023

> If these keys are so important, why are they not treated with similar security measures as CA keys?

No incentive, right? If you mess up as a CA, you go out of business because every browser drops you. But if you're an identity provider and you have an incident, maybe 1% of your customers opens an angry support ticket and you give them 5% off their bill next month. So why bother? Are YOU going to move your 100,000 employee organization off of Microsoft because of this? No? Then why would they care at all? They won the sale and you're stuck with what you get.

Here's a list of companies that have gone out of business because they didn't take information security seriously:

Here's a list of billionaires who had to sell their yachts because they didn't prioritize security over other concerns:

Yeah.

TheNewsIsHere · on July 22, 2023

I get what you’re saying, but there’s more nuance here. The smaller the business is, the likelier it is to fold within 1-1.5 years after a major compromise. I’m asserting that on the back of some FBI statistics published about BECs a while back.

I would posit that your first list needs a conditional for “too big to fail”.

DANmode · on July 22, 2023

The US does not prioritize small business, or sole proprietorships.

Anyone even adjacent to business or finance during COVID could very clearly witness the wealth transfer, and the smaller competitors being swallowed.

TheNewsIsHere · on July 23, 2023

Did you mean to reply to another comment? I don’t see the correlation between your comment and mine. I didn’t assert the US was small business friendly, just that smaller businesses are much more at risk than it seemed the parent comment acknowledged.

Which I think is an important point because by volume the majority of businesses in the US are small and mid-sized businesses.

fauigerzigerk · on July 22, 2023

>No incentive, right? If you mess up as a CA, you go out of business because every browser drops you.

Not going out of business is not the only possible incentive though. Security is a pretty key part of the sales pitch of every cloud provider: "We can do this better than your in-house team" they keep saying.

No one is going to move 100,000 employees off Microsoft over this, but what about signing up the next 100,000? Will it impact next quarter's growth figures?

And for the people directly responsible for security at Microsoft, breaches like these may be a career risk or at least a huge embarrassment. There will be meetings. People will get blamed. Some will feel they haven't met their own professional standards.

There's no lack of incentives.

mistrial9 · on July 22, 2023

you claim that contracts at this level are executed on the basis of technical merit?

fauigerzigerk · on July 22, 2023

I'm not sure I would call it technical merit, but sentiment and reputation do play a role in negotiations. No cloud provider wants to be perceived as having a weakness in terms of security compared to its closest competitors.

There are debates about risks when large companies move to the cloud - in some industries more than in others. Opposing groups within companies as well as competitors' sales people will be looking for weak spots to make their case.

In isolation, incidents like this don't make a difference. If it starts to look like a pattern, though, that would be bad for business.

freedude · on July 25, 2023

Does Microsoft have a reputation for good security, adequate security or poor security?

How has this affected sales over the years?

I believe they won the PC wars years ago and there has never been a good replacement system with enough customer incentive that met the same price point. It is this cow they continue to milk.

vladvasiliu · on July 22, 2023

It's not clear to me why there aren't more signing keys. After all, the public part is exposed at a URL you can retrieve whenever you're verifying the token.

I've seen service provider implementations where the access token is exchanged for a cookie or other similar token, in which case you can't really do anything about it anymore for the lifetime of the cookie.

But if you're always verifying the access token with a reasonably short cached key list, the service provider should be able to refuse the token reasonably quickly once the key is no longer advertised.
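To make the "short cached key list" idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: `CachedKeySet`, `fetch_keys` (a stand-in for downloading the provider's advertised key list) and the TTL value are hypothetical names, not any provider's or library's API.

```python
import time
from typing import Callable, Dict, Optional


class CachedKeySet:
    """Caches a kid -> key mapping and re-fetches it once the TTL has elapsed."""

    def __init__(
        self,
        fetch_keys: Callable[[], Dict[str, bytes]],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_keys = fetch_keys  # hypothetical: returns the advertised keys
        self._ttl = ttl_seconds
        self._clock = clock
        self._keys: Dict[str, bytes] = {}
        self._fetched_at: Optional[float] = None

    def get(self, kid: str) -> Optional[bytes]:
        """Return the key for `kid`, or None if it is no longer advertised."""
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            # Copy the fetched mapping so later changes upstream are only
            # picked up on the next refresh.
            self._keys = dict(self._fetch_keys())
            self._fetched_at = now
        return self._keys.get(kid)
```

With this shape, a key that the provider stops advertising keeps working for at most `ttl_seconds`, which is the trade-off described above: a shorter TTL means faster rejection but more fetches.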

ThomasRooney · on July 22, 2023

Anecdotally, I ran an experiment with Envoy to see how far the number of signing keys could scale. This was for a B2B “API Key” auth solution; we wanted user keys to be self-revocable, but just be a relatively standard JWT format for maintainability. The hypothesis was that, rather than running a whitelist or blacklist, we could improve the security signature by having 1 signing key for each JWT.

When we ran some stress tests, turned out Envoy could happily run with ~300K signing keys in its JWK Set before noticeable service degradation occurred. Even then, by bumping up the memory on the validation servers, there was a small sacrifice of a few ms per extra 100K keys.

This makes me fully agree that, for many applications, there’s probably an opportunity to vastly improve the security surface by bumping up the number of signing keys dramatically.

As long as both Keys and Signing keys define a KID, key verification is prefaced only by a hash table lookup or a tight loop through a keyset to find the appropriate Signing Key, before the slower verification procedure.
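A rough Python sketch of that lookup-then-verify order, using only the standard library. Note the assumptions: this is not how Envoy does it, HMAC-SHA256 stands in for the asymmetric signatures a real JWK Set would carry, and `sign`, `verify` and `keys_by_kid` are hypothetical names.

```python
import base64
import hashlib
import hmac
import json
from typing import Dict, Optional


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    # Restore the padding that the JWT encoding strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(payload: dict, kid: str, key: bytes) -> str:
    """Build a compact JWT-style token whose header names the signing key."""
    header = {"alg": "HS256", "typ": "JWT", "kid": kid}
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    mac = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url_encode(mac)


def verify(token: str, keys_by_kid: Dict[str, bytes]) -> Optional[dict]:
    """Return the payload if the token verifies, otherwise None."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(b64url_decode(header_b64))
        # Cheap step first: a hash table lookup by kid.
        key = keys_by_kid.get(header.get("kid"))
        if key is None or header.get("alg") != "HS256":
            return None
        # Slower step second: recompute and compare the signature.
        signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
        expected = hmac.new(key, signing_input, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
            return None
        return json.loads(b64url_decode(payload_b64))
    except (ValueError, AttributeError, TypeError):
        return None
```

Revoking a single token in this model is just deleting its entry from `keys_by_kid`; the lookup then fails before any signature work is done.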

vladvasiliu · on July 22, 2023

I guess 640K keys ought to be enough for anybody (TM)(r) (c)

More seriously, though, I wonder how AzureAD is implemented and how hard it would be to scope keys per tenant, if not per application. If I'm not mistaken, SAML certificates are per application.

rvdginste · on July 23, 2023

If you want 1 signing key per JWT, you would need to generate a new key pair for each JWT; wouldn’t that be too expensive? Or was the generation included in your tests?

specialist · on July 22, 2023

ELI5, sorry:

Taking your POC just a bit further and you've got the basis for zero trust networking, right?

That's the Future Perfect Correct Answer™, right?

mistrial9 · on July 22, 2023

zero trust, or zero stability ?

specialist · on July 22, 2023

LMGTFY

https://en.wikipedia.org/wiki/Zero_trust_security_model

DANmode · on July 22, 2023

> why are they not treated with similar security measures as CA keys?

Well, that's an unintentionally loaded question if I've ever seen one.

Xeamek · on July 22, 2023

>People don't seem to know that old saying about not putting all your eggs in one basket anymore.

Probably because by the time this saying was forged, there wasn't this big push from chickens to integrate as many features as possible and improve their functionality and QoL that are only available when you do put all the eggs in the same place.

Roark66 · on July 22, 2023

Exactly... People just don't learn. For example currently there is a huge number of companies here in Poland that are migrating all of their cloud apps to GCP (Google cloud) from AWS. Why? Because Google has opened a region in Poland.

AWS has live customer support you can ring (if you pay for it) and frequently as a client you get an account manager you can call in case of trouble and that person has direct access to support teams. These support people can actually fix stuff for you. Back in the day I've handled lots of support cases like this.

Google on their website claims they also offer live support, but I read enough stories with headlines like "Google deleted my business overnight and there's no one to talk to" to question its usefulness. I haven't had a chance to deal with Google's GCP support yet so I don't know how good or bad it might be, but I had a couple of support cases raised for other products (Play Store, book publishing etc) and it was obvious people that work there can't really do anything if stuff breaks. They're there just for Google to be able to say "we too have live support" and to tell you how to do stuff in lieu of documentation. When stuff breaks... You get an email "we're escalating it _to_developers_" to never hear from them again (or you get an email every month asking you if the issue is still ongoing).

So I think it is the biggest case of "putting all your eggs in one basket" I saw in a while. If anyone has contrary experience of GCP support I'd love to hear it.

victor106 · on July 22, 2023

While I was always suspicious of using “Google” for any critical business purposes, I have also learnt that “Google” the search engine is different than “Google Cloud” the internal division within “Google” that runs GCP.

I am yet to see any examples of “Google Cloud” shutting down services on their own whim or not providing human customer service agents.

QuinnyPig · on July 22, 2023

Google Cloud IoT Core was deprecated last year.

victor106 · on July 23, 2023

Azure is also deprecating their IoT edge

https://azure.microsoft.com/en-us/updates/the-managed-iot-ed...

I think there’s a difference between how Google does things on the search side vs their cloud side.

The regularity with which they deprecate services on the search side is unacceptable.

On the cloud side it doesn’t seem to be anywhere close to that

mnahkies · on July 22, 2023

I've not interacted with GCP support very often, but I have used GCP extensively for the past 3 years coming from an AWS background previously.

All in all it's been a good experience and it's become my favourite cloud. I find the documentation spot on for the most part, and covering things in both approachable language and diving deep where needed.

The IAM system is easy to use, GKE workload identity works great, PubSub works a dream and Big Query is amazing (though pricey!). SDK's are well documented and generally have nice APIs as well.

We've had very little reason to need support, operating a bunch of GKE clusters, VPNs to various partners, plenty of databases, buckets, message queues, etc (non trivial setup)

I do have some minor gripes that come to mind:

- Cloud SQL not having a richer API, I'd love to be able to manage postgres permissions by IAM group membership, and grant/revoke postgres roles using the rest API instead of connecting as a postgres user (it'd make secure automation via terraform etc easier if I could lean on IAM)

- VPC peering only allowing "one hop", which necessitates proxies even when using private service connect with two GCP products in some instances (eg: cloud SQL to datastream - why can't we just peer the two Google managed VPCs together? That would also avoid the proxy in our VPC)

- On datastream, why can't I grant a Google managed service account IAM access to postgres instead of configuring a user/pass based user?

- Why can't I configure a longer token expiration for artefact registry? When developing locally I don't want to reauth npm every 30 minutes

- Occasionally missing APIs prevent automation using terraform

So like anything it's not perfect, but it also has felt like it's continuously improved over time, so I remain a happy developer

(Edit: formatting)

SoftTalker · on July 22, 2023

Yep my employer is all-in on Office 365, Teams, the whole thing. I have a supply of popcorn ready for when it all comes crashing down. One thing I know is they won't blame themselves.

mschuster91 · on July 22, 2023

> One thing I know is they won't blame themselves.

Which is precisely why they went all in on Microsoft Cloud. Using an in-house stack (no matter if it's Jira, Confluence, Exchange, AD, Postfix, Exim, OpenLDAP, Samba, ...) will always lead to people blaming the C level for any outage, hack, whatever. Miss one tiny little patch and insurance won't pay.

Go for Atlassian Cloud/Microsoft/AWS/GCP? You can now deflect any blame onto the cloud provider. No personal liability, nothing. You followed industry best practices, so insurance pays out.

notimetorelax · on July 22, 2023

I’m with you in that I prefer not to use MS tech. That said, you might be creating an unhealthy dynamic for yourself by expecting it all to crash and burn. Objectively, MS tech still gets the job done.

vladvasiliu · on July 22, 2023

> Objectively, MS tech still gets the job done

Mostly, if you don't have too high expectations.

My employer has had to invest I don't even want to know how much in an EDR (or whatever it's called nowadays) and a slew of services around this circus just to pretend their Windows systems are secure. Didn't catch a home-grown cryptolocker, though, and happily allowed it to encrypt its own files.

They mostly use web-based SaaS applications anyway, so could trivially replace Windows with an OS with a better security track record.

Plus, AzureAD always seemed kludgy. Until recently, they insisted on having SMS or phone as a second factor for password recovery. Which allowed you to reset the stronger 2nd factor used for auth. They only started supporting Fido tokens like a year or so ago for regular 2fa. Their authenticator is a joke: until recently, you had no idea what you were approving. It still doesn't support group inheritance, so if you base it on your local AD as the source of truth, you have to jump through more hoops and add more ad-hoc groups and maintain them. Good times.

mavhc · on July 22, 2023

Microsoft: Pay us for Windows

Microsoft: Pay us for online services

Also Microsoft: Pay us more if you want those 2 things to be secure

DANmode · on July 22, 2023

If you're talking about Linux, and not BSD or something, then I have a bridge you just may be interested in purchasing.

croes · on July 22, 2023

Doesn't matter if they get the job done; if one big mistake can hurt millions of people, it's a bad idea.

Monoculture is always a bad idea, and things like MS cloud services are the basis for creating a technical pandemic.

blowski · on July 22, 2023

Ah yes, I long for the halcyon days of IT stability… oh wait, business has always been fraught with risk and trade-offs. You get insurance, put mitigations in place, and hope when it does go bad, you can blame Microsoft anyway.

croes · on July 22, 2023

Ever heard of a single point of failure?

Yes, you can hack every company, but you have to target them one by one.

Now you only need to take down MS.

notimetorelax · on July 22, 2023

I understand your point, but the probability math works against you as the number of platforms grows, unless you have excellent data segregation and access controls. What happens in most cases, each new platform serves as an attack vector to be exploited to gain access to all of the data.
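The "probability math" can be sketched in a few lines of Python. This is a simplification under loud assumptions: every platform is breached independently with the same probability, and breaching any one of them exposes all of the data. The function name and the numbers are hypothetical, chosen only to show the shape of the curve.

```python
def p_any_breach(p_single: float, n_platforms: int) -> float:
    """Probability that at least one of n independent platforms is breached.

    Assumes each platform is breached with the same probability p_single,
    independently of the others.
    """
    return 1.0 - (1.0 - p_single) ** n_platforms


if __name__ == "__main__":
    # Illustrative (made-up) per-platform breach probability.
    for n in (1, 2, 5, 10):
        print(n, round(p_any_breach(0.05, n), 3))
```

Under these assumptions the chance of at least one breach only goes up as platforms are added, which is why the argument hinges on whether a breach of one platform really does expose everything.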

croes · on July 22, 2023

And now your data segregation and access controls are worthless because you need Azure AD for SSO.

The expected value rises.

Obscurity4340 · on July 22, 2023

I, too, long for Halcyon days.

themoonisachees · on July 23, 2023

From the point of view of the employer it's stupid not to put all your eggs in one basket. For $10 a month per account, you get email (unless you have thousands of accounts, this is already worth it), instant messages and a softphone solution, office tools that you needed to buy anyway, and centralized user management for it all. All alternatives involve either still contracting it out to a probably less competent team, or hosting it in house at significant cost. On top of that, the day it all crashes down and burns their business to the ground, insurance pays out because it was certified and wasn't an in-house solution that the insurer can point at and drag you to court with.

blackoil · on July 22, 2023

Make sure they have long enough expiry.

tinus_hn · on July 22, 2023

So how would you suggest to spread the eggs? Just use another party that wouldn’t have lost their key?

bob1029 · on July 22, 2023

Which is easier to secure? One big chicken coop, or many smaller chicken coops scattered across the ranch?

tinus_hn · on July 22, 2023

So what do you suggest doing to turn it into ‘many smaller chicken coops’ and how is it more secure?

Thiez · on July 22, 2023

Assuming square coops, four small coops have literally double the attack surface compared to one big coop of equal total area. The ratio only gets worse when you divide them up further. The impact of a breach may be less for many small coops, but breaches will happen more often as well.
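The coop arithmetic checks out if "attack surface" is taken to mean total fence length, which is an assumption of this sketch. Splitting a fixed total area into n equal square coops gives n coops of side sqrt(area / n) each, so the total perimeter grows by a factor of sqrt(n). The function name is hypothetical.

```python
import math


def total_perimeter(total_area: float, n_coops: int) -> float:
    """Total fence length for n equal square coops sharing a fixed total area."""
    side = math.sqrt(total_area / n_coops)
    return n_coops * 4 * side


if __name__ == "__main__":
    one_big = total_perimeter(1.0, 1)     # 4.0
    four_small = total_perimeter(1.0, 4)  # 8.0
    print(four_small / one_big)           # 2.0, i.e. double the fence
```

Sixteen coops of the same total area would have four times the fence of the single big one, so the ratio keeps growing as the coops get smaller.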

smegsicle · on July 22, 2023

what's the threat model?

ChatGTP · on July 22, 2023

How can you avoid it though? There are basically 4 tech companies now?

okasaki · on July 22, 2023

Which are the other two?

ChatGTP · on July 22, 2023

Microsoft, Google, Apple, Amazon.

You can’t really avoid them.