Also it seems like people are leaning towards replacing WHOIS with DNS, which seems like a great fix for their operational problems, but probably a security mistake.
Of course there are manual processes - there's a huge market for people who legitimately need certificates but cannot/will not automate everything. Otherwise everyone would have left the business after Let's Encrypt entered. You're not paying DigiCert for the cert itself - LE can give you that - you're paying them for having customer service, aka manual processes.
It looks like this particular process exists because automating WHOIS lookups is hard/unreliable. An agent does a WHOIS lookup, documents it, and passes the result to other people to review. That genuinely seems like the best way to deal with databases that hold the right data but are too poorly structured to query reliably by machine.
Also it seems pretty natural to me that some business is going to ask for a certificate for www-staging.example.com or something, prove ownership in a way that happens to demonstrate ownership of example.com, and then be mad on launch day when www.example.com doesn't have a certificate too. So there should be a way for an agent to say "It looks like they control example.com" and authorize that, as long as there's review in place.
We should avoid having human agents make these determinations because they're very bad at it. Play to human strengths. A human agent working around the fact that you keep saying "Apache" when your description makes it clear you're actually running IIS is where humans are good; training an AI system to figure that out would be really hard. Figuring out that in-addr.arpa is a Public Suffix, and therefore that you cannot possibly prove control over it from a Web PKI point of view, is just trivial string matching and perfectly suited to a machine.
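And the machine side of that really is only a few lines. A rough sketch using Go's golang.org/x/net/publicsuffix package, which bundles a compiled copy of the PSL (the names and messages here are purely illustrative):

    package main

    import (
        "fmt"

        "golang.org/x/net/publicsuffix"
    )

    func main() {
        for _, name := range []string{"in-addr.arpa", "www.example.com"} {
            // If the name *is* its own public suffix, nobody "controls" it in
            // the Web PKI sense, so validation should stop right there.
            if suffix, _ := publicsuffix.PublicSuffix(name); suffix == name {
                fmt.Printf("%s is a public suffix - refuse to validate it\n", name)
                continue
            }
            // Otherwise there is a registrable domain we could try to validate.
            registrable, err := publicsuffix.EffectiveTLDPlusOne(name)
            if err != nil {
                fmt.Printf("%s: no registrable domain (%v)\n", name, err)
                continue
            }
            fmt.Printf("%s: registrable domain is %s\n", name, registrable)
        }
    }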
We have already had incidents where a CA mistakenly took proof of control of one-name.example.com as sufficient proof of control of example.com itself. It's part of why Facebook freaked out, back before we had mandatory CAA checking (and before Facebook wrote themselves CAA records), when a sub-contractor got themselves a certificate Facebook didn't intend to exist. So no more of that, please.
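For anyone who hasn't internalised how CAA closes that particular hole: at issuance time the CA climbs from the requested name towards the root, takes the first CAA record set it finds, and may only issue if one of its own identifiers is listed there. A rough sketch of that climb in Go, with a hard-coded map standing in for real DNS and made-up CA identifiers:

    package main

    import (
        "fmt"
        "strings"
    )

    // zone stands in for real DNS: name -> values of its CAA "issue" records.
    // Only the apex has published CAA here, as Facebook eventually did.
    var zone = map[string][]string{
        "example.com": {"the-ca-they-actually-use.example"},
    }

    // caMayIssue climbs from the requested name towards the root, uses the
    // first CAA record set it finds, and allows issuance only if this CA is
    // listed (roughly RFC 8659 "issue" logic, ignoring wildcards and flags).
    func caMayIssue(ca, name string) bool {
        for d := name; d != ""; d = parent(d) {
            issuers, ok := zone[d]
            if !ok {
                continue // no CAA at this level, keep climbing
            }
            for _, allowed := range issuers {
                if allowed == ca {
                    return true
                }
            }
            return false // CAA exists and this CA is not on the list
        }
        return true // no CAA anywhere: any CA may issue
    }

    // parent strips the leftmost label: "www.example.com" -> "example.com".
    func parent(d string) string {
        if i := strings.IndexByte(d, '.'); i >= 0 {
            return d[i+1:]
        }
        return ""
    }

    func main() {
        // The sub-contractor pattern: proved control of some-machine.example.com,
        // now asks a CA for www.example.com.
        fmt.Println(caMayIssue("some-other-ca.example", "www.example.com"))            // false
        fmt.Println(caMayIssue("the-ca-they-actually-use.example", "www.example.com")) // true
    }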
You should be allowed to get a certificate covering a public suffix - github.io is a public suffix, but a *.github.io wildcard is a perfectly legitimate certificate. And conversely, facebook.com is not a public suffix, so that rule wouldn't have helped Facebook.
I agree a public suffix should trigger more aggressive validation (as should, perhaps, the first request for a TLD that the CA has never issued for before), but I don't think a computerized rule blocking issuance for public suffixes is appropriate.
The Baseline Requirements forbid issuing for Public Suffixes in what the PSL calls the "ICANN DOMAINS" section (or for wildcards immediately under those suffixes). That section includes in-addr.arpa but not github.io.
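That check is, again, entirely mechanical - a rough sketch with golang.org/x/net/publicsuffix, whose PublicSuffix function also reports whether the suffix came from the ICANN section of the list (the example names are just illustrative):

    package main

    import (
        "fmt"
        "strings"

        "golang.org/x/net/publicsuffix"
    )

    // forbiddenByBRs sketches the rule described above: refuse a name that is
    // an ICANN-section public suffix, or a wildcard immediately under one.
    // Entries from the PSL's private section (like github.io) are deliberately
    // not caught.
    func forbiddenByBRs(name string) bool {
        bare := strings.TrimPrefix(name, "*.") // judge *.co.uk by what sits under the wildcard
        suffix, icann := publicsuffix.PublicSuffix(bare)
        return icann && bare == suffix
    }

    func main() {
        for _, n := range []string{"in-addr.arpa", "*.in-addr.arpa", "github.io", "*.github.io", "www.example.com"} {
            fmt.Printf("%-16s forbidden by the BRs? %v\n", n, forbiddenByBRs(n))
        }
    }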
Now, the BRs don't specifically tell you not to validate in-addr.arpa and then issue certificates for more specific names under it, but my patience wears very thin when it comes to how much needs to be spelled out, letter by letter, before CAs stop doing things that are obviously a terrible idea.
I can see how the juxtaposition was confusing. Facebook wasn't concerned about public suffix problems; they were worried about a (historically common) pattern where bad guys find that if they can persuade a CA to issue them some-machine.example.com, the same CA will then issue www.example.com to them because hey, they're from example.com... As it happened, in the case that worried them that couldn't have occurred, but they had no way of knowing that.
Oh, I forgot about the two classes of public suffix - and yes, if the BRs categorically forbid it and it's on an intentionally machine-parseable list, then I agree there's no excuse for allowing issuance without a software change that explicitly carves out an exception.
I mean, at the end of the day it has to be, right? In a world where everyone runs ACME, aren't there still human operators who can deploy bug fixes when Google says "Yo, why isn't our cert renewing"? Or is the plan that there's no recourse to humans, and we're all watched over by robot CAs of loving grace?
It depends on what you care about more: certs being issued validly, or certs being issued? Processes can be put in place so that humans can fix problems when they find them, but also prevent a human from doing the wrong thing. Autonomation instead of automation. https://en.wikipedia.org/wiki/Autonomation
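A toy sketch of the distinction, with entirely made-up types and checks: the manual path still exists, but it sits behind the automated hard checks rather than beside them.

    package main

    import "errors"

    // Request and the checks below are hypothetical placeholders, just to
    // show the shape of the thing.
    type Request struct {
        Name          string
        HumanApproved bool // someone has reviewed the paperwork
    }

    // hardChecks is where the machine-enforced rules live (public suffix,
    // CAA, and so on). Note there is deliberately no "override" parameter.
    func hardChecks(r Request) error {
        if r.Name == "" {
            return errors.New("no name requested")
        }
        // ... public suffix / CAA / CT pre-check logic would go here ...
        return nil
    }

    // decide lets a human unblock the fiddly cases, but a human can never
    // wave through a hard failure: autonomation rather than pure automation.
    func decide(r Request, needsManualReview bool) bool {
        if hardChecks(r) != nil {
            return false
        }
        if needsManualReview {
            return r.HumanApproved
        }
        return true
    }

    func main() {
        _ = decide(Request{Name: "www.example.com"}, false)
    }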
I think it's reasonable to care more about certs being issued. Most large companies are incompetent at IT (and the reasons for this are varied and interesting but not immediately solvable), yet if they can't reliably get certificates, the plan to encrypt all web traffic falls apart. Meanwhile, Certificate Transparency monitoring mitigates the impact of mis-issuance. (Which is not to say that mis-issuance is great, just that it's fine to target a system that occasionally produces mis-issuances.)
I guess in part I feel this way because I see HTTPS as baseline security for the web, not strong security. It's like SSH - the fact that telnet is almost completely gone is a win, even though few people really check the initial fingerprints the way they "should." In the case of both HTTPS and SSH, you can build a strong security system on top of it if you want (HPKP, pinned certs in apps, etc. for HTTPS, and distributing known_hosts files or using a private PKI or Kerberos for SSH), but the primary value is that it works reliably to the point where falling back to plaintext is unusual.
HTTPS is a good enough baseline if all you care about is doing business. If you're an individual trying to protect yourself online, it's not a good baseline.
SSH is a good example of this. You can totally MITM brand-new SSH connections all day, because nobody ever checks the fingerprint (and virtually all server-side automation just ignores host keys completely). The only attacker we actually care about with SSH is a really stupid one who thinks that trying to MITM every session at random will actually succeed. (It won't.) A smart attacker will instead pick targeted situations where they're likely to catch a first-time connection, and MITM that. So, why do we use SSH? Because it mostly defeats the dumbest possible attacks.
With HTTPS, instead of the attacker targeting a user, the attacker [in the future] targets cert creation. If they succeed, they get full MITM; if they fail, they lose nothing. And they get to keep trying, on every single damn CA, and on any target site they choose. So, why do we even use HTTPS? Because it solidly defeats the dumbest possible attacks.
Neither of these are good baselines, but with HTTPS it's even worse, because the user has no way of defending themselves. At least with SSH I can enforce my own known_hosts file. With HTTPS, I am at the mercy of the electronic gods that maintain my browser and generate certs.
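For instance, something like this is all it takes on the client side (a sketch with golang.org/x/crypto/ssh; the user, host, and path are placeholders):

    package main

    import (
        "log"

        "golang.org/x/crypto/ssh"
        "golang.org/x/crypto/ssh/knownhosts"
    )

    func main() {
        // Build a host key check from my own known_hosts file. Hosts whose
        // keys are missing or don't match are refused, not silently trusted.
        hostKeyCallback, err := knownhosts.New("/home/me/.ssh/known_hosts")
        if err != nil {
            log.Fatal(err)
        }

        config := &ssh.ClientConfig{
            User:            "me",
            Auth:            []ssh.AuthMethod{ /* keys, agent, ... */ },
            HostKeyCallback: hostKeyCallback, // never ssh.InsecureIgnoreHostKey()
        }

        client, err := ssh.Dial("tcp", "server.example.com:22", config)
        if err != nil {
            log.Fatal(err) // a host key mismatch shows up here as an error
        }
        defer client.Close()
    }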
And (importantly, I think) if someone compromises my SSH connection, at worst they get to control my server. Oh noooo! There goes my private git repo! Whereas if they MITM my HTTPS connection, they get my retirement fund.
HTTPS is like a wall that surrounds the village, built by the tribal leaders. We can't touch it, except to climb the rungs of ladders built into the wall. It will keep out the wolves, but not enemy soldiers. For now, it's good enough at keeping the wolves at bay, so we still use it. But one day soon the wolves will learn to climb ladders. https://www.youtube.com/watch?v=G_cTl4wR81Q
>> even though few people really check the initial fingerprints the way they "should."
I have yet to see a single hosting provider which not only lets you set up your own authorized key on a new instance, but also immediately gives you the instance's generated host key so you can put it in .ssh/known_hosts.
AWS (and more generally anything running cloud-init) does this fairly well - it prints the instance's host keys to the console log. I use this, and I have on a couple of occasions booted a new instance and mounted the old one's disk when my known_hosts file got trashed by mistake, so I could re-establish a trust path.
Of course you have to know to go there, wait for the console log to sync (for some reason AWS takes a few minutes ... I wonder if it's backed by some eventual-consistency object store), and scroll down in the log.
Come to think of it, it would be real cool if OpenSSH had a -o Fingerprint=ab:cd:ef... option you could pass, so that they could give you a single command to copy and paste to your terminal.
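For what it's worth, here's roughly what such an option would have to do under the hood, sketched with golang.org/x/crypto/ssh (the "SHA256:..." value below is a placeholder for whatever the console printed):

    package main

    import (
        "fmt"
        "net"

        "golang.org/x/crypto/ssh"
    )

    // pinnedFingerprint returns a host key callback that accepts the server
    // key only if its SHA-256 fingerprint matches the one you copied from the
    // provider's console (the same "SHA256:..." format that ssh prints).
    func pinnedFingerprint(expected string) ssh.HostKeyCallback {
        return func(hostname string, remote net.Addr, key ssh.PublicKey) error {
            if got := ssh.FingerprintSHA256(key); got != expected {
                return fmt.Errorf("host key mismatch for %s: got %s, want %s", hostname, got, expected)
            }
            return nil
        }
    }

    func main() {
        _ = &ssh.ClientConfig{
            User:            "me",
            HostKeyCallback: pinnedFingerprint("SHA256:..."), // placeholder value
        }
    }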
Well, this cert is OV (IV?) so it's always going to be at least semi-manual because we know the structures to truly automate OV issuance are flaky at best. The fact a DigiCert person looked at this application is unsurprising.
The problem starts when that person gets to do crazy things like pick a Public Suffix as the thing being validated and gets worse when the "proof of control" is only barely related to the thing claimed. I await more from DigiCert on what was supposed to be going on here and how they'll improve.
I don't buy the theory that DNS is worse than WHOIS. Even the WHOIS proponents generally try to fudge this by comparing against RDAP instead, because the security considerations for WHOIS itself come down to "cross your fingers".
In theory RDAP becomes mandatory (for the gTLDs) later this year. In practice we know the pilot programme was not exactly a roaring success and the registries have decided they'd rather the entire mechanism went away.
But we'll see. I care mostly about Prevention of Future Harm; if we prevent Future Harm by successfully moving to RDAP, that's fine by me too.