As a practical question: what is the expected capacity of browsers' preload stores? Hundreds of thousands, millions, or many more domains? Because at some point it seems like everyone with moderately high security requirements may want to have their certificates pinned / preloaded.
I think that's an open question. Right now it's not in the millions; that would be too much to bundle with browsers. But browsers may well change their delivery mechanism for preload information to allow this to scale higher.
In any case, .gov won't add much to the load -- right now there are all of 5,500 .gov domains, and the rate of additions and removals is on the order of dozens every month at most.
Is that just domains from which web content is hosted, or all second-level domains regardless of whether web content is hosted? Because I can't imagine that there are only 5,500 total .gov domains.
My browser's disk footprint is already over 100MB. What's another X MB?
The ironic thing is we're reinventing the CA system, only now each browser is its own authority, exactly the problem the CA system was trying to solve.
I wouldn't say this is reinventing the CA system. You don't need to trust any particular browser here. The effect is that web services must offer a secure HTTPS connection, using the existing CA system (or an enterprise CA, if their user base is truly all-enterprise), no matter what browser is being used.
> You don't need to trust any particular browser here.
You do need to trust that the particular browser you are using supports the preload list, is using the latest version of the list, and is not missing any entries!
What you're trusting the browser for there is the extra protection that preloading provides, but that's not the whole benefit here. The larger benefit is that it makes it infeasible for services to neglect to support HTTPS. So, even if your browser's preload list is busted, the site will be guaranteed to support HTTPS because of this effort, which you'll still benefit from.
Ah, OK. I think I see what you are saying now. As long as a sizeable portion of browsers support a fresh version of the preload list, there is sufficient customer feedback to push servers to support only HTTPS. Right?
Seems like you could use a Bloom filter to store whether a domain has a pinned cert, and then use an API provided by the browser to remotely fetch the pinned cert. This has privacy implications, but it does sidestep the storage problem. Chrome does something similar for CRLs, but Bloom filters fit that use case better.
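To make the Bloom filter idea concrete, here is a minimal sketch in Python. It is not how any browser actually does this; the class, the sizing parameters, and the `needs_remote_lookup` helper are all made up for illustration. The filter only answers "definitely not pinned" or "possibly pinned", so a positive answer would still need the remote fetch to confirm.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter over domain names (illustrative, not tuned)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, domain: str):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        data = domain.lower().encode("utf-8")
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, domain: str) -> None:
        for pos in self._positions(domain):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, domain: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(domain)
        )


def needs_remote_lookup(bloom: BloomFilter, domain: str) -> bool:
    """Hypothetical helper: only hit the network when the filter says 'maybe'."""
    return bloom.might_contain(domain)


# Assumed sizing: 8192 bits (1 KiB) and 4 hash functions, chosen arbitrarily.
shipped_filter = BloomFilter(num_bits=8192, num_hashes=4)
for pinned in ("example.gov", "login.example.com"):
    shipped_filter.add(pinned)
```

The trade-off is the one mentioned above: the filter is small enough to ship, but every "maybe" leaks the visited domain to whoever serves the lookup.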
IIRC DuckDuckGo might do something like that for search suggestions. For slightly improved privacy, lots of unrelated hosts can be grouped into blocks (maybe grouped by probability of access, to make it harder to infer which domain in the block is the likely target).
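A rough sketch of the block idea, in Python, assuming the simplest possible grouping: hosts are bucketed by a short prefix of the SHA-256 of the domain, and the client asks for the whole block instead of a single host. Grouping by probability of access would need traffic data and is not attempted here. All names (`block_id`, `build_blocks`, `lookup_pin`, the `fetch_block` callback) and the pin strings are hypothetical.

```python
import hashlib
from collections import defaultdict
from typing import Callable, Dict, Optional

# Assumed block granularity: 2 hex chars gives 256 blocks.
PREFIX_HEX_CHARS = 2


def block_id(domain: str) -> str:
    """Map a domain to its block via a short hash prefix."""
    digest = hashlib.sha256(domain.lower().encode("utf-8")).hexdigest()
    return digest[:PREFIX_HEX_CHARS]


def build_blocks(pins: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Server side: group every (domain, pin) entry under its block id."""
    blocks: Dict[str, Dict[str, str]] = defaultdict(dict)
    for domain, pin in pins.items():
        blocks[block_id(domain)][domain.lower()] = pin
    return dict(blocks)


def lookup_pin(
    domain: str, fetch_block: Callable[[str], Dict[str, str]]
) -> Optional[str]:
    """Client side: request the whole block, then search it locally.

    The server only sees the block id, not which host in the block was wanted.
    """
    block = fetch_block(block_id(domain))
    return block.get(domain.lower())


# Usage with an in-memory stand-in for the remote service.
blocks = build_blocks({"example.gov": "pin-a", "login.example.com": "pin-b"})
requested = []


def fake_fetch(bid: str) -> Dict[str, str]:
    requested.append(bid)  # what the server would get to observe
    return blocks.get(bid, {})


pin = lookup_pin("example.gov", fake_fetch)
```

Shorter prefixes mean bigger blocks: more privacy per lookup, but more data fetched each time.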