> You have to validate your app to make sure it makes sense to allow the use of the new URL scheme anyway
No, you don't, necessarily. A URL is a means of locating a resource; if your app makes sense for the kinds of resources and representations it handles independently of their origin, you don't need to validate anything about a URL scheme.
(The security problem with some file:// URLs is actually a completely different issue; it is not a question of whether the application makes sense with that scheme -- which it does.)
> Meanwhile the blacklist approach not only exposes you to security vulnerabilities, but imposes a cost every time the underlying platform adds support for a new URL type because now you have to update your blacklist to block it.
No, you only have to update the blacklist if the new scheme should be blocked, which in many applications it won't be. Whether this is a cost that is paid more often than whitelist-driven updates depends on whether, in the particular application, a new URL scheme is more likely to be allowed or prohibited.
> No, you don't, necessarily. A URL is a means of locating a resource; if your app makes sense for the kinds of resources and representations it handles independently of their origin, you don't need to validate anything about a URL scheme.
Sure you do. You have to make sure the URL scheme doesn't allow access to data that should otherwise be prohibited. For example, I probably shouldn't be able to pass "ftp://localhost/etc/passwd" to your app. It's not just file:// that has the potential to be problematic.
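To make that concrete, here's a minimal sketch (in Python, with a purely illustrative scheme list; this isn't anyone's actual converter code) of how a blacklist that only thought about file:// happily passes the ftp:// URL above:

```python
from urllib.parse import urlparse

# Naive blacklist: the author only thought about file:// being dangerous.
BLOCKED_SCHEMES = {"file"}

def is_allowed(url: str) -> bool:
    """Reject only the schemes someone remembered to put on the blacklist."""
    return urlparse(url).scheme.lower() not in BLOCKED_SCHEMES

print(is_allowed("file:///etc/passwd"))          # False -- caught
print(is_allowed("ftp://localhost/etc/passwd"))  # True  -- slips straight through
```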
> Whether this is a cost that is paid more often than whitelist driven updates depends on whether in the particular application it is more likely that a new URL scheme will be allowed or prohibited.
New URL schemes that become widely used on the internet are pretty rare. Usually new URL schemes are restricted to specific narrow use-cases, e.g. magnet: URIs being used for BitTorrent. But there are plenty of niche URL schemes that may or may not be supported by the underlying OS that don't really make sense for you to support (for example, does your markdown converter really want to handle dict: URIs?). The blacklist approach means you need to make sure you know of every single possible URL scheme that may possibly be supported, and evaluate every single one of them to determine if they should be blacklisted. The whitelist approach lets you only allow the schemes that you've determined are safe.
> The blacklist approach means you need to make sure you know of every single possible URL scheme that may possibly be supported, and evaluate every single one of them to determine if they should be blacklisted.
The whitelist approach requires the same thing, it's just that the consequences of getting it wrong are different.
If you don't blacklist something that you should, then you could let through a security vulnerability.
If you don't whitelist something that you should, then the developers of that software have to devise a way to disguise it as something that is already whitelisted or see it destroyed, which is even worse.
It's worse because doing that is inefficient and complicated, which is a recipe for security vulnerabilities, and then you can't even blacklist it when you know you don't need it, because it's specifically designed to parse as something on the whitelist.
You're really stretching here. If your markdown converter only accepts http and https, so what? That's all it was ever tested with; there's no reason to expect it to support some other niche URL scheme. In fact, in this entire discussion, I have yet to even think of another URL scheme that you would expect to be widely supported by tools like this. With the whitelist approach, you don't need to consider all of the various URL schemes; you just need to ask "is there anything besides http and https that I should support?", to which the easy answer is "probably not".
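For comparison, a whitelist version of the same check (again just a sketch, assuming http and https are the only schemes the tool was tested with) never has to know that dict:, ftp:, or any future scheme even exists:

```python
from urllib.parse import urlparse

# Whitelist: only the schemes the converter was actually designed and tested for.
ALLOWED_SCHEMES = {"http", "https"}

def is_allowed(url: str) -> bool:
    """Accept a URL only if its scheme was explicitly approved."""
    return urlparse(url).scheme.lower() in ALLOWED_SCHEMES

print(is_allowed("https://example.com/image.png"))  # True
print(is_allowed("ftp://localhost/etc/passwd"))     # False
print(is_allowed("dict://localhost/d:password"))    # False -- never evaluated, still rejected
```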
It seems you're answering your own question. Why are there no other popular URL schemes? Because too many things don't support generic schemes, so any new ones are DOA.
Here's an example. Suppose I want to do content-addressable storage. I could create a new URI scheme like hash://[content hash] and then make some client software to register that scheme with the OS, and in theory lots of applications using the operating system's URI fetch API could seamlessly pick up support for that URI scheme. But not if too many applications do the thing you recommend.
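As a rough illustration of the idea (hash:// is the hypothetical scheme above; the local store path and SHA-256 addressing below are assumptions, not part of any real registration API), the client-side resolution might look something like:

```python
import hashlib
from pathlib import Path
from urllib.parse import urlparse

# Hypothetical local content store: blobs saved under their SHA-256 hex digest.
STORE = Path.home() / ".hash-store"

def fetch_hash_url(url: str) -> bytes:
    """Resolve a hypothetical hash://<sha256-hex> URL against the local store."""
    parsed = urlparse(url)
    if parsed.scheme != "hash":
        raise ValueError("not a hash:// URL")
    digest = (parsed.netloc or parsed.path.lstrip("/")).lower()
    data = (STORE / digest).read_bytes()
    # The nice property of content addressing: the fetched bytes can be verified
    # against the address itself.
    if hashlib.sha256(data).hexdigest() != digest:
        raise ValueError("stored content does not match its address")
    return data
```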
So instead I write software to use http://127.1.0.1/[content hash] and then run a webserver on 127.1.0.1 that will fetch the data using the content hash and return it via HTTP. But then we're +1 entire webserver full of attack surface.
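For a sense of what that workaround entails, here is a bare-bones sketch of such a loopback server (hypothetical blob-store path, same assumed SHA-256 addressing as above); even this toy version has to get input validation, headers, and error handling right, which is exactly the extra attack surface in question:

```python
import hashlib
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path

STORE = Path.home() / ".hash-store"  # same hypothetical blob store as above
HEX = set("0123456789abcdef")

class HashHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        digest = self.path.lstrip("/")
        # Validation the scheme handler never needed: reject anything that isn't
        # a plain SHA-256 hex digest (path traversal, query strings, ...).
        if len(digest) != 64 or not set(digest) <= HEX:
            self.send_error(400, "expected a SHA-256 hex digest")
            return
        blob = STORE / digest
        if not blob.is_file():
            self.send_error(404)
            return
        data = blob.read_bytes()
        if hashlib.sha256(data).hexdigest() != digest:
            self.send_error(500, "stored content does not match its address")
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

if __name__ == "__main__":
    # Port 80 to match http://127.1.0.1/[content hash]; binding it (and 127.1.0.1
    # itself, on some OSes) may require extra privileges or loopback configuration.
    HTTPServer(("127.1.0.1", 80), HashHandler).serve_forever()
```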