I - and many others - don’t think this is enough to mitigate supply chain attacks. It’s too easy for a malicious developer to add cryptolocker ransomware in a point release. But I also don’t think having a package in Debian will somehow magically solve this problem. Debian package maintainers don’t perform security audits on the packages they add - and they never have. It’s magical thinking to assume that using packages from Debian will somehow protect you from supply chain attacks. And asking developers to keep their dependency trees small to minimise the risk is a losing battle. Using lots of small, popular dependencies is just too convenient.
There is a solution out there that I wish got more traction: capability-based security. The idea is simple: we partition programs into application code and library code. Library code can never access the filesystem or network (or other protected resources) directly - and this is enforced by the compiler or language runtime. Application code can. If you want a library to interact with a file, you first open the file in application code and pass the file handle to the library. Similar wrappers can be made for directories, network sockets and so on. If I want express to listen on a specific port, I pass it that port as a capability object, not just an integer (e.g. app.listen(net.listenPort(8000))). The amount of code that would need to change is tiny, since well-behaved libraries don’t usually need many capabilities to do their job.
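To make that express example concrete, here’s a rough sketch of what a capability-wrapped port could look like. `ListenPort` and its `grant` method are invented names - in a real capability-secure language the runtime would control who gets to mint capabilities, not a public static method:

```typescript
// Hypothetical sketch of a capability-wrapped port. `ListenPort` and
// `grant` are invented; a real capability-secure runtime would restrict
// minting to application code rather than exposing a public method.
class ListenPort {
  private constructor(readonly port: number) {}
  // Stand-in for the runtime's minting authority (application code only).
  static grant(port: number): ListenPort {
    return new ListenPort(port);
  }
}

// Library code (e.g. a web framework): it can bind the port it was
// handed, but it has no way to forge a capability for some other port.
function startServer(cap: ListenPort): string {
  return `listening on ${cap.port}`;
}

// Application code decides exactly which port the library may use.
console.log(startServer(ListenPort.grant(8000))); // listening on 8000
```

The point of the private constructor is unforgeability: library code holding an integer `8000` still can’t manufacture the capability itself.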
It is ridiculous that my 3rd party json library can invisibly access all the sensitive files on my computer. With capabilities, we take that privilege away. I have no problem trusting a random json library if it literally only has access to the json string I pass it. If we had a capability based security model, I would have no qualms about pulling in hundreds of transitive dependencies. My libraries can still only access exactly what I pass them.
Unfortunately this solution requires language-level buy-in. It would be very hard to retrofit Rust to work this way, and that’ll probably never happen. Raw pointers and inline assembly also make a bit of a mess of things (though pointer provenance might help). But a man can dream… Maybe in Rust’s successor.
Yeah I'm really keen for that - though I think compiling a bunch of dependencies into individual wasm modules will be kind of awful to use in practice.
Apparently in Firefox they've been compiling some of their dependencies to wasm, then translating the wasm back to C. The net effect is that it gives C code wasm's sandboxing and safety. I wonder what it would look like to do that without needing wasm as a middle step.
You don't need the language to be involved to use this strategy, if you can rely on the OS. Lots of programs are structured this way already: open your secure port as root, then daemonize as an untrusted user.
However, I quite like the idea of using the application/library interface to enforce the restrictions structurally, rather than temporally.
> open your secure port as root, then daemonize as an untrusted user.
That works for server applications. Well - so long as bad things don’t happen in a static initializer before the process drops privileges. But I also want it to work for other kinds of applications. I want to be able to fearlessly depend on a library and not need to worry about that library attacking my computer. Then we can have our cake and eat it too with big dependency trees made up of lots of small libraries written by untrusted contributors.
Another nice thing about this is it doesn’t depend on OS level capability support. Once implemented in a language, it should be pretty easy to make it work on every OS.
Interestingly, this is something wasm provides out of the box. And Firefox’s weird “compile deps to wasm and convert the result back to C” trick also achieves this end.
Creator of Packj [1] here. How do you envision sandboxing/security policies will be specified? Per-library policies become overwhelming when you have hundreds of dependencies. Having built an eBPF-based sandbox [2], I anticipate that accuracy will be another challenge here: too restrictive blocks functionality, too permissive defeats the purpose.
1. https://github.com/ossillate-inc/packj flags malicious/risky NPM/PyPI/RubyGems/Rust/Maven/PHP packages by carrying out static+dynamic+metadata analysis.
I’m not imagining something that specifies policies at all. Capability based security systems shouldn’t need them. Instead, the main application can create capabilities programmatically - for example by opening a file for reading. The file handle is a capability. Libraries you call don’t get access to the OS, or anything outside their module scope that hasn’t been passed in as an explicit parameter. A 3rd party library can access the file handle you pass it as a function argument, but it can’t ask the OS to open any other files on your computer. Or open a network socket.
OS APIs aren’t in a scope that any of the libraries you use have access to.
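A minimal sketch of that idea, with invented names (`Readable`, `openFile`, `countLines`) and a hypothetical capability-safe runtime assumed - the key point is that a library’s parameter list becomes its permission list:

```typescript
// Sketch of a capability-style file handle. `Readable` and `openFile`
// are invented names; the runtime enforcement is assumed, not shown.
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

interface Readable {
  read(): string;
}

// Application code: the only scope allowed to touch the OS.
function openFile(p: string): Readable {
  const data = fs.readFileSync(p, "utf8");
  return { read: () => data };
}

// "Library" code: in a real capability system this module couldn't
// import fs at all. It can read the one file it was handed - nothing else.
function countLines(file: Readable): number {
  return file.read().split("\n").length;
}

// Usage: the application opens the file; the library only ever sees
// the capability, never the path or the OS API.
const tmp = path.join(os.tmpdir(), "cap-demo.txt");
fs.writeFileSync(tmp, "one\ntwo\nthree");
console.log(countLines(openFile(tmp))); // 3
```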
There are a few things capabilities need to support. They need to be attenuable: if you have a file handle with RW flags, you can make a read-only copy of it. And they need to be unforgeable: you can’t conjure a file handle out of an integer file descriptor. We’d also need a handful more “file handle”-like objects - and this is where granularity matters. For example, an equivalent for directories that grants access to everything recursively within that directory and nothing outside of it.
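Attenuation can be sketched as plain wrapping. All names here (`RwFile`, `readOnly`, `makeRwFile`) are hypothetical, and a toy in-memory object stands in for a real OS handle:

```typescript
// Sketch of capability attenuation: derive a weaker capability from a
// stronger one. All names are invented for illustration.
interface ReadFile {
  read(): string;
}
interface RwFile extends ReadFile {
  write(s: string): void;
}

// Attenuate: wrap a read-write handle so only `read` survives. The copy
// can be handed to a library without leaking write access.
function readOnly(f: RwFile): ReadFile {
  return { read: () => f.read() };
}

// A toy in-memory "file" standing in for a real OS handle.
function makeRwFile(initial: string): RwFile {
  let contents = initial;
  return {
    read: () => contents,
    write: (s) => {
      contents = s;
    },
  };
}

const rw = makeRwFile("secret");
const ro = readOnly(rw);
console.log(ro.read());     // secret
console.log("write" in ro); // false - the attenuated handle can't write
```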
If the library I call delegates some work to another library, it’s pretty simple. It just passes its file handle through to the other library as a function argument. There’s no policy to maintain or update. Nothing even changes from the point of view of the original caller.
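A sketch of that delegation, with invented library names (`validate`, `parseConfig`) - library A forwards its capability to library B, and the application never has to know:

```typescript
// Delegation needs no policy: a library just forwards the capability
// it received. All names here are invented for illustration.
interface Readable {
  read(): string;
}

// Library B: sees only what library A forwards to it.
function validate(file: Readable): boolean {
  return file.read().length > 0;
}

// Library A: delegates by passing its own capability straight through.
function parseConfig(file: Readable): string[] {
  if (!validate(file)) throw new Error("empty config");
  return file.read().split("\n");
}

// Application code: mints one capability; both libraries share it.
const cfg: Readable = { read: () => "host=localhost\nport=5432" };
console.log(parseConfig(cfg)); // [ 'host=localhost', 'port=5432' ]
```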
The tricky part is stuff like debug output, environment variables (NODE_ENV=production) and stdout. Do libraries implicitly get access to stdout or not? Maybe they get stderr but not stdout, and can open new temp files? In JavaScript you would also need to stop libraries from arbitrarily messing with the global scope or shared built-ins like String.
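For the JavaScript global-scope problem, freezing the shared intrinsics is one piece of the answer - a tiny toy version of what SES / Hardened JavaScript’s `lockdown()` does far more thoroughly:

```typescript
// Toy sketch of locking down shared JS intrinsics so a library can't
// booby-trap methods every other module relies on. SES's lockdown()
// does this comprehensively; this just freezes two prototypes.
Object.freeze(String.prototype);
Object.freeze(Array.prototype);

// A malicious dependency now can't monkey-patch a shared method:
try {
  (String.prototype as any).trim = () => "pwned";
} catch {
  // the assignment throws in strict mode; it fails silently otherwise
}
console.log("  hello  ".trim()); // still "hello" - the patch didn't stick
```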
But honestly, any answer to these problems would be a massive improvement over what we have now. Being able to know that your binary search library can’t make network requests, and that your database library can’t touch files outside its database directory, would do so much for supply chain security it’s not funny. A good capability system would only change the APIs your software uses a little - and generally in good ways that make dependencies more explicit. Done right, there shouldn’t be any new syntax to learn, or deno-style flags to pass at the command line, or anything like that.