
I - and many others - don’t think this is enough to mitigate supply chain attacks. It’s too easy for a malicious developer to add cryptolocker ransomware in a point release. But I also don’t think having a package in Debian will somehow magically solve this problem either. Debian package maintainers do not perform - and never have performed - security audits on the packages they add. It’s magical thinking to assume using packages from Debian will somehow protect you from supply chain attacks. Asking developers to keep their dependency trees small so we can minimise the risk is a losing battle. Using lots of small, popular dependencies is just too convenient.

There is a solution out there that I wish got more traction: capability based security. The idea is simple: we partition programs into application code and library code. Library code can never access the filesystem or network (or other protected data) directly, and this is enforced by the compiler or language runtime. Application code can. If you want a library to interact with a file, you first open the file in application code and pass the file handle to the library. Similar wrappers can be made for directories, network sockets and so on. If I want Express to listen on a specific port, I pass it that port in a capability object, not just an integer (e.g. app.listen(net.listenPort(8000))). The amount of code that would need to change is tiny, since well behaved libraries don’t usually need many capabilities to do their job.
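
Roughly, the shape of it in TypeScript might look like this - ListenCapability and its methods are hypothetical, not a real Node or Express API; the sketch only illustrates passing a capability object rather than an integer:

    // Hypothetical sketch - ListenCapability is made up for illustration.
    import * as net from "node:net";

    class ListenCapability {
      #server: net.Server;                      // hidden from library code
      private constructor(server: net.Server) { this.#server = server; }

      // Only application code mints the capability, by actually binding the port.
      static open(port: number): ListenCapability {
        const server = net.createServer();
        server.listen(port);
        return new ListenCapability(server);
      }

      // The only thing a holder of the capability can do: register a handler.
      onConnection(handler: (socket: net.Socket) => void): void {
        this.#server.on("connection", handler);
      }
    }

    // Application code (trusted) creates the capability...
    const cap = ListenCapability.open(8000);

    // ...and library code (untrusted) can serve that port and nothing else.
    // In a capability-safe runtime it couldn't import "node:net" at all.
    function runServer(listen: ListenCapability) {
      listen.onConnection((socket) => socket.end("hello\n"));
    }
    runServer(cap);

The port number itself never leaves application code; all the library ever holds is an object it can’t use to reach anything else.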

It is ridiculous that my 3rd party json library can invisibly access all the sensitive files on my computer. With capabilities, we take that privilege away. I have no problem trusting a random json library if it literally only has access to the json string I pass it. If we had a capability based security model, I would have no qualms about pulling in hundreds of transitive dependencies. My libraries can still only access exactly what I pass them.

Unfortunately this problem requires language level buy-in. It would be very hard to retrofit Rust to work this way and that’ll probably never happen. Raw pointers and inline assembly make a bit of a mess of things too (though pointer provenance might help). But a man can dream… Maybe in Rust’s successor.




This is similar to how WASI (the WebAssembly System Interface) works for WASM.


Yeah I'm really keen for that - though I think compiling a bunch of dependencies into individual wasm modules will be kind of awful to use in practice.

Apparently in Firefox they've been compiling some of their dependencies to wasm, then translating the wasm back to C. The end effect of all that is that it gives C code wasm's sandboxing and safety. I wonder what it would look like to do that without needing wasm as a middle step.


You don't need the language to be involved to use this strategy, if you can rely on the OS. Lots of programs are structured this way already: open your secure port as root, then daemonize as an untrusted user.
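
For example, a minimal sketch of that pattern in Node - assuming a POSIX system, a process started as root, and "nobody" as a placeholder unprivileged account:

    // Sketch of "bind the privileged port first, then drop privileges".
    import * as http from "node:http";

    const server = http.createServer((req, res) => res.end("ok\n"));

    // Ports below 1024 need root, so bind while we still have it.
    server.listen(443, () => {
      // Drop the group first, then the user, so the group can't be regained.
      // (setgid/setuid are the Node globals' POSIX-only methods.)
      process.setgid?.("nobody");
      process.setuid?.("nobody");
    });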

However, I quite like the idea of using the application/library interface to enforce the restrictions structurally, rather than temporally.


> open your secure port as root, then daemonize as an untrusted user.

That works for server applications. Well - so long as bad things don’t happen in a static initializer before the process drops privileges. But I also want it to work for other kinds of applications. I want to be able to fearlessly depend on a library and not need to worry about that library attacking my computer. Then we can have our cake and eat it too with big dependency trees made up of lots of small libraries written by untrusted contributors.

Another nice thing about this is it doesn’t depend on OS level capability support. Once implemented in a language, it should be pretty easy to make it work on every OS.

Interestingly, this is something wasm provides out of the box. And Firefox’s weird “compile deps to wasm and convert the result back to C” approach also achieves this end.


Creator of Packj [1] here. How do you envision sandboxing/security policies will be specified? Per-library policies will become overwhelming when you have hundreds of dependencies. Having built an eBPF-based sandbox [2], I anticipate that accuracy will be another challenge here: too restrictive blocks functionality, too permissive defeats the purpose.

1. https://github.com/ossillate-inc/packj flags malicious/risky NPM/PyPI/RubyGems/Rust/Maven/PHP packages by carrying out static+dynamic+metadata analysis.

2. Sandboxing file system w/o superuser privileges: https://github.com/sandfs/sandfs.github.io


Cool library!

I’m not imagining something that specifies policies at all. Capability based security systems shouldn’t need them. Instead, the main application can create capabilities programmatically - for example by opening a file for reading. The file handle is a capability. Libraries you call don’t get access to the OS, or anything outside their module scope that hasn’t been passed in as an explicit parameter. A 3rd party library can access the file handle you pass it as a function argument, but it can’t ask the OS to open any other files on your computer. Or open a network socket.
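
Something like this, as a sketch - countLines stands in for any third party library function, and the point is what it can’t do:

    // Today nothing stops a library from importing "node:fs" itself; in a
    // capability-safe runtime only application code would have that ambient
    // access, and the library would only see the handle it is passed.
    import { open, type FileHandle } from "node:fs/promises";

    // Application code (trusted): mints the capability by opening the file.
    async function main() {
      const log: FileHandle = await open("/var/log/app.log", "r");
      console.log(await countLines(log));    // the library gets this one file
      await log.close();
    }

    // Library code (untrusted): can read the handle it was given, but has no
    // way to open ~/.ssh/id_rsa or a network socket of its own.
    async function countLines(file: FileHandle): Promise<number> {
      const text = await file.readFile({ encoding: "utf8" });
      return text.split("\n").length;
    }

    main();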

OS APIs aren’t in any scope that the libraries you use have access to.

There are a few things capabilities need to be able to do. They need to support being tightened (so, for example, if you have a file handle with RW flags, you can make a copy of it that’s read only). And they need to be unforgeable (file handles can’t be created from an integer file descriptor). We’d also need a handful more “file handle”-like objects - and this is where granularity matters. For example, we need an equivalent for directories that gives you access to everything recursively in that directory and nothing outside of it.

If the library I call delegates some work to another library, it’s pretty simple. It just passes its file handle through to the other library as a function argument. There’s no policy to maintain or update. Nothing even changes from the point of view of the original caller.
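
As a sketch, the library-facing surface might be nothing more than a couple of interfaces. FileCap, DirCap and compactLog are made-up names for illustration, not an existing API:

    // Hypothetical capability interfaces capturing the properties above:
    // tightening, unforgeability, and directory-scoped access.
    interface FileCap {
      read(): Promise<string>;
      write?(data: string): Promise<void>;    // missing on an attenuated copy
      readOnly(): FileCap;                     // tightening: RW -> read only
    }

    interface DirCap {
      // Everything recursively under this directory, nothing outside it.
      openFile(relativePath: string): Promise<FileCap>;
      subdir(relativePath: string): DirCap;    // narrow further
      // Deliberately no fromPath()/fromFd() constructors: capabilities are
      // unforgeable, so a library can't conjure one from a string or an int.
    }

    // Delegation is just argument passing. compactLog stands in for a second
    // library the first one depends on - the original caller sees no change.
    declare function compactLog(log: FileCap): Promise<void>;

    async function indexDatabase(dataDir: DirCap) {
      const wal = await dataDir.openFile("wal.log");
      await compactLog(wal.readOnly());        // helper gets a read-only view
    }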

The tricky part is stuff like debug output, environment variables (NODE_ENV=production) and stdout. Do libraries implicitly get access to stdout or not? Maybe they get stderr but not stdout, and they can open new temp files? You would also need to prevent JavaScript code from arbitrarily messing with the global scope or global built-ins like String.

But honestly, any answer to these problems would be a massive improvement over what we have now. Being able to know your binary search library can’t make network requests, and that your database library can’t touch files outside its database directory, would do so much for supply chain security it’s not funny. A good capability system should just change the APIs that your software can use a bit - generally in good ways that make dependencies more explicit. Done right, there shouldn’t be any new syntax to learn or Deno-style flags to pass at the command line or anything like that.



