> Personally I'd much rather extra security to enable untrusted containers access to the hosts fs is implemented in the container runtime, not as a separate component. Or if the "security issues" it addresses perhaps even in the hosts operating system?
Isn’t that exactly what the original gofer/RPC solution is? The gvisor container runtime operates in userland to ensure that compromises in the runtime don’t result in an immediate compromise of the system kernel.
But running in userland and intercepting syscalls that do IO always has significant performance implications, because all your IO now needs multiple copy operations to get into the reading process address space, because userland process generally can’t directly interact with each other address space (to ensure process isolation), without asking the kernel to come in to do all the heavy lifting.
So if you want fast local IO, you have find a way to allowing the untrusted processes in the container to make direct syscalls, so that you can avoid all the additional high latency hops in userland, and let the kernel directly copy data into the target processes address space.
To magically allow the container runtime to provide direct host fs access itself, with native level performance, that would require the runtime to be operating as part of the kernel. Which is exactly how normal containers work, comes with a whole load of security risks, and is ultimately the reason gvisor exists.
Isn’t that exactly what the original gofer/RPC solution is? The gvisor container runtime operates in userland to ensure that compromises in the runtime don’t result in an immediate compromise of the system kernel.
But running in userland and intercepting syscalls that do IO always has significant performance implications, because all your IO now needs multiple copy operations to get into the reading process address space, because userland process generally can’t directly interact with each other address space (to ensure process isolation), without asking the kernel to come in to do all the heavy lifting.
So if you want fast local IO, you have find a way to allowing the untrusted processes in the container to make direct syscalls, so that you can avoid all the additional high latency hops in userland, and let the kernel directly copy data into the target processes address space.
To magically allow the container runtime to provide direct host fs access itself, with native level performance, that would require the runtime to be operating as part of the kernel. Which is exactly how normal containers work, comes with a whole load of security risks, and is ultimately the reason gvisor exists.