It all depends on what your goal is. Allowing to run VLC as a bundle with more or less full access is certainly possible. However, that is not really "sandboxed" in any fashion.
For instance, once any client has any kind of X11 access, it can snoop on any keyboard events, including your password at the unlock screen.
If we want to support the full functionality of a media player in a sandboxed way, we need to start looking into each requirement and designing a safe way to access each item. This is going to be a lot of work, but I don't see any way around that.
For your exact list:
File access can either be granted to the app fully or partially. But we also want some kind of file selector service that runs in the session (outside the sandbox) and grants some kind of access to the files the user chose.
Raw device access will not happen by just having the app open the raw device nodes. Instead we'll have some kind of service in the session that (via user interaction or "remembered" grants from the user) virtualizes access to these things. This could be anything from e.g. passing a file descriptor of the opened DVD device, to a complete replacement of the subsystem. For an example of the latter, for webcams, see the pulsevideo project: https://github.com/wmanley/pulsevideo
Is raw audio output necessary? Why doesn't PulseAudio work?
OpenGL access is supported
Network access is allowed (optionally, but I think this will be on for most apps)
Well, we've seen sandboxes on other desktop platforms (OSX, WinRT, ChromeOS), and so far, they all are horribly limiting. So, I'm making sure the same does not happen on Linux, before we get kicked out of our own platform.
> kind of file selector service that runs in the session (outside the sandbox) that grants some kind of access to files the user chose.
This is not enough, as explained above.
> Raw device access will not happen by just having the app open the raw device nodes. Instead we'll have some kind of service in the session that (via user interaction or "remembered" grants from the user) virtualizes access to these things.
Well, this is a deal breaker so far. Are you going to do a pulsedvd, a pulsecd, a pulsedvb, a pulsesdi for all the access modules? Playing encrypted DVDs requires direct access, as far as I know.
Something using GStreamer in Vala to get indirect access to webcams? I don't see how this could even work: how do you control brightness or other webcam settings from two applications, and how do you get direct H26x access with a synchronized preview? (And asking someone to use a competing project to get video input is also a bit rude, but that's beside the subject.)
> Why doesn't PulseAudio work?
libpulse requires X, as far as I know.
> OpenGL access is supported
Through Wayland? How do I get YUV surfaces? What about overlays? What about VDPAU/VAAPI? I hope this will see a lot of improvement, because Wayland and wl_scaler (et al.) are very limited so far.
We'll see how it fares, but in past interactions the answers were pretty dismissive, so I'm not that optimistic about the outcome.
>> kind of file selector service that runs in the session (outside the sandbox) that grants some kind of access to files the user chose.
>This is not enough, as explained above.
Well, it's hard to say that, as this kind of stuff is not designed yet. Better to say that VLC's requirements on this design are more complex than just "allow access to a single picked file" semantics. Hopefully we can take this into account when we look into the details here.
> Well, this is a deal breaker so far. Are you going to do a pulsedvd, a pulsecd, a pulsedvb, a pulsesdi for all the access modules? Playing encrypted DVD requires direct access, as far as I know.
Some of these need not necessarily be as complex as you imagine though. They could very well be some kind of thin service that enumerates the real devices and hands over the opened FD to your app after having verified that it is allowed.
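To make the "thin service" idea concrete, here is a minimal sketch of handing an already-open file descriptor across a Unix socket with SCM_RIGHTS (via Python's `socket.send_fds`/`socket.recv_fds`, available since 3.9). The `/tmp/fake_dvd` file is a made-up stand-in for a real device node, and the grant check is left out:

```python
import os
import socket

def service(sock, path):
    """Privileged side: open the device, pass only the FD to the app."""
    fd = os.open(path, os.O_RDONLY)
    socket.send_fds(sock, [b"ok"], [fd])  # kernel duplicates the FD into the peer
    os.close(fd)                          # our copy is no longer needed

def sandboxed_app(sock):
    """Unprivileged side: receive an already-open FD, never the raw node."""
    msg, fds, flags, addr = socket.recv_fds(sock, 1024, 1)
    with os.fdopen(fds[0], "rb") as f:
        return f.read(16)

# Demo with a plain file standing in for a device node:
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
with open("/tmp/fake_dvd", "wb") as f:
    f.write(b"DVD-VIDEO")
service(a, "/tmp/fake_dvd")
print(sandboxed_app(b))  # b'DVD-VIDEO'
```

A real service would of course check the user's (possibly "remembered") grant before opening; the point is simply that the app never touches the /dev node itself.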
> libpulse requires X, as far as I know.
libpulse can optionally be built with X support, but this is really only used to read back the PulseAudio socket name from the X root window property. It works just fine without X (witness the non-X, PulseAudio-using demo in the blog post, for instance).
OpenGL access is via DRI, but yeah, output to the screen will have to happen via Wayland. That could use subsurfaces with YUV surfaces, and it is then up to the compositor to use overlays if possible (I know Weston does this, for instance).
Maybe I'm missing it but I don't see how DRI is really sandboxing OpenGL.
Note: I helped write the OpenGL sandbox in Chrome, and it's certainly possible I don't know DRI well enough to understand how it sandboxes. But I do know what we had to do to sandbox OpenGL in Chrome: rewriting shaders, since we can't trust the driver; clearing buffers, since OpenGL doesn't and you could otherwise read other processes' data; and many other things. As far as I can tell, DRI isn't doing things like that.
At the moment we grant all of /dev/dri, but long term we want only the render nodes accessible. That gives us no modesetting or DRM master capabilities, only rendering.
It is true, though, that the drivers could very well have leaks in them, but the userspace sandbox is not the place to fix that; the drivers themselves are. The intent is for the DRI driver APIs to give guarantees about client separation, but I'm sure it needs work.
The drivers will never be fixed. Chrome doesn't trust them. It's not in their interest to fix them since sales are determined by performance not by security.
I'm pretty sure DRI doesn't do anything special to the OpenGL calls, so they just get punted into the scary binary client library from your GPU vendor.
> Hopefully we can take this into account when we look into the details here.
This is all I'm asking, and not the usual "I don't care about your usecase, you should just use GStreamer" that we get from the Gnome/RedHat/Systemd team.
Yeah, the possibility of a "gatekeeper effect" is what's worrisome here. There's no way you can anticipate everything a user might want to do with their computer, so while opt-in sandboxing can only help, mandatory sandboxing could stifle the creativity of developers who must go through a gatekeeper (even a benevolent one) any time they want to ship a program that uses hardware or software abstractions in some unforeseen way.
Or alternatively, Joe User goes "[grumble grumble] guess I gotta delete that dang PulsePolicyD that's keepin' me from runnin' muh binaries [copies and pastes the first command found when googling "how i run X" into shell]" and it's all for naught.
> I'm making sure the same does not happen on Linux, before we get kicked out of our own platform.
To make one thing very clear:
Sandboxing is partly intended to be an additional way to distribute software. The majority of packages come from your distribution. This sandboxing (together with other bits) would eventually allow you to distribute your application to 80% of users across the various Linux distributions. I think it'll take some time, because there are a lot of things to work out (e.g. relying on Wayland, kdbus, changing PulseAudio to use kdbus, etc.).
A user should have control over what an application can do. That's something new, and users might strongly prefer such applications (e.g. ones that can't just modify ~/.bashrc and add some evil sudo alias if they feel like it).
IMO, at most you'd have some packaging software which shows "unrestricted access" for VLC. The intention is to offer the user more security where possible, while at the same time not making things too difficult for the developer.
> To make one thing very clear: Sandboxing is partly intended to be an additional way to distribute software. Majority of the packages come from your distribution.
This is not what we understand from your GnomeOS talks and systemd "distributions are obsolete" posts.
>> kind of file selector service that runs in the session (outside the sandbox) that grants some kind of access to files the user chose.
> This is not enough, as explained above.
If the file chooser is powerful enough, couldn't it be?
I don't use VLC playlists at all, but I could e.g. grant read-only ~/music/.../*.mp3 (edit: apparently I can't double star) access, and that'd be enough for my music playlists as far as I'm concerned.
> Experience from OS X and WinRT is that they aren't powerful.
And I'm not suggesting that the current state of things is sufficient.
> What about mkv linked files or .mpc lossless complements? Or cue/bin complements?
> What about m3u that have mp3, ogg and flac interleaved?
In my case? As I don't use these formats, these are signs that bad things are happening (fishing for file-format-specific overflows in the parser?), and it is that simple.
Presumably, if I did use those formats, the ability to grant ~/music/.../*.{mp3,ogg,flac} or ~/music/ is not a huge step from granting access to *.mp3. If it is, your codebase is suffering from some serious bitrot... I sympathize, but whitelisting isn't the problem at that point.
We already deal with some of this, with dialogs prompting us about which file types we want to associate with our media players. Reuse and retool those, profit?
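For illustration, a pattern grant of the kind being discussed is only a few lines with fnmatch-style matching (note that fnmatch's `*` crosses directory separators, unlike shell globbing); the paths and grant patterns here are made up:

```python
from fnmatch import fnmatchcase

def allowed(path, grants):
    """True if the path matches any granted pattern."""
    return any(fnmatchcase(path, g) for g in grants)

grants = ["/home/u/music/*.mp3", "/home/u/music/*.m3u"]
print(allowed("/home/u/music/albums/track.mp3", grants))  # True ('*' spans '/')
print(allowed("/home/u/music/track.ogg", grants))         # False
```

The hard part isn't the matching, of course, but deciding when and how such grants get created and shown to the user.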
> In my case? As I don't use these formats, these are signs that bad things are happening (fishing for file-format-specific overflows in the parser?), and it is that simple.
You don't use mkv or playlists with mixed content? You don't use subtitles?
> the ability to grant ~/music/.../*.{mp3,ogg,flac} [...] If it is, your codebase is suffering from some serious bitrot... I sympathize, but whitelisting isn't the problem at that point.
Your first idea is that our codebase is bitrot? Seriously?
> which file types we want to associate with our media players. Reuse and retool those, profit?
And you know, we haven't relied on extensions for format detection in a long time.
Sorry, but your comment is very arrogant (borderline insulting), and you speak of things you don't understand.
> You don't use mkv or playlists with mixed content? You don't use subtitles?
No MKV, mp3-only playlists, and the only subtitles I use are those embedded within the .avi / .mp4. I'll certainly admit I'm not a VLC power-user.
> Your first idea is that our codebase is bitrot? Seriously?
You've chopped far too much context. I'm assuming your codebase supports neither whitelist in its current form - reasonable, as nothing needs it yet. That is my first idea.
I'm also meaning to imply that you're overstating how difficult it would be, if your codebase did support interacting with the sandbox to request .mp3 whitelisting, to request other formats as well. That is my second idea.
My third idea, at best, is that your codebase has bitrot. This is not me intending to slight you, your project, your contributors, your former contributors, or any of the choices involved. This is not me saying I am better than any of that. This is me - someone who spent half of his last job dealing with bitrot, some caused by my own hand - trying to relate, while trying to identify where such a hypothetical, assumed-to-be-serious problem would actually lie.
> And you know, it's been a long time we don't rely on extensions for format detection.
You might not, but Windows still does. Is this dialog gone?
> Sorry, but your comment is very arrogant (borderline insulting), and you speak of things you don't understand.
I apologize if it came off that way, and for touching a nerve. But please understand me: the sin you're looking for in me is Envy, not Arrogance - envy of those who've had the leisure of dealing only with pristine codebases without serious problems.
> Is raw audio output necessary? Why doesn't PulseAudio work?
I don't know if this applies to VLC, too, but raw audio output (or something that isn't PulseAudio) is necessary wherever low (or at least constant) latency is required.
Edit: oh - now that I look at your username, I believe congratulations are in order re. the subject of this thread :)?
That link seems to encourage raising the latency, which suggests the needs of people like musicians are not being considered at all. Your use-cases are not the only use cases that exist.
The entire world of music production uses JACK, which got the design right well before PulseAudio even existed. It can provide synchronous operation to prevent drifting between audio tools; I include mplayer2 as one of those tools for a few things. More importantly, JACK guarantees a reliable, very, very low-latency audio path.
How "low latency" do I mean? Consider that in printed music, while somewhat uncommon, quite a few pieces have been written that use 128th notes. That would be anywhere from ~80ms to ~10ms note lengths, depending on the tempo. Larger latencies mean notes being heard as a different note from what was played. I keep JACK down to about 5ms latency, and would set it lower if I could afford new hardware. Notes are not the only feature of music, and some headroom is needed. Any latency higher than about 10-20ms I would consider broken and unusable.
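For what it's worth, the arithmetic behind those figures checks out (assuming the quarter note gets the beat; the tempos are just examples spanning the usual range):

```python
def note_length_ms(bpm, division):
    """Duration in ms of a 1/division note, with the quarter note as the beat."""
    quarter_ms = 60_000.0 / bpm
    return quarter_ms * 4.0 / division

print(note_length_ms(24, 128))   # 78.125 -> ~80ms at a very slow tempo
print(note_length_ms(240, 128))  # 7.8125 -> ~10ms at a very fast tempo
```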
(Incidentally - I cannot play Guitar Hero/Rock Band on most TVs either, because of horrible latency. At 60Hz video, a two-frame delay is a "miss" on the harder songs. Those 16.6ms latencies are nasty regardless of which direction the latency is added.)
Beyond that, JACK is a great API. It makes it trivial to do a LOT of the common needs (i.e. "Just give me the audio samples..." or "just let me write new audio data and not have to worry about anything else... such as timing")
Raising the latency is what you want in many cases. For instance, the current discussion is about a video player. There is no need for a video player to submit new data every 10ms, because it knows exactly what it will be playing for the next N seconds (depending on the buffer size it uses). Waking up every 10ms just causes more cpu and power use and risks missing the deadline.
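The trade-off is easy to put in numbers. A rough sketch, assuming CD-quality stereo S16 audio (the rate and sample format are just example values):

```python
def buffer_bytes(latency_ms, rate=44100, channels=2, bytes_per_sample=2):
    """Buffer size needed to cover a given latency target."""
    return int(rate * channels * bytes_per_sample * latency_ms / 1000)

print(buffer_bytes(10))    # 1764   -> refilled ~100 times per second
print(buffer_bytes(2000))  # 352800 -> refilled every ~2 seconds
```

A 2-second buffer costs only a few hundred kilobytes of memory but cuts wakeups by two orders of magnitude, which is exactly why a video player wants the larger figure.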
Of course, there are cases where you do need lower latencies, such as games where interactive feedback cause the sound to change, or when you pause a movie. In those cases you adjust the latency (possibly temporarily in the case of the pause).
That said, PulseAudio has a different set of requirements, and thus design decisions, from JACK. It is not meant to be used in music production, but then again, I doubt music producers will use some sandboxed app for music production either. So I don't see the problem here.
I think VLC also does recording, but I'm certainly in no position to comment on it. I'm not familiar with their codebase and I don't want to lie :).
I don't know what PulseAudio considers "very low latency" and there aren't any numbers on that page to indicate it, but in my experience, PA's best latency possibilities are about an order of magnitude away from being useful for any kind of hardcore real-time processing. This isn't a problem as far as PA is concerned since it's not what it was conceived for, but it won't cut it for applications which need that kind of capability.
PA also doesn't really provide latency guarantees, either. In fact, it's mentioned on that page:
> So in summary: tell PA what latency you want, but program defensively so that you can deal with getting both lower or higher measured latencies.
I don't know if that has improved though. The last time I had to poke PulseAudio was a few months before that page's "Last Edited" timestamp.
Edit: For what it's worth, I never managed to get PA to consistently keep latency below 25-30 ms, which is at least 10-15 ms away from being adequate for real-time audio processing. I guess it is a corner case though, and probably outside the Gnome project's interests. Reading the other comments in this thread, I assume VLC's problems with PA aren't related to latency.
Just to clarify: no one cares whether a video player has low-latency audio, as long as the audio latency matches the video latency. It totally sucks when the movement of an actor's mouth doesn't line up with the audio.
The only place where low latency matters is real-time encoding situations: for example, video conferencing, composing music, or streaming things live. But even in those situations, "low" is a relative term. Low-latency video conferencing is anything under 200-500ms (depending on who you ask); low-latency music composition is <20ms (again, depending on who you ask).
> For instance, once any client has any kind of X11 access, it can snoop on any keyboard events, including your password at the unlock screen
Is this true? Can you provide a proof of concept that can attack xscreensaver in this way? I was not under the impression that xscreensaver in particular is vulnerable to such an attack. I would like to see the code.
> Is raw audio output necessary? Why doesn't PulseAudio work?
Funny story, I have never gotten pulseaudio to successfully play any audio. Every so often I read somewhere that pulse makes audio scenario xyz really easy. I try it out and I cannot get any audio samples out to the sound card, period. The only other time I hear about pulse is when I'm telling people to kill it, which ends up fixing all their audio problems. So no thanks to pulse.
Meanwhile I can't help but think that Unix already has a security model for talking to devices, it is called enforcing security at open(2). What is bad about letting an app talk to alsa if it needs audio? If you can provide an example, is that not a privilege escalation in alsa that should be fixed?
It just strikes me that your article here is a technical solution in search of a problem. Apple did sandboxing so it must be right, huh?
> What is bad about letting an app talk to alsa if it needs audio?
In general, audio output is not really a problem. However, input is much more problematic. For instance, your app could listen on your microphone and send the data over the network. Permissions on the audio device don't work here, especially considering that such permissions are per-user, not per-app.
Overall, the Unix security model is pretty shit for the desktop. It's all about protecting root, or other users, from the user. However, if I'm on a single-user laptop, that is not overly interesting. The much more interesting part is protecting the user from the applications they run - for instance, being able to run a game without it ever having the possibility of reading my personal email or web history.