How we served 20k IPython notebooks for Nature readers (rackspace.com)
95 points by e12e on Oct 9, 2015 | 6 comments



That's fantastic, but are there security concerns?

I was thinking about setting up a similar IPython-as-a-service thing, but 5 minutes on Google suggested that it would be difficult to do it securely.

This isn't limited to data security; it also covers DDoS concerns, i.e. making sure that each user doesn't take up too much CPU/RAM/HDD.

On second thought, I'm sure it's possible, but it certainly wasn't trivial.


It might be better to look at rkt with a KVM backend if you want/need real isolation and resource limits. Maybe Docker will provide isolation/security at some point, but the core differentiating thing about Docker vs. OpenVZ etc. is ease of use, explicitly forgoing much of the security and isolation possible in the underlying technology.

From what I've seen, trying to bolt security and isolation back onto Docker is generally a bad idea.

That said, when people have done the work to make microservices deployable under Docker, they have in general laid the groundwork needed to run these services in a truly isolated environment. So in that respect Docker is great; the effort to run something that works in a minimal Docker container in a jail/chroot/vm should be much less than starting from scratch; a lot of the assumptions about resources/dependencies should already be fixed.


Well, you are piping remote code directly into a Python kernel,[0] so the user can always run arbitrary code with local rights. It is possible to limit those rights far enough to prevent anybody from doing something bad: use cgroups to limit network bandwidth, file I/O, memory, and CPU usage, so that users can't DDoS your server or attack others through the IPython service.
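
As a rough illustration of capping what a single user's code can consume, here is a minimal Python sketch. Note that it uses POSIX rlimits via the standard `resource` module rather than cgroups, so it only caps address space and CPU time for one child process and says nothing about network or disk I/O. The helper name `run_limited` and the default limits are hypothetical, and it assumes a Linux host.

```python
import resource
import subprocess
import sys


def run_limited(code, mem_bytes=512 * 1024 * 1024, cpu_seconds=5):
    """Run `code` in a child Python process with rlimits applied.

    Hypothetical helper: caps virtual address space and CPU time only.
    This is NOT a sandbox; the child still runs with the caller's rights.
    """

    def apply_limits():
        # Runs in the child between fork() and exec().
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))

    return subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=apply_limits,
        capture_output=True,
        text=True,
        timeout=30,  # wall-clock backstop in the parent
    )
```

For example, `run_limited("bytearray(2 * 1024**3)")` should end with a MemoryError in the child under the default 512 MiB cap. Something like this would complement cgroups and VM isolation, not replace them.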

Second, defense in depth: you want some defense in place even in the presence of an exploit. The usual first line of defense would be "don't run arbitrary code", but that line is broken by design for IPython-as-a-service, since the user can just copy the latest exploit into the IPython notebook. This suggests virtualization: with one VM per user, the user needs to break out of the virtualization and out of the cgroups jail mentioned above before they can do anything interesting. (It also has the benefit that user management is somewhat simplified.)

So overall, I guess it can be done, but it needs a focus on security.

[0] This suggests the notion of an interpreter breakout. Yes, this is a joke.


Docker has basic CPU and memory limits available, and they are used in tmpnb:

https://github.com/jupyter/tmpnb/blob/44ff714e6e65897547dac5...

I'm pretty sure you can also tweak IO scheduling with cgroups.
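
To make that concrete, here is a small Python sketch that assembles a `docker run` command line with memory, CPU-share, and block-I/O weight limits. The flags `--memory`, `--cpu-shares`, and `--blkio-weight` are standard `docker run` options, but the helper name `docker_run_args`, the default values, and the image name `example/notebook` are placeholders I made up; this is not how tmpnb itself is written. Whether `--blkio-weight` has any effect depends on the host's cgroup setup and I/O scheduler.

```python
def docker_run_args(image, mem="512m", cpu_shares=512, blkio_weight=300):
    """Build an argv list for `docker run` with basic resource limits.

    Hypothetical helper; all defaults are illustrative only.
    """
    # Docker accepts block-I/O weights between 10 and 1000.
    if not 10 <= blkio_weight <= 1000:
        raise ValueError("blkio_weight must be between 10 and 1000")
    return [
        "docker", "run", "--rm", "--detach",
        f"--memory={mem}",                 # hard memory cap
        f"--cpu-shares={cpu_shares}",      # relative CPU weight
        f"--blkio-weight={blkio_weight}",  # relative block-I/O weight
        image,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so no Docker daemon is needed.
    print(" ".join(docker_run_args("example/notebook")))
```

The list could be handed to `subprocess.run` on a host with Docker installed. Note that CPU shares are a relative weight rather than a hard cap, so they only matter when containers are competing for CPU.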


SageMathCloud (https://cloud.sagemath.com), which was mentioned in the first line of the rackspace post, had 850 concurrent users at some point yesterday. Our typical load is that high now. I don't think we've exceeded 1000 simultaneous users, but will likely do so in the next week or two.


There's a weird bug - when I click it to go fullscreen it does for a second, but then it reverts to just the left side of the screen.

[EDIT] - screenshot of bug: http://imgur.com/kXoMtvM [/EDIT]



