> Support for cgroup namespaces
This release adds support for cgroup namespaces, which provides a mechanism to virtualize the view of the /proc/$PID/cgroup file and cgroup mounts.
What are the advantages and practical applications of this?
It also allows you to mount the cgroupfs in an unprivileged user namespace (useful for rootless containers, something I'm working on in runC).
It's not entirely useful at the moment to be honest, since all it does is mask /proc/self/cgroup. But it could be extended later to also virtualise /proc/meminfo and /proc/cpuinfo to reflect the cgroup limits.
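For anyone who hasn't looked at the file being masked: each line of /proc/self/cgroup has the form "hierarchy-ID:controller-list:cgroup-path". Here is a rough Python sketch of parsing it. The names `CgroupEntry`, `parse_cgroup_file` and `read_own_cgroups` are made up for illustration. Inside a cgroup namespace the paths are shown relative to the namespace root instead of the host's hierarchy.

```python
from typing import List, NamedTuple


class CgroupEntry(NamedTuple):
    hierarchy_id: int
    controllers: List[str]
    path: str


def parse_cgroup_file(text: str) -> List[CgroupEntry]:
    """Parse 'hierarchy-ID:controller-list:cgroup-path' lines."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split at most twice, since the path itself may contain ':'.
        hierarchy_id, controllers, path = line.split(":", 2)
        entries.append(CgroupEntry(
            int(hierarchy_id),
            controllers.split(",") if controllers else [],
            path,
        ))
    return entries


def read_own_cgroups(path: str = "/proc/self/cgroup") -> List[CgroupEntry]:
    """Linux only: return the cgroup membership of the calling process."""
    with open(path) as f:
        return parse_cgroup_file(f.read())
```

Calling `read_own_cgroups()` on the host and again inside a container with its own cgroup namespace should show the difference in the reported paths.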
I'm currently trying to get a patch merged into cgroup core to allow for processes in a cgroup namespace to manage their own cgroups (without compromising the hierarchy limits). Hopefully it gets into 4.7. This will be exceptionally useful for cgroup hierarchies.
Boy, I hope the memory constraints for cgroups are getting better. I'm stuck on an ancient platform where cgroups are relatively new-ish (SuSE Linux Enterprise 11). We hit a huge anti-feature where fs cache eviction from a cgroup-memory-constrained process happened in the foreground and caused astonishing latency. Sounds like it was the intended design. Hopefully it's evolving to no longer be limited like that.
I'm not sure what you're referring to as an "anti-feature". By design, if a process requests memory when it's already at its limit, the kernel will wait until more memory is available -- which could require a writeback, if the contents of the pagecache are dirty.
Would you have preferred for the kernel to instead allow the process to exceed its memory limit, even if that means evicting something else outside the container? If so, why not just set the limit higher?
Sorry if some of this is already clear but I'll try and back up a bit and give more context.
Linux uses greedy caching for filesystems and it's usually pretty awesome. But by merely opting in to be in a memory-constrained cgroup, it is (was?) no longer able to reap those pages in the background. Perhaps it's because the kernel daemons that would ordinarily do that don't have the right context. Or in order to have it they'd need to have one instance for each cgroup dir/context.
Effectively what we found was that opting out of cgroups memory constraints meant that we would still cause the filesystem cache to get evicted in order to make room for our memory allocations but with significantly lower latency. I suspect that what might have been happening is that when cgroup-memory-bound, we faulted in a disk-backed page and that triggered logic that decided it needed to evict other pages in order to make room for this one. So far, so good. But then perhaps it got opportunistic and figured "since I can't know that I need to background-evict these pages I had better foreground-evict as many as I can now." This is all wild speculation on my part. But what was clear was that our process would suffer a multi-second stall while evicting pages. Disabling cgroups-memory constraints meant that we would not hit this same condition.
More background: our application would read mapped files into memory, transform the data then write it out over sockets. The file-reading-task would read files whose size exceeds the total system memory. All files were opened read-only, the pages were never dirtied. There were no swap devices.
I don't recall all the specifics but the gist we got from the distro was that "yeah, it works that way and it's not a bug."
All I wanted to do was make it so that the greedy process could eat up no more than half of system memory with stuff like the fscache or dumb allocations. That would mean that other processes would be much less likely to stall when doing memory allocations.
And I wish Sun had used a different (more widely accepted) license when open sourcing stuff, so that we would be using an enhanced version of Solaris Zones.
I don't know, but I remember reading from the involved Sun people that this wasn't the plan. Bonwick and Cantrill I believe. In the end what matters is the result more than your intention, when it comes to legalese. Sun could have gotten back to its former glory, but the lawyers somehow decided, maybe knowing they'd like to sell the company, that they needed a new license (CDDL). Or, you know, we're all just speculating and have no idea anyway. The end result for the wider tech ecosystem is painful. Had BTRFS been liberally licensed, it might be shared just as the graphics DRM bits are between Linux and the BSDs, and we would have a more mature BTRFS, maybe...
Don't think this has anything to do with anything plan9 can do that Linux can't already.
The barriers to enabling the magic plan9 has are more along the lines of making it feasible for unprivileged users to mount and create namespaces themselves, which is complicated (afaik) largely by the way privilege escalation happens in unix systems.
This just looks like it prevents some information from leaking out about the host system from inside a cgroup by inspecting the /proc/PID/cgroup file.
To a first approximation... zero. That is to say, I don't know any (and I've looked a bit) and the realities of the limited hardware support make it difficult to imagine. I don't know of anybody using Inferno either, but I'm less familiar with it.
I'd also dispute the suggestion that Plan9 is more advanced than Linux. It's different and Plan9 definitely does some interesting things, but that doesn't automatically make it more advanced. Ignoring that "advanced" is hard to quantify, I'd rather say that Plan9 is simply based on different concepts. It's like comparing Windows and Linux, albeit maybe not quite so extreme, as Plan9 and Linux have more in common and Plan9 is simply not as heavily developed as either OS.
Coraid used to build and sell systems based on Plan9, but I believe they don't exist anymore, though I don't mean to imply Plan9 was at fault, quite the contrary.
Thanks for the link. I was surprised to learn that the BATMAN protocol had reached mainline Linux, and version 5 no less. I hadn't realised quite how healthy the development of this protocol was.
Maybe a noob question, but does Linus himself review all these commits before merging to mainline? Or does each section (e.g. networking) have its own owners?
> This release adds Kernel Connection Multiplexor (KCM), a facility that provides a message based interface over TCP for accelerating application layer protocols. ... a common use case is to implement a framed application layer protocol running over TCP ...
This was discussed on HN a while back. Gee, it's too bad that SCTP hasn't found more popularity. Reliable datagram-based (or stream-based) messaging, with support for binding to multiple endpoints.
In any case, KCM sounds like it's worth exploring.
My thought exactly. I have always wondered why SCTP was not on the same level as UDP and TCP. It's really too bad, since it would be perfect for lots of higher level protocols, especially ones that don't require streaming (SMTP for example).
KCM seems really neat, though of course now we need a library that abstracts it away and has a fallback to regular TCP so that applications don't become Linux-only.
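For reference, the plain-TCP fallback would have to do its own framing in userspace. A minimal Python sketch of length-prefixed framing follows. To be clear, this is not the KCM API, and the 4-byte big-endian length header and the names `encode_frame` and `FrameDecoder` are assumptions picked for illustration.

```python
import struct
from typing import List

# Assumed wire format: 4-byte big-endian length, then the payload.
_HEADER = struct.Struct("!I")


def encode_frame(payload: bytes) -> bytes:
    """Prefix a message with its length so the receiver can find its end."""
    return _HEADER.pack(len(payload)) + payload


class FrameDecoder:
    """Reassemble whole messages from arbitrary TCP read chunks."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, data: bytes) -> List[bytes]:
        """Add received bytes; return every message completed so far."""
        self._buffer.extend(data)
        messages = []
        while len(self._buffer) >= _HEADER.size:
            (length,) = _HEADER.unpack_from(self._buffer)
            end = _HEADER.size + length
            if len(self._buffer) < end:
                break  # payload not fully received yet
            messages.append(bytes(self._buffer[_HEADER.size:end]))
            del self._buffer[:end]
        return messages
```

You would call `feed()` with whatever `recv()` returns and handle the messages it hands back. This buffering and boundary tracking is the part a message-based interface would take off the application's hands.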
USB 3.1 SuperSpeedPlus (10Gbps) support - this is by far the most important feature in my opinion, considering that USB Type-C (3.1) is projected to be the most abundant socket on the planet.
I wish they would finally find a solution that actually fixes USB storage sync lockups. Whatever USB storage you use and whether USB2 or USB3, it's very easy to lock up a machine when data is flushed and is slow to do so. Given all the asynchronicity in the kernel, it's surprising there's such a Windows-like global wait-until-I'm-finished unresponsiveness bug.
I've never experienced the entire machine locking up before (going back 10+ years), only IO becoming severely locked up when writeback can't keep up with new IO, and that has always happened on non-USB media too.
while sleep 10; do cat /proc/meminfo | grep Dirty; done
Watch the dirty cache data when this is happening, if you want to know for sure. Put a couple "sync;sync;sync"'s &&'ed on that cat, if you want to watch stimulating behaviour.
(More often than not, it's the device. Linux is dealing with it fine.)
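If you would rather log numbers than eyeball them, the loop above can be sketched in Python roughly like this. The function names are made up. It pulls the Dirty and Writeback fields (in kB) out of /proc/meminfo.

```python
import time
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Map each /proc/meminfo key to its integer value (kB where given)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def dirty_summary(text: str) -> Dict[str, int]:
    """Pick out the fields that matter while a flush is in progress."""
    info = parse_meminfo(text)
    return {key: info.get(key, 0) for key in ("Dirty", "Writeback")}


def watch(interval: float = 10.0, path: str = "/proc/meminfo") -> None:
    """Linux only: print the summary every `interval` seconds until interrupted."""
    while True:
        with open(path) as f:
            print(dirty_summary(f.read()))
        time.sleep(interval)
```

Run `watch()` in one terminal while copying to the slow device in another. A Dirty value that climbs and then drains slowly points at the device being the bottleneck.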
As I wrote above, the device is taking long to finish the flush operation, and that's nothing special, but making the machine not react to input and actually block everything else like, say, the network stack and thereby interrupt downloads, is not what I perceive as fine.
Linus manages the kernel and should decide to include it, as BFQ is just another option in a pluggable subsystem. BFQ is arguably more important than many things because it fixes bad performance and unresponsive behavior. I hope it's not that Jens Axboe has too much influence because he's the block layer maintainer.
Deadline so far in limited testing doesn't seem to cause lockups as CFQ does and it seems to more realistically reflect the workload in the form of lower MB/s during such phases. Thanks for the tip.
With blk-mq + nvme and later lightnvm that whole IO-scheduler mess will hopefully go away over the course of the next couple years (at least for personal computers).
I recall seeing something like that when input devices were living on the same hub as the storage device getting IO at the time. But that was years ago, and the USB stack has seen considerable work since.
Edit: never mind, Johnp_'s comment and bugzilla link is highly interesting.
It feels like there were too many cooks involved this time round, trying to make the USB spec go in multiple directions at once.
Where before port and speeds were joined in one spec, now you have no less than 3 specs that can be mixed as the device OEM sees fit.
First is the 3.1 data spec, that can work on both A and C plugs.
Then you have the C plug spec, which as you point out does not require 3.1 data.
Then there is the Power Delivery spec, that is all about turning USB into a generic power delivery cable. And that again can be used with both A and C plugs, and with any and all data speeds.
Would you mind expanding upon this a little? I feel like there's a wonderful nugget of history here just waiting to be rediscovered by the broader hacker news community. Sounds like it could make a good story?
To add to the other posters, Slackware also included other Subgenius recruiting material in the source tree in /ap/gonzo. They stopped doing so for slackware 8.1, sadly. A link to the last version's material
They used to put a Dobbshead on the CDs, way back in the day. I remember my dad's Slackware CDs had the photos on them; I always thought they looked like Mr. Cleaver, naturally, since I watched that show as a kid.
I don't know what parent is talking about w/ README files. But Slackware Linux (slack... get it?) includes a link to the church in their install.end file. They started scrambling it in a recent release.
It occurred to me recently that if you were accused of breaking someone's rot13 'encryption', you could claim double-encryption to dodge the charge. Assuming they aren't thrown out of court for calling rot13 "encryption", of course.
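For the record, the "double-encryption" defence rests on rot13 being its own inverse. A quick Python sketch using the standard library's `rot_13` codec (the helper name `rot13` is just for illustration):

```python
import codecs


def rot13(text: str) -> str:
    """Rotate each ASCII letter by 13 places; other characters pass through."""
    return codecs.encode(text, "rot_13")


def double_rot13(text: str) -> str:
    """'Double encryption': applying rot13 twice gives back the input."""
    return rot13(rot13(text))
```

So `double_rot13(s) == s` for any string, which is the whole joke.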
It's actually amazing how many soft devs have no clue about the impact of the Linux, the command line, which they now call the 'CLI', true editors and fast home grown IDEs. A shift needs to take place at the core of software development. Linux made everything possible.
Ahh yeah, it was all Linux. The 40 years worth of computing history and foundational operating systems before Linus came on the scene had nothing to do with it.
Sarcasm noted, but think back to 1991. We had a half dozen mostly incompatible desktop OS's and around a dozen Unix-like variants (mac, pc, amiga, cpm, etc), mostly with proprietary hardware (hpux, aix, sun, apollo, dg, dec, etc). The internet and decent interop protocols were just getting started. (Remember DCE and RCP? Bleh.)
Linux now runs on almost _all_ of those with a mostly compatible API and ABI, from raspi to Z-mainframe, speaks to hundreds of weird hardwares, and hundreds more network protocols built in.
So does BSD though. To my mind it's accidents of history (in particular the AT&T lawsuit) that led to Linux being the thing, and if it hadn't been Linux it would've been something else. (Indeed I believe Torvalds said that if 386BSD had been going around Helsinki at the time there would never have been a Linux).
Absolutely true about the war that led to Linux going mainstream.
And btw, is there a list of such points in computing history, where a slightly different decision would have led to a different computing environment (i.e. "forks in time")? This would be a very good read. And a nice practical introduction to chaos theory.
One such moment was when IBM decided to sign the deal with Microsoft, giving them an exclusive license on operating systems for the IBM PC. As someone put it: had there been one person with a brain at IBM to proofread that deal, Microsoft would never have become such a big company as it is today. But there wasn't, so we ended up with Windows on 95% of PCs worldwide.
Let's be clear, the impact of Linux has been great, in many ways. I compiled my first Linux kernel in 1994, and I don't think I'd be the developer I am today without having had access to and experience with Linux. But come on, the tenor and tone of the parent comment is incredibly myopic, ignoring decades of legacy that Linux imitated. It's also a bit much to attribute the trend of hardware compatibility and binary independence (which is quite a bit sketchier IMO) to Linux. This was a trend that had been ongoing for years, and would have continued without the introduction of Linux. TBH we probably have Microsoft to thank for that more than anyone, by setting expectations across the consumer market that hardware be mostly-compatible with an OS.
I think you're reading too much into the comment. Your argument is that they're ignoring history. Just because there was something before Linux doesn't mean Linux hasn't changed things. The point about Linux seems valid to me, seeing as it's the dominant kernel in nearly every market.
You go on to praise Windows, as if it's okay to ignore history when it comes to MS.
Eh... I think people in HN get overly offended when people state the fact that Linux has been historically revolutionary.
> seeing as it's the dominant kernel in nearly every market.
That's disingenuous, not every market has the same weight. Microsoft owns the desktop market, and there are many countries where it also owns the enterprise/server market.
> people in HN get overly offended when people state the fact that Linux has been historically revolutionary
There is a big difference between saying "Linux was revolutionary" (considering the impact it had, absolutely true), and "Linux was revolutionary from a technical standpoint" (it wasn't).
How was Linux revolutionary? Unices had been around for years before that, and IIRC even the choice of a monolithic kernel vs a microkernel architecture was seen as a somewhat conservative choice. Culturally, it's not like free software hadn't been a movement for a decade already. IMO Linux's place in history was accidental, occurring because of the confluence of a number of external factors and trends. As another poster noted, if it hadn't been Linux, it would have been BSD or something else entirely. (All of which is not to downplay the value and contributions that Linux did make. But let's keep it in perspective.)
Downvoted because...? *nixes are a lot of things, but good from an OS design standpoint is not one of them.
AS/400. VMS. Plan9. Amiga. QNX. Even NT. All of these date to before Linux was created. (Strictly speaking, I believe Plan9's first public release was after Linux's, but work on Plan9 started at Bell in the mid 80s.)
VMS and NT handle async IO much better. QNX, and to an extent NT, are more reliable¹ by design (as microkernels²). AS/400 and Plan9 are based on more powerful and coherent abstractions ("everything is an object" and "everything is a file" respectively).
--
¹The NT kernel really is quite reliable, it's everything else that gives Windows a bad reputation for reliability.
²NT isn't strictly a microkernel, but it's very microkernel-ish.
> It's actually amazing how many soft devs have no clue about the impact of the Linux, the command line, which they now call the 'CLI', true editors and fast home grown IDEs. A shift needs to take place at the core of software development. Linux made everything possible.
Ok.. reluctantly, I'll bite.
Why does a shift "need" to take place? What do you believe is broken and how is a developer "understanding the impact" of Linux going to fix it?
Also, what is a "true editor" and conversely what makes an editor not "true"? Do you have examples? Can software development only be done in a "true" editor?
It seems to me that the poster is not a native english speaker ("soft devs", "the Linux", etc.), and may have intended to say that it "needed" to take place, and that Linux helped bring it about.
Totally agree. People forget that Linux is the dominant kernel/os because it dominates nearly every market that isn't the desktop (server/mobile/IoT/toasters). The Stack Overflow dev survey shows about 21% of devs on SO use Linux, compared to regular desktop usage, which is a generous 2%. That's a 10-fold increase of Linux on the desktop among developers compared to the general populace.
OS X also experiences about a 2-fold increase, whereas Windows experiences a large decrease in use amongst developers compared to its general desktop usage. I think Windows should just switch to the Linux kernel, build a Windows container to run legacy programs, and port all their software to Linux if they want to win over devs. The "bash on Windows" doesn't appeal to me at all.
Of course, here on hacker news the culture is really more corporate than hacker.
> People forget that Linux is the dominant kernel/os because it dominates nearly every market that isn't the desktop
People also forget that Linux came to prominence once backed by corporations having a (valid, IMHO) agenda to not be controlled by Microsoft. The severe OEM and license agreements negotiated in the late 80's and 90's certainly motivated R&D into alternatives.
And people forget that there are other very popular markets[0] using other toolsets[1].
So, when you say:
> here on hacker news the culture is really more corporate than hacker.
> People forget that Linux is the dominant kernel/os because it dominates nearly every market that isn't the desktop (server/mobile/IoT/toasters).
This is mostly correct when referring to the kernel. Nevertheless, a kernel by itself is just one component of the low level system. And I personally definitely do not consider the Android experience a "GNU/Linux" one - even the base libc is replaced with a quite broken (e.g. wrt POSIX) Bionic. This is leaving aside all the firmware issues, lack of administrator control without "rooting", etc.
So in other words: it is natural to "forget" that Linux is widespread on mobile devices, simply because the experience is extremely different.