Linux maintains bugs: The real reason ifconfig on Linux is deprecated (2018) (farhan.codes)
245 points by pjmlp on March 19, 2020 | 133 comments



It's not quite that simple. People actually tried to update net-tools ifconfig (one of the two ifconfigs back in 2018).

In 2019, I simply wrote an ifconfig that uses the netlink API, because for portability I needed a Linux ifconfig that had a FreeBSD-like command-line syntax. It doesn't have the old interface aliasing mentioned in the headlined article, and has no problems setting up and reporting IPv6 addresses or additional IPv4 addresses.

* https://unix.stackexchange.com./a/504084/5132

* http://jdebp.uk./Softwares/nosh/guide/commands/ifconfig.xml

I have a system to convert Debian /etc/network/interfaces into a BSD-style rc.conf, which in turn gets converted into a suite of native service bundles. One of the service bundles is ifconfig@interface, which invokes ifconfig with a uniform command-line syntax across Linux and FreeBSD; hence the need for a more FreeBSD-alike ifconfig on Linux.

* http://jdebp.uk./Softwares/nosh/guide/rcconf-amalgamation.ht...

* http://jdebp.uk./Softwares/nosh/guide/rcconf.html

* http://jdebp.uk./Softwares/nosh/guide/networking.html


Thanks for nosh.

Another off-topic question (using this forum, as your author page is quite discouraging about sending you emails ....)

What's the reason for naming cyclog files with the time they were closed (other than that's how djb did it), since this information is easily available with 'ls -l'?

I use a logger that opens a $CREATE_TIMESTAMP.u file, and makes "current" a symbolic link to it; and on close renames it to ".s", which I find more intuitive and informative (also provides time between log creation and first message, which is unavailable with cyclog). I was considering changing the ".u"/".s" marking to be an attribute (-w or +i or +t to signify "safely closed") to help repeated rsyncs of log directories, but haven't done that yet.

I'm sure there's some useful aspect of existing cyclog/multilog behaviour that I'm missing, but I haven't been able to figure out what it is for the last 10 years ...


If you want more than just my perspective, look to Laurent Bercot's Supervision Mailing list. M. Bercot is of course there, as well as a few other people.

* http://www.skarnet.org/lists.html

As for the filenames, yes the original reason was for compatibility with multilog, and of course all of the other tools that can process these log directories. I did set out, after all, to produce a cyclog workalike, with lessons from multilog incorporated. Note that said tools might not deal well with log directories where the same log file is available via two names, both "current" and something else.

Moreover, if you go and look at follow-log-directories you can see an additional benefit that it very efficiently skips its cursors over old files without having to look at their i-nodes because it knows that the filename is guaranteed to be a timestamp at or after the last log entry in the file.

Furthermore, if you go and read the cyclog manual, you'll find out why using the w permission bit had problems. (-:

1.41 is building up changes, by the way.


> Moreover, if you go and look at follow-log-directories you can see an additional benefit that it very efficiently skips its cursors over old files without having to look at their i-nodes because it knows that the filename is guaranteed to be a timestamp at or after the last log entry in the file.

Thanks, those are two things I've missed (because I never used it, I guess): this optimization, and follow-log-directories + export-to-rsyslog, which was introduced at a time I wasn't looking :)

> Furthermore, if you go and read the cyclog manual, you'll find out why using the w permission bit had problems. (-:

:) Indeed, thanks.

> 1.41 is building up changes, by the way.

Thanks! And since I've missed follow-log-directories (and export-to-rsyslog) before, maybe there's another tool I've missed that would make shipping log directories more efficient? I'm concerned with very-low-bandwidth, very-intermittently-connected servers, for which rsync is a godsend and e.g. rsyslog and friends proved much less reliable and efficient; and the renaming of "current" makes it miss the fact that it already has an (almost complete) copy of said file under a different name.

Thanks again!


I don't know off the top of my head, but then I don't know all of the tools that exist for multilog/cyclog log directories.

This is, of course, more of an rsync problem. And it seems that there are approaches to having rsync detect renamed files and handle them more efficiently.


> People actually tried to update net-tools ifconfig (one of the two ifconfigs back in 2018).

What happened?


See the StackExchange answer.


It's a comprehensive reply -- good stuff, and worth a read.

The short version:

> "As you can see, the GNU inetutils and NET-3 net-tools ifconfigs have some marked deficiencies, with respect to IPv6, with respect to interfaces that have multiple addresses, and with respect to functionality like -l.

> The IPv6 problem is in part some missing code in the tools themselves. But in the main it is caused by the fact that Linux does not (as other operating systems do) provide IPv6 functionality through the ioctl() interface. It only lets programs see and manipulate IPv4 addresses through the networking ioctl()s.

> Linux instead provides this functionality through a different interface, send() and recv() on a special, and somewhat odd, address family of sockets, AF_NETLINK.

> The GNU and NET-3 ifconfigs could have been adjusted to use this new API. The argument against doing so was that it was not portable to other operating systems, but these programs were in practice already not portable anyway so that was not much of an argument.

> But they weren't adjusted, and remain as aforeshown to this day."
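
For a flavour of what the netlink interface looks like in practice, here is a minimal sketch in C (mine, not from the answer; error handling and reply parsing are mostly omitted) of an RTM_GETADDR address dump:

  /* Minimal sketch: ask the kernel for a dump of all interface addresses
     (IPv4 and IPv6) over an AF_NETLINK route socket. Parsing of the
     RTM_NEWADDR replies is omitted for brevity. */
  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>
  #include <sys/socket.h>
  #include <linux/netlink.h>
  #include <linux/rtnetlink.h>

  int main(void) {
      int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
      if (fd < 0) { perror("socket"); return 1; }

      struct {
          struct nlmsghdr nh;
          struct ifaddrmsg ifa;
      } req;
      memset(&req, 0, sizeof req);
      req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifaddrmsg));
      req.nh.nlmsg_type = RTM_GETADDR;
      req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
      req.ifa.ifa_family = AF_UNSPEC;          /* both IPv4 and IPv6 */

      if (send(fd, &req, req.nh.nlmsg_len, 0) < 0) { perror("send"); return 1; }

      char buf[8192];
      ssize_t len = recv(fd, buf, sizeof buf, 0);   /* first batch of RTM_NEWADDR replies */
      printf("received %zd bytes of address messages\n", len);

      close(fd);
      return 0;
  }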


That split, where ioctl() is used for IPv4 and the send()/recv()/AF_NETLINK setup for IPv6, is interesting. I'm curious why the Linux folks chose that route. I'd get it as a temporary thing, but shouldn't they have migrated everything onto AF_NETLINK?


That's exactly what happened. Nobody bothered rewriting ifconfig to use the new APIs, because `ip` was the new hotness (and much more flexible and, if not saner, at least differently insane).


Off-topic: I can't help but be nosy and check the https version of your page. You seem to be serving a cert with CN=albertstreetantiquescentre.co.uk. Might want to get this looked at :)

Edit: Interesting,

https://jdebp.uk has the correct cert configured, whereas https://jdebp.uk. serves the cert as mentioned above.


So, firstly don't put a dot at the end. Whatever you might feel about it, the decision in practice was that the host part of the URL is an FQDN but doesn't need a dot at the end. So don't write one there. The browser should probably trim it out when you type in or follow such a link.

The Apache web server that the bulk host is using for that site has a fairly poor implementation of HTTPS. It's unsatisfactory in various ways, some very technical, but what you're seeing here is the weird way it handles SNI.

By default, when you say you want jdebp.uk. and there is no such virtual host (there's only jdebp.uk with no trailing dot), Apache will pick a default virtual host and act as though you asked for that, sending over the certificate for that host and so on. In this case, at a bulk hosting site, it's some unrelated Antiques Shop outfit, likely because that begins with 'A' which is the first letter of the alphabet.

This actually caused a problem for Let's Encrypt because their tls-sni-01 challenge type assumed nobody would be crazy enough to answer queries for a random name with a certificate chosen by somebody else entirely. But that's exactly what Apache is doing here. The current tls-alpn-01 challenge was introduced to fix that - it asks for a reserved ALPN service instead of a spurious SNI server name.


You can configure the default Apache vhost to be something other than a "customer" site; that's what we do at my hosting company. It still leads to a certificate error, though. That used to matter, but nowadays we just automatically Let's Encrypt everything that we host.


I wonder how much trouble this causes for censys.io, as I believe they pull certs based off of IP address requests.


Censys gets these defaults (you could probably measure a slight bias in their data towards alphabetically earlier names as a result, but I have never checked) for sites using Apache, and for other servers with a default they get that default (in nginx I believe you have to explicitly configure a default if you want one). If there is no default, the site says you need to specify which name you wanted, or too bad.

You'll see the same for various other providers that crawl the IPv4 Internet looking for services. My last job (it seems prescient now to have no job, very convenient in current situation to have no reason to go outside anyway) had me trying to fit some of this data to other service discovery information.


It seems that the server running on the other end doesn't properly match the certificate from the SNI if the domain sent by the browser is a FQDN (i.e. ends in a .).

For example, try the following:

  openssl s_client -connect jdebp.uk:443 -servername "jdebp.uk."
vs

  openssl s_client -connect jdebp.uk:443 -servername "jdebp.uk"
You'll notice that in the first case the default certificate is sent back, while in the second case the certificate is correctly matched against the SNI.

According to the Server header returned:

  Server: Apache/2.4.41 (cPanel) OpenSSL/1.1.1d mod_bwlimited/1.4 Phusion_Passenger/5.3.7
Testing against an NGINX-based server, I am not seeing the same results; in fact, I am seeing the correct certificate being returned.


The relevant RFC says not to send a trailing dot in SNI. Browsers should probably trim the dot out if present. Maybe they aren't doing so because it causes some unexpected compatibility mishap; maybe in reality it rarely causes any trouble, so nobody has got around to it.


https://tools.ietf.org/html/rfc3546#section-3.1

> "HostName" contains the fully qualified DNS hostname of the server, as understood by the client. The hostname is represented as a byte string using UTF-8 encoding, without a trailing dot.


I was not aware of this!


That's the page on a shared hosting service, not served by me. See the site history page (still in the same place that it was a couple of years ago when it came up on Hacker News).


It's due to Apache not correctly matching the SNI when it is sent a FQDN ending in a period (.).

More information: https://news.ycombinator.com/item?id=22632959


Why are you using the dot for your site but not for stackexchange?


Lazy copy and paste on my part. (-:


This might be the first time in quite a few years that I've seen people using a dot like that. Any reasons to do so? I figured most people do not have weird DNS search domains or some such to warrant that dot.


It would be good practice to do so, when you are referring to a FQDN.

Apart from that, to circumvent badly configured paywall filters?


You both need to read http://jdebp.uk./FGA/web-fully-qualified-domain-name.html , including the Hacker News person's reminiscences. (-:


If you'd read the article you'd see that it talks about how people tried to update ifconfig.


This is a highly biased opinion piece. I like BSD. I liked pre-Oracle Solaris. And I like Linux a lot too.

And it is factually inaccurate. Deprecation of ifconfig is a distro issue. "Linux" did not deprecate it. And the reason is that it was unmaintained, not that it couldn't handle multiple IP addresses. In fact it can; I still use it 50% of the time. It works fine for setting and displaying multiple addresses.

The Debian maintainers wanted to drop it, and the other distros all followed suit a couple years later. Good information and links here:

https://serverfault.com/questions/633087/where-is-the-statem...

"Linux maintains known bugs – and actively refuses to fix them. In fact, if you attempt to fix them, Linus will curse at you, as manifest by this email."

That is a bold statement but is not at all supported by the linked email. You'll see Linus cursing at a bad patch that would break something in userland, not for "maintaining a bug". He's preventing at least two bugs (unnecessary change to behavior and ENOENT is indeed the wrong return value). I think I liked the old Linus better...

So it's a shame. I'd love to read a series about BSD, but this is really just inaccurate propaganda.


If you follow the email thread through to the end, nobody was trying to intentionally break userspace. An internal error code was being leaked to userspace accidentally. That was the bug. The original patch was actually fixing a different issue, where some V4L2 devices were returning the wrong error codes! So userspace was already seeing an inconsistent API.

I don't like this email thread, because some people are clearly trying to carefully find the breakage and fix it, but Linus yelling at people is all that people ever see from it. So the conversation changes from fixing a bug to insults and anger.


> And the reason is that it was unmaintained, not that it couldn't handle multiple IP addresses.

Unmaintained sounds like a very good reason to me. It's not just whether there are 1 or 2 addresses per interface. The rot set in when Alexey re-did the Linux kernel's network infrastructure back god knows when (it's decades ago now), and wrote his own user-space tool to configure it. That tool was ip. It not only handled multiple IP addresses per interface - it handled multiple routing tables, multiple link types, routing rules and a whole pile of other things. And that single tool has been growing ever since - it's now huge. Try typing "man ip-<tab><tab>" on a CLI that supports tab completion some time.

Effectively the task of keeping up with the kernel's new features kept growing, and at some point the ifconfig maintainers gave up. It was all over bar the shouting at that point. The only surprising thing is the shouting has gone on for literally decades and, as this article demonstrates, continues to this day. Let's hope the same angst over systemd doesn't continue for the same amount of time.

Just like with systemd, there are things that rub me the wrong way of course. Copying the Cisco IOS style of command-line usage rather than using the Linux style is for me one of those things. But it happened for a reason - Alexey set the standard when he wrote the first version of ip, and he was trying to emulate the features provided by a Cisco router at the time.


That's a great theory, but Linus picks and chooses when to decide it's outrageous to make changes that break userland tools based on his personal preferences. They intentionally broke ZFS - then Linus follows up with a wildly inaccurate statement about why he thinks nobody should use ZFS.

It's difficult to take him seriously when he picks and chooses when to enforce his supposed "never break userland" ethos.


Did you read his mail though? He literally answers this criticism. ZFS is not userland so kernel developers are free to break it, just like they can break ext4 and do, and then they fix them. The problem with ZFS is that they can't fix it because it's not part of linux. It's a 3rd party kernel module. So your entire comment reads like a misunderstanding of the situation.


The breakage of ZoL wasn't a side-effect of some big technical change. They literally just made the SIMD enable/disable barriers no longer usable from non-GPL modules (they changed an EXPORT_SYMBOL to EXPORT_SYMBOL_GPL). This means you can't use SIMD-accelerated hashing functions from ZoL -- and there is no way to fix it without hurting performance other than reverting the "license change". And the entire justification for this change was "there are no in-kernel users".

Now, they are obviously free to do that and ZoL needs to work around it. But it's silly to argue that this was not an intentional breakage of a system which many people use and cannot be worked around out-of-kernel. Yes, it's not userspace code but it did cause regressions for users. Linus and GregKH have always disliked ZFS because they feel it was designed to be incompatible with Linux (and they're entitled to that view, however wrong it may be) so it's unsurprising that they are against doing anything that may help ZoL.


> ...disliked ZFS because they feel it was designed to be incompatible with Linux...

Ya... That may have just a bit to do with the fact that ZFS is released under a licence that was explicitly designed to be incompatible with Linux.

"Mozilla was selected partially because it is GPL incompatible. That was part of the design when they released OpenSolaris. ... the engineers who wrote Solaris ... had some biases about how it should be released, and you have to respect that."

I can see how that might suggest to Linux maintainers that they are not welcome to use this code. Maybe sorta.


Please stop spreading this. Simon Phipps, Sun's Chief Open Source Officer at the time, and the boss (?) of the person that you quote, has explicitly said this was not the case:

* https://marc.info/?l=opensolaris-discuss&m=115740406507420

Sun needed a file-based license that had patent provisions. There were none available at the time, so they created their own. That CDDL-licensed technologies (DTrace, ZFS) have been incorporated into other open-source projects (BSD) as well as closed-source ones (macOS) shows that it is quite accommodating.


The incompatibility comes from terms on the GPL side. As for it being intentional, there are only anecdotes, and conflicting ones. When the first CDDL sources dropped (DTrace), management expected to see them incorporated into Linux within a month.


Thing is, the copyright is owned by Oracle of all companies, and the legality is more than questionable enough to allow Oracle to sue for damages the second the code shows up in use by a big enough fish.

Oracle won't relicense the code (and lose out on a chance at another software copyright lawsuit), and it won't ever be safe to touch without an ironclad licensing story.

You can hope that Oracle folds and gets bought by someone decent. But hope won't take you that far.

Throw it away. Start over from scratch. The code may be great, but it's gone. It's a shame it ended up where it did. Take a moment if you need, but then move on. ZFS isn't going to happen.


Except Oracle isn't the copyright holder for the majority of the code; in fact, it cannot relicense huge chunks of OpenZFS code (which indeed is used in big commercial products), and it cannot "take back" code licensed under the CDDL - it can relicense its own copy, not the one in OpenZFS, because there's no "version 1 or newer" clause that allows a backdoor change of license like in the typical GPL case.

To summarize:

- Oracle isn't the only copyright owner
- Oracle can't take back the CDDL license
- Oracle can't introduce a new version of the CDDL and magically change the rules


> because there's no "version 1 or newer" clause that allows backdoor change to license like typical GPL case

There is actually, and in CDDL (unlike in GPL) version updates are an opt-out feature rather than being opt-in. This was inherited from the MPL (which the CDDL is based on). See section 4 of the CDDL:

> 4.1. Oracle is the initial license steward and may publish revised and/or new versions of this License from time to time. Each version will be given a distinguishing version number. Except as provided in Section 4.3, no one other than the license steward has the right to modify this License.

> 4.2. You may always continue to use, distribute or otherwise make the Covered Software available under the terms of the version of the License under which You originally received the Covered Software. If the Initial Developer includes a notice in the Original Software prohibiting it from being distributed or otherwise made available under any subsequent version of the License, You must distribute and make the Covered Software available under the terms of the version of the License under which You originally received the Covered Software. Otherwise, You may also choose to use, distribute or otherwise make the Covered Software available under the terms of any subsequent version of the License published by the license steward.

In fact this is the primary argument that people give when arguing that Oracle could very easily make OpenZFS GPL-compatible -- they just need to release a CDDL v2 which says "code under this license is dual-licensed under the GPLv2 and CDDLv1.1". This situation has already happened -- CDDLv1.1 used this mechanism to change the "license steward" from "Sun Microsystems" to "Oracle".


Updates in GPL, thanks to standard boilerplate provided by GPL and used for years, are opt-out.

With CDDL, Oracle declares they have the sole right to provide newer versions of the license. However, at no time can they treat it as an "upgrade path" for third-party code, and OpenZFS code is explicitly labeled with CDDL 1.0.

All of that has no impact on OpenZFS code, which remains unencumbered, including by patents (the patent license is, afaik, the part that makes it incompatible with the GPL the most).


ZoL already worked around it months ago


ZFS is not userland. It's kernel code, but without being in the kernel. So it has to be built separately and may not even be compliant with the GPL (still has to be tested in court AFAIK). Linux's promise to not break userspace is just that; it cannot be held responsible for code that should be in the kernel but does not play fair with the license.


It's not that it doesn't play fair. Sun published under a license that made inclusion in the kernel impossible, so the current developers are bound by the same decisions regardless of their opinion on the matter.

Furthermore it is profoundly curious to claim that a cross platform filesystem that began life on Solaris and existed for years as a stable complete work before being ported to several OS including Linux becomes by virtue of a bridge a derivative work of Linux. It's magical thinking.

It is entirely acceptable that they aren't responsible for out-of-tree code working, but it would be great if they didn't deliberately sabotage other projects, which is what they did here.

>My tolerance for ZFS is pretty non-existant. Sun explicitly did not want their code to work on Linux, so why would we do extra work to get their code to work properly?

The extra work is 10 seconds of work. How childish and unprofessional.


> The extra work is 10 seconds of work. How childish and unprofessional.

While Oracle keeping the code under an incompatible license for more than 10 years is totally OK.


What does it have to do with the zfs developers who no longer work for oracle?


It has to do with people barking up the wrong tree.


> Furthermore it is profoundly curious to claim that a cross platform filesystem that began life on Solaris and existed for years as a stable complete work before being ported to several OS including Linux becomes by virtue of a bridge a derivative work of Linux. It's magical thinking.

Please don't confuse the source code and the resulting binaries. To the extent that the ZFS source code does not derive from Linux, it is not a derivative work.

But the binaries built for Linux will necessarily be derivative of both.

And we regularly redistribute binaries; this indeed is what Linux distributions ordinarily do.

If you only write the bridging code and/or distribute the module source, you may be quite protected. But if you distribute a patch from the mainline to include ZFS, or a mixed tarball of Linux+ZFS, or even compiled Linux kernels complete with a loadable ZFS module, you may be in breach. All of these clearly derive from multiple sources with incompatible licences.


> Furthermore it is profoundly curious to claim that a cross platform filesystem that began life on Solaris and existed for years as a stable complete work before being ported to several OS including Linux becomes by virtue of a bridge a derivative work of Linux. It's magical thinking.

It's not magical thinking. A work can be a derivative of multiple other works. It's completely plausible that adding Linux stuff to ZFS can make it a derivative work of Linux in addition to being a derivative work of whatever else it is already a derivative work of. There's room for debate about how much Linux stuff (and of what nature) something like ZFS can incorporate before it becomes a derivative work according to copyright law and the GPLv2, but you cannot dismiss the idea outright.


Magical indeed. Except that public opinion counts for nothing in copyright law; you have to fight it out in court. Luckily, Sun isn't the kind of company that would file questionable lawsuits against Linux users just to... wait.. wasn't Sun bought by another company?

Well, whoever owns the copyright now, you're probably reasonably safe unless they're a hyper-litigious corporation with deep pockets and an army of lawyers, who see lawsuits about software licensing as a potential source of income....

I mean, what are the chances?


> Furthermore it is profoundly curious to claim that a cross platform filesystem that began life on Solaris and existed for years as a stable complete work before being ported to several OS including Linux becomes by virtue of a bridge a derivative work of Linux. It's magical thinking.

Let's say I make a movie M. It's a nice movie, but it doesn't have music. Someone else wrote some music S that is quite lovely. If anybody takes my movie M and mixes in music S, then the new movie is a derivative work of both M and S.


For a scary few days he was considering completely breaking the ABI of getrandom(2). If he decides to break interfaces, I don't know that anyone can stop him. It is a downside to the BDFL model.

Edit: Breaking ZFS is not a great example — as sibling comment points out, that is a kernel module using kernel-internal APIs, not userspace syscall ABI. The Linux kernel notoriously considers all kernel interfaces unstable. GregKH has in the past characterized this as a feature, not a bug.


> The Linux kernel notoriously considers all kernel interfaces unstable.

You think you want a stable kernel interface, but you really do not, and you don't even know it. What you want is a stable running driver, and you get that only if your driver is in the main kernel tree.

— Greg Kroah-Hartman <greg@kroah.com>

https://www.kernel.org/doc/html/latest/process/stable-api-no...


I think that it is an upside of the BDFL model, unless it's someone incompetent. If you dig deep into the threads you'll find that so far his judgement on what to break or not has always been stellar. As long as you have someone as competent as Linus making the call, the BDFL model is perfect.

RE getrandom(2): As Linus said several times, the "we don't break userspace" rule does not apply to security bugs. If there's no other way to fix a security bug other than breaking userspace, then userspace has to be broken, for an obvious reason. If you don't understand that reason, you shouldn't really have a say in any such critical decision.


> unless it's someone incompetent.

Unfortunately, Linus is dangerously incompetent at cryptography. As was the case with getrandom(2).

> RE getrandom(2): As Linus said several times, we don't break userspace rule does not apply to security bugs.

I don't think you understand or correctly recall the scenario in which Linus nearly broke getrandom(2). The bug Linus was attempting to fix by breaking getrandom(2) was not a security bug. It was an availability bug, in userspace, caused by broken userspace code deadlocking itself.


> Unfortunately, Linus is dangerously incompetent at cryptography. As was the case with getrandom(2).

I beg to differ. Of all the people I know of, Linus is one of the most competent when it comes to cryptography, more so than any of those Twitter celebrities. I suggest you dig into the LKML archives. Exactly what in the case of getrandom(2) do you think makes him incompetent?

> I don't think you understand or correctly recall the scenario in which Linus nearly broke getrandom(2). The bug Linus was attempting to fix by breaking getrandom(2) was not a security bug. It was an availability bug, in userspace, caused by broken userspace code deadlocking itself.

I recall correctly. Optimizations in ext4 unveiled the broken design of the previous getrandom(2). To keep backward compatibility, an easy choice could have been to supply early userspace with possibly predictable random numbers, which, of course, would be a security issue. Another would have been to keep the bug hidden and revert any optimization that lowers the early entropy pool. In the end, the solution at which the community arrived is the best compromise given the constraints. The kernel is one of the few places where you can see real software engineering take place.

I think your comment is driven more by personal hatred for a great mind rather than technical merit.


> Of all people I know of, Linus is one of the most competent person when it comes to cryptography

This statement strikes me as bizarre because it's just so off-base. It seems you relate to Linus as some kind of personal hero, which is fine, but then use that as a reality-distortion field, which is probably unhelpful for you. (It's just confusing to others.)

To the best of my knowledge, Linus has not done any significant work in cryptography, nor is he regarded by any practitioner as an expert in the field. I would love to learn about any work he has done; I am just not aware of any, and I have some experience in the field.

> Exactly what in the case of getrandom(2) do you think makes him incompetent?

getrandom(2) is the only good random number ABI in Linux. It is the only way to request real random numbers that blocks until initial seeding and never blocks afterwards (i.e., BSD /dev/random behavior). It has a flag that can be used to avoid blocking.

Linus seriously proposed changing the ABI for getrandom() with flags specifying "please block until real random numbers are available" to silently ignore those flags if the routine would block, and instead return garbage. This would break all existing software that used getrandom() correctly and would be especially egregious for anything using getrandom(2) for cryptographic keys — which is a good and correct use of the interface.

His proposal was fundamentally similar to, for example, changing the ABI of /dev/random to silently return /dev/zero output instead of blocking.
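
To make that concrete, here is a small sketch (mine, not from the thread) of the usage pattern that would have silently become insecure: a blocking getrandom() call used to obtain key material.

  /* Sketch: obtain 32 bytes of key material with getrandom(2). With
     flags == 0 the call blocks until the kernel CSPRNG has been seeded
     once, and never blocks after that -- the behaviour existing callers
     rely on, and the one that would have been silently weakened. */
  #include <stdio.h>
  #include <sys/random.h>   /* glibc 2.25+ wrapper for the syscall */

  int main(void) {
      unsigned char key[32];
      ssize_t got = getrandom(key, sizeof key, 0);
      if (got != (ssize_t)sizeof key) {
          perror("getrandom");
          return 1;
      }
      /* key[] now holds cryptographically secure random bytes */
      printf("got %zd random bytes\n", got);
      return 0;
  }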

> I recall correctly. Optimizations in ext4 unveiled broken design of previous getrandom(2).

Nope, you've still got it wrong. https://lwn.net/Articles/800509/

1. The design of the getrandom(2) API was and continues to be fine. It is exactly what was missing on Linux, and has room for future expansion via additional flag bits. It is not broken.

2. The getrandom(2) ABI had been in the kernel for five years at the time Linus proposed breaking it.

3. The bug was in GNOME Display Manager, which ran early in boot on the particular system that led to the report, and GDM managed to deadlock itself by incorrectly invoking getrandom(2) with a blocking request for entropy that it did not need.

4. (Broken user code can also deadlock itself by calling, say, sleep(99999).)

5. Yes, an ext4 optimization led to slightly reduced initial entropy on that particular machine and the sighting of the underlying deadlock condition in GDM. But that latent condition was always present.

> To keep backward compatibility an easy choice could have been to supply early userspace with possibly predictable random numbers, which, of course would be a security issue.

This (1) does not keep backwards compatibility, and (2) is exactly the approach Linus initially proposed and had to be talked down from. This is the proposal that was dangerously incompetent. From the thread at the time:

Andy Lutomirski:

  There are programs that call getrandom(0) *today* that expect secure
  output…  We can't break this use case.  Changing the semantics of
  getrandom(0) out from under them seems like the worst kind of ABI
  break -- existing applications will *appear* to continue working but
  will, in fact, become insecure.
Matthew Garrett (various emails):

  We've been recommending that people use the default getrandom() behaviour for key generation since
  it was merged. Github shows users, and it's likely there's cases in internal code as well. …

  The semantics many people want for secure key generation is urandom, but 
  with a guarantee that it's seeded. getrandom()'s default behaviour at 
  present provides that, and as a result it's used for a bunch of key 
  generation. Changing the default (even with kernel warnings) seems like 
  it risks people generating keys from an unseeded prng, and that seems 
  like a bad thing? …

  In one case we have "Systems don't boot, but you can downgrade your 
  kernel" and in the other case we have "Your cryptographic keys are weak 
  and you have no way of knowing unless you read dmesg", and I think 
  causing boot problems is the better outcome here.
The getrandom(2) saga was a long thread of Linus repeatedly ignoring the knowledge and concerns of his own security deputies.

> In the end, the solution at which the community arrived is best compromise given the constraints.

Yes, catastrophe was avoided and in the end a happy resolution was reached — in spite of Linus, not because of him.

> I think your comment is driven more by personal hatred for a great mind rather than technical merit.

Nope.


Alas, most people are incompetent when it comes to cryptography. It's another of those situations where, if you don't know why something has to be done a certain way, you shouldn't be included in that decision-making process.


I agree this is opinionated. By this same token you could say that FreeBSD maintains bugs in the name of backwards compatibility and performance.

https://vez.mrsk.me/freebsd-defaults.txt


There's an assumption which this article makes which is not true; namely, that the only utilities which do (or should) use ioctls are system programs like ifconfig which are shipped with the kernel. There are other programs, like for example Kerberos, which want to know what network addresses are used by the host, and requiring them to screen-scrape ifconfig is not especially reasonable. So if you randomly change ioctls, then you randomly break other programs that might not be shipped with the kernel but need to use those ioctls. Maintaining proper backwards compatibility for userspace applications, and not labelling some set of interfaces as "can randomly change, and if you use them and you aren't part of the privileged set of programs shipped with the kernel, well, it just sucks to be you", is not, in my opinion, a professional way to run an OS.
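
For illustration, a rough sketch (mine, not from any of those programs) of the traditional SIOCGIFCONF enumeration that such external programs rely on; note that it only ever yields IPv4 addresses:

  /* Rough sketch: enumerate interface addresses the traditional way, via
     the SIOCGIFCONF ioctl. This interface predates netlink and only ever
     reports IPv4 (AF_INET) addresses, one ifreq entry per address. */
  #include <stdio.h>
  #include <unistd.h>
  #include <sys/ioctl.h>
  #include <sys/socket.h>
  #include <net/if.h>
  #include <netinet/in.h>
  #include <arpa/inet.h>

  int main(void) {
      int fd = socket(AF_INET, SOCK_DGRAM, 0);
      if (fd < 0) { perror("socket"); return 1; }

      struct ifreq reqs[32];
      struct ifconf ifc;
      ifc.ifc_len = sizeof reqs;
      ifc.ifc_req = reqs;
      if (ioctl(fd, SIOCGIFCONF, &ifc) < 0) { perror("SIOCGIFCONF"); return 1; }

      int n = ifc.ifc_len / sizeof(struct ifreq);
      for (int i = 0; i < n; i++) {
          struct sockaddr_in *sin = (struct sockaddr_in *)&reqs[i].ifr_addr;
          printf("%-10s %s\n", reqs[i].ifr_name, inet_ntoa(sin->sin_addr));
      }

      close(fd);
      return 0;
  }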


Yes! He also mentions that on NetBSD and OpenBSD the "kernel interface itself is highly stable anyways", but doesn't elaborate on how that's different from Linux's interface stability.

The argument doesn't hang together. He can't claim that changes to userland tools aren't possible whilst at the same time claiming it's because the existing interfaces never change.


It seems like for many userspace applications, the appropriate backwards compatibility ABI would have been at the SO level. ELF shlibs make it easy to version interfaces without breaking old dynamically linked programs. Linux is sort of the odd duck in that it considers the entire syscall surface a stable ABI. It leads to a lot of neat results, but can obviously also cause pain when some older interfaces are not sufficient.
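
As an illustration, a minimal sketch of shlib-level interface versioning with the GNU toolchain; the symbol and version-node names are hypothetical, and a matching linker version script passed via --version-script is assumed:

  /* Sketch of ELF symbol versioning (hypothetical names). Both
     implementations of foo() live in one shared library; binaries linked
     against the old version keep resolving to it, while new links get
     the new default. Assumes a version script defining the LIBDEMO_1.0
     and LIBDEMO_2.0 nodes, and building with -shared -fPIC. */
  int foo_v1(int x) { return x; }        /* original behaviour */
  int foo_v2(int x) { return x * 2; }    /* new behaviour */

  __asm__(".symver foo_v1,foo@LIBDEMO_1.0");   /* old version stays available */
  __asm__(".symver foo_v2,foo@@LIBDEMO_2.0");  /* @@ marks the new default */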

That said — new ioctls could be added and new ifconfig could use those? Old ifconfig and user programs could continue using the old ABIs.


I would argue that the stable syscall interface is one of the main drivers behind Linux's success. Containers as we know them would be quite different if they needed to somehow merge the kernel-version-appropriate .so files into the fs. They would forever remain just sandboxes of essentially the same distro as the host, just like they are on Solaris/FreeBSD.

Also something like WSL would be impossible.


> I would argue that the stable syscall interface is one of the main drivers behind Linux's success. Containers as we know them would be quite different if they needed to somehow merge the kernel-version-appropriate .so files into the fs. They would forever remain just sandboxes of essentially the same distro as the host, just like they are on Solaris/FreeBSD.

I suspect the order of cause and effect is reversed. Linux is amazingly successful and as a result containers were shaped to fit the capabilities and needs of Linux. The reason Linux has a strong syscall ABI is that it doesn't have a userspace. Glibc is only loosely related to the kernel and cannot be the kernel's ABI compatibility layer, as they are wholly independent projects.

Another way of looking at this is that the overhead and additional abstraction of containers is a workaround for the deficiencies of Linux jail/zone facilities (i.e., it doesn't have them; you have to carefully piece together a secure sandbox out of various cgroup and namespace components).

> Also something like WSL would be impossible

Some interesting things about that:

1. WSLv1 was a partial Linux syscall emulator, sure.

2. Microsoft gave up on that approach and WSLv2 is just an Ubuntu VM running in HyperV. So maybe it was impossible anyway — Linux just provides a ton of system calls and it will always be difficult to faithfully implement all of them. And as some side threads have noted, Linux likes to break ABI of sysfs files all the time. But those don't count, for some reason.

3. FreeBSD has a linux syscall emulation layer (that long predates WSL) using essentially the same premise as WSLv1. You're correct that this style of implementation (syscall ABI) only works due to Linux's syscall ABI choices.

4. However, something like WSL is not impossible against a shlib ABI. The canonical example here is WINE. WINE implements (much of) the Windows NT shared library (DLL) level stable ABI. The same could be done for other systems that provide ABI stability at the DLL level, such as MacOS or FreeBSD.


(Is that you Ted?)

I kind of said the same thing just now:

https://news.ycombinator.com/item?id=22629837


But, for finding interfaces and reading about their state and settings, aren't /proc and /sys the best way to go, if there is no worry about TOCTOU? I was told yesterday on another topic that /proc is a linuxism, but is there anything missing there?


/proc is originally a Bell Labs and Plan9ism.


A long, overlapping deprecation period is the typical way of handling this. Why is this avoided at the lower levels?


There's no reason why someone can't get their act together and fix ifconfig, or if that is insurmountable, rewrite it while maintaining the same outward appearances.

I think this is not specific to Linux at all, and is not really about kernel ABI. This is a problem I see a lot, often in proprietary software too. Firstly, people opt to rewrite before they fix or even before they fully understand the old thing. Secondly, people confuse interface with implementation, and often don't bother trying to make their rewrites interoperable with the old thing.

Another example of that last one, in Linux from the late 90s, is ALSA. FreeBSD for example chose to fix the problems and maintain the old OSS API for audio devices. Linux introduced a very complicated new API.

[By the way, I spotted one sort of technical error in the article. I don't think most software keeps private copies of kernel data structures in their source tree. Rather, when they #include the kernel headers, they end up using an effectively-private version of those structures and constants in their object code. This is slightly different but a similar result.]


> There's no reason why someone can't get their act together and fix ifconfig, or if that is insurmountable, rewrite it while maintaining the same outward appearances.

But it's telling that nobody stepped up and actually bothered to do that despite all the bitching and moaning.

Successful code is generally produced to solve problems, not to fulfil some abstract ideals.


Except someone did rewrite it:

https://news.ycombinator.com/item?id=22628938

But with the Linux ecosystem's inertia being what it is, rather than the 2 existing implementations of ifconfig in use being displaced by this better one... now there are just 3 implementations of ifconfig in use.

As mentioned in the Stack Exchange thread linked in that post, a different someone submitted a relevant patch to one of those existing implementations – way back in 2013. But it was never accepted. More inertia.


To be fair, the third 'ifconfig' implementation is not compatible with the original in command-line options or in output, so adopting it would require changes just as adopting 'ip' does.


Two priors ("original" not really being the right word), each already not compatible with the other.

* http://jdebp.uk./Softwares/nosh/guide/commands/ifconfig.xml#...


2009, in fact. I put it in the further reading.


Someone in this very thread did in fact do exactly that, since they needed compatibility with FreeBSD.


Software using obscure kernel interfaces does keep private copies of kernel data structures.

The reason is that people want the software to have full functionality no matter what system is used for building it. You can build on an old system, with a kernel that totally lacks the functionality, and produce a binary that can use the new kernel interfaces when run on a system that has them.


Glibc certainly does this, I would expect for limited periods per interface. It is often at the cutting edge of kernel interfaces, so that becomes necessary during interim periods. I really doubt it makes sense for old and stable things like ifconfig, or a typical application.


> Glibc certainly does this

> It is often at the cutting edge of kernel interfaces

Is it? It seems like support for Linux-specific features can take years. The getrandom system call, for example:

https://lwn.net/Articles/711013/


> or even before they fully understand the old thing.

I feel that /sbin/ip is in almost every way a better tool than /sbin/ifconfig; is that not the consensus?

> Linux introduced a very complicated new API.

Which highlights the fact that the moment you want to have in-depth interactions with a driver for a piece of hardware ioctls are the wrong choice.

OSS was great if you had exceptionally simple needs; the moment you wanted to do "professional" audio it was a nightmare. ALSA is _somewhat_ better for this, and I honestly wish they had gone with a more netlink(3)-style API instead of an ioctl-driven one.

What's the point of pantomiming the old thing when the old thing wasn't designed properly in the first place?


Ifconfig wins on the following

1) name. It’s to configure the interface

2) default format. Far better spaced than “ip”’s various outputs. I find “ip -o a” just about manageable, but ifconfig shows data far more cleanly imo


My opinion is that `ip` is a more powerful tool, but it is not as useful for quickly obtaining information or performing a basic action like up/down.


My opinion is that, while a little more verbose, it is just as useful for obtaining information or performing basic actions, especially since it allows you to use the same CLI and verbs for most network-related tasks rather than multiple tools with varying commands.


And my opinion is that ip is easier to use in almost every way than ifconfig. The man page used to be bad but the tool has always been easy to use as long as I remember.


I really do not like the default UI in ip. You have to remember a lot of weird stuff to get useful information. The default listing is compressed, hard to read, and noisy. Then I saw someone who had these aliases:

    alias ipa 'ip -br -color a'
    alias ipl 'ip -br -color link'
Night and day! I try to not use the aliases because I'm often on servers without them, but the options for brief and color, with the command to list adapters or links at the end, is one all sysadmins should memorize. It returns output that isn't terrible.


Nice hint, thanks :)

According to the manpage, -color can be shortened to -c and link shortened to `l`, as `addr` can be shortened to `a`. -br stands for -brief, and knowing that makes it easier to remember.

  ip -br -c a
  ip -br -c l
Now I need the aliases less.


Thanks for sharing. That is very useful.

In case somebody is wondering the output looks like:

    $ ip -br -color a
    eth0             x.x.x.x/16 xxxx::xxxx:xxxx:xxxx:xxxx/64
    lo               127.0.0.1/8 ::1/128
    wifi0            x.x.x.x/24 xxxx::xxxx:xxxx:xxxx:xxxx/64
    wifi1            x.x.x.x/16 xxxx::xxxx:xxxx:xxxx:xxxx/64
    wifi2            x.x.x.x/16 xxxx::xxxx:xxxx:xxxx:xxxx/64

    $ ip -br -color link
    eth0             xx:xx:xx:xx:xx:xx <>
    lo               00:00:00:00:00:00 <LOOPBACK,UP>
    wifi0            xx:xx:xx:xx:xx:xx <BROADCAST,MULTICAST,UP>
    wifi1            xx:xx:xx:xx:xx:xx <>
    wifi2            xx:xx:xx:xx:xx:xx <>


An alternative:

    alias ip='ip -brief -color'
which propagates the options to everything. Then, if you need the full output, just call ip without the alias:

    $ \ip a
    $ \ip l


> The only to prevent this would be to conduct regular massively coordinated updates to system utilities when the kernel changes, and properly version applications for specific kernel releases.

Well, no. What you do is create a new struct with correct behavior and let new applications use that. The old struct becomes deprecated but continues to exist until things stop using it. Which may be indefinitely, but that's not really a lot more work than maintaining a compatibility library to serve the same purpose.

The thing with ip vs. ifconfig was more to do with there being somebody interested in creating ip and nobody interested in maintaining ifconfig. If you want to go add support for multiple addresses on the same interface to ifconfig, nobody is stopping you -- in fact that has already happened now.


I've been actively working on firmware that is now over ten years old. My advice tends to be: if you have an API that takes a pointer to a structure, you should have a version field from the get-go.
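
A minimal sketch of what that looks like (hypothetical names):

  /* Sketch of a versioned-struct API (hypothetical names): callers state
     which layout they were compiled against, so the implementation can
     keep honouring old layouts indefinitely. */
  #include <stdint.h>

  #define WIDGET_CONFIG_V1 1
  #define WIDGET_CONFIG_V2 2

  struct widget_config {
      uint32_t version;      /* layout the caller was compiled against */
      uint32_t flags;
      uint32_t timeout_ms;
      uint32_t retry_count;  /* only meaningful from version 2 onward */
  };

  int widget_configure(const struct widget_config *cfg) {
      uint32_t retries = 3;                  /* default for v1 callers */
      if (cfg->version >= WIDGET_CONFIG_V2)
          retries = cfg->retry_count;        /* v2 callers can override */
      /* ... apply cfg->flags, cfg->timeout_ms and retries ... */
      (void)retries;
      return 0;
  }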


Indeed, ip is vastly superior. But it took me almost a year of working with internet gateways (i.e. as a day job) to let old habits die. Learning the new output format takes time.

I've almost replaced route with "ip route" as well. arp isn't fully replaced yet in muscle memory.

Also, as a comment on the article, ip can now fully replace brctl. (and iw replaces iwconfig and iwlist).


IP may be better from an ABI standpoint, but I still find its output to be much harder to parse with the Mk 1 Eyeball than ifconfig. Most of that is probably just ip shoving more information on the screen than ifconfig, but most of the information isn't useful 99% of the time and maybe it would be nicer if ip pared down some of the "qdisc noqueue state UNKNOWN group default" noise unless someone adds a "verbose" parameter?

Or maybe only show them in the basic use case when they are set to something other than the default?

All of the "valid_lft" lines add a lot of visual noise.


If you try, you’ll hit the “can’t change it now, too many people rely on the output in scripts!” noise.

Really, just some whitespace in between the interfaces like in ifconfig would go a long way.


While I agree that different defaults might be nice, modern versions of ip support a -br[ief] option which is pretty nice:

  $ ip -br a
  lo               UNKNOWN        127.0.0.1/8 ::1/128
  eth0             UP             10.123.45.67/24 fe80::7ed3:aff:fe51:4612/64
  $ ip -br l
  lo               UNKNOWN        00:00:00:00:00:00 <LOOPBACK,UP,LOWER_UP>
  eth0             UP             7c:d3:0a:51:46:12 <BROADCAST,MULTICAST,UP,LOWER_UP>


'ip -br a' gets real messy if you have IPv6, unfortunately. IPv6 hosts like to make lots and lots of addresses, which all get smooshed together on a single line of hard-on-the-eyeballs hex noise.


Well yes but come on daddyo (I'm a granddad btw), who still routinely uses an 80 column terminal these days 8) Besides, you can always feed the output to awk for an easier on the eyes report. I have three IPv6 including the link local address on my wlp2s0 and they all fit on one line with the IPv4.

I've just run up the manual and the kool kids could be using 'ip --json --pretty address' or 'ip -j a | jq'


Adding 'scope global' to 'ip a' cuts down on some of the usually uninteresting addresses. More typing though.


This - you think the new tool will be better and then are blinded :) If they would just move the noise down one line so you could see the interface name without BROADCAST, MULTICAST, mtu 1500 qdisc up blah blah.


The fact that the Linux ABI is stable is the whole reason why containers are possible.

If it were necessary to use the matching version of userland tools to your kernel version, then a container engine would only be able to run container images that match the kernel version -- you wouldn't be able to run containers based on two different versions of Debian, for example.


I don't buy it. Yes, Linux never breaks userland. Yes, ifconfig is deprecated. But I still don't see how these are related.


I still don't understand how this "Linux never breaks userland" thing holds up; it's been proven false.

The 4.13 kernel introduced a change to effectively namespace sysctl settings (preventing changes from the defaults on the host system from being copied into container processes). This broke any container runtimes that relied on host-wide sysctl settings.

[1] https://github.com/aws/amazon-ecs-agent/issues/789

[2] https://success.docker.com/article/ipvs-connection-timeout-i...


I think the real guarantee is more like "linux doesn't break the syscall API" (the API, not the ABI). I think sysctl is not really part of it.


How do you think sysctls are implemented?

In fact, prior to Linux 5.5, Linux had a direct sysctl syscall and removed it — which, uh, is clearly an ABI break.

With Linux's sysctl pseudo-fs model, you can argue the actual structure and behavior is just some aspect of sysfs, and the open/read/write syscalls are obviously not broken, but I think that's pretty simplistic. sysctls (and sysfs) are provided by the kernel.

Linux doesn't break ABIs that Linus judges are worth keeping stable rather than rototilling. That's all. Usually Linus swings conservative on this.


I use systemd-nspawn. It works on a 3.10.87 kernel on mips64.

That's pretty impressive.


Here's an example from a decade ago (ifconfig was deprecated then too):

- bonded interfaces

- with vlans

- with multiple IPs

If this sounds unrealistic, it was the standard config for index arbitrage at an investment bank.

The interfaces, added with distro scripts, will use `ip` and all the interfaces will show up in `ip addr` and work perfectly. ifconfig won't show about half of them - they won't be up, or down, they just won't appear.


I set up systems in 2009+ that had multiple IPs per phy, bonded, vlan, etc. on linux. I used the config files within the distro (mostly CentOS/RH then) to do this. Worked fine. I developed somewhat better inspection scripting to augment this. My users never had problems with this.

I think the distro tooling used ip under the covers. I didn't care one way or the other. Nor did the users.


Yes, that's correct, all distros use ip and have for about 15 years.


I don’t doubt that you’ve had problems but e.g. ‘ifconfig bond0.143’ references an ethernet interface that represents vlan 143 on the bonded interface bond0. This sort of setup has worked for at least a decade on Linux for me, probably longer.


That's interesting. ifconfig bond0.143:0 would be the vlan? Either way I can definitely tell you the interfaces were missing from `ifconfig` but visible in `ip`.


Linux's stable kernel ABI makes tools like rr and strace much easier to maintain. It sounds like maintaining rr for FreeBSD would be harder since you'd have to keep updating its syscall model for changes to existing syscalls as well as new syscalls. And that's if we only supported the latest FreeBSD in each rr version; supporting multiple versions of FreeBSD would require a lot more code.


> On FreeBSD this is not a problem. They update the kernel and userland tools in tandem.

Yes it is a problem, for programs doing, say, nice FFI to get at that ioctl and structure cleanly instead of forking and piping some utilities and text-processing their output.

The concept of an ABI, and one that is stable, is very real and important.


Interesting backstory, but the article doesn't actually provide a reason. It just states ifconfig has a bug, and "I do not fully understand they did not just update ifconfig(8)".


Huh?!?

I use multiple addresses a lot, and recall having used them through ifconfig for years, probably well over a decade. Anyway, at least on Debian, the package net-tools brings ifconfig back, again with full multiple-address support.

  # uname -v
  #1 SMP Debian 5.4.19-1 (2020-02-13)


  # ifconfig
  enp2s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.10.42  netmask 255.255.255.0  broadcast 192.168.10.255
        inet6 fe80::2e4d:54ff:fed8:7b59  prefixlen 64  scopeid 0x20<link>
        ether 2c:4d:54:d8:7b:59  txqueuelen 1000  (Ethernet)
        RX packets 1162942  bytes 1627138668 (1.5 GiB)
        RX errors 0  dropped 42049  overruns 0  frame 0
        TX packets 546159  bytes 41181499 (39.2 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

  lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 16  bytes 818 (818.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 16  bytes 818 (818.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


  # ifconfig enp2s0:1 192.168.1.42

  # ifconfig
  enp2s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.10.42  netmask 255.255.255.0  broadcast 192.168.10.255
        inet6 fe80::2e4d:54ff:fed8:7b59  prefixlen 64  scopeid 0x20<link>
        ether 2c:4d:54:d8:7b:59  txqueuelen 1000  (Ethernet)
        RX packets 1163011  bytes 1627147417 (1.5 GiB)
        RX errors 0  dropped 42089  overruns 0  frame 0
        TX packets 546185  bytes 41184061 (39.2 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

  enp2s0:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.42  netmask 255.255.255.0  broadcast 192.168.1.255
        ether 2c:4d:54:d8:7b:59  txqueuelen 1000  (Ethernet)

  lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 16  bytes 818 (818.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 16  bytes 818 (818.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


    ifconfig enp2s0:1 192.168.1.42
Now try

    ip addr add dev enp2s0 192.168.42.1/24
and observe how ifconfig ignores the second IP on the same interface. enp2s0:1 is an alias to enp2s0, not enp2s0 itself. To replicate your ifconfig command, you'd use

    ip addr add dev enp2s0 192.168.1.42/32 label enp2s0:1
and then it shows up in ifconfig as well.


You're right, it's not exactly the same thing and both approaches (with an alias and without) have their uses. ifconfig forcing you to come up with a unique label can be annoying when you want some script to add a new address to an interface without having to go handle the error case of "but what if <iface>:1 is already used?".

Meanwhile having an explicit label can be nice in other situations if you want to replace an address instead of merely adding a new one.

I also seem to recall running into issues dealing with IPv6 with ifconfig that I worked around by switching to ip. I don't remember the specifics however.

Overall I wish ifconfig had been updated; I still find it a lot more user-friendly than iproute. In particular, the default output of a plain "ifconfig" is vastly more readable IMO: https://svkt.org/~simias/up/20200319-165513_ip.png

In general these days I use ip in scripts and ifconfig (when available) interactively. I don't love having to remember two ways of doing the same thing, but I can't rely on ifconfig being available in scripts anymore, and I waste too much time parsing ip's bad output when I'm messing with networking interactively.
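
On the scripting side, recent iproute2 can also emit JSON, which sidesteps parsing the human-oriented output entirely. A sketch, assuming jq is available (the JSON field names here are from memory):

  # list the IPv4 addresses on enp2s0, one per line
  ip -j -4 addr show dev enp2s0 | jq -r '.[].addr_info[].local'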


Understood, thanks. I'm so used to ifconfig that I've always tried to avoid using ip. Time to learn it, I guess.


I spend a lot of time dealing with new devices. Plug one into the same VLAN as my desktop and then, in a term:

  # ip a a 192.168.1.32/24 dev eth0

Then put 192.168.1.1 into the browser, etc. Do the job. Back to the term. Up arrow to get the last command, change the second a to d (for delete), and the extra address is gone.
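
Spelled out, since ip accepts any unambiguous abbreviation of its subcommands (same example addresses as above):

  # "ip a a" above is shorthand for:
  ip address add 192.168.1.32/24 dev eth0
  # and the cleanup, "ip a d", for:
  ip address del 192.168.1.32/24 dev eth0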


The Linux approach to system call stability is awful. I've written about it extensively before; see this thread [1]. The gist of it is that a system doesn't need to place the ABI support boundary at exactly the same place as the ring3-to-ring0 boundary, and that the fact that Linux does is 1) basically shipping the org chart, and 2) harmful in all sorts of ways.

The right way to do OS ABI stability is to require that system calls go through some user-space library before executing a privilege transition. This way, the kernel's interface can change as needed without breaking existing userland binaries. Windows gets this right: system calls go through ntdll.dll. Fuchsia gets this right as well: they don't have an ntdll.dll, but they have a huge vdso that amounts to the same thing.

If I had my way, Linux would take the Fuchsia approach: require every system call go through a vdso.
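
For context, Linux already maps a vDSO into every process and routes a handful of calls (clock_gettime, gettimeofday and friends on x86-64) through it; the proposal amounts to making that the only supported entry path. The existing mapping is easy to see on any Linux box:

  # the kernel-injected vDSO appears in every process's address space
  grep vdso /proc/self/maps
  # and the dynamic linker reports it as linux-vdso.so.1 on a typical glibc system
  ldd /bin/true | grep vdso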

"But golang!" and "but I have multiple libcs!" and "but my static linking!" are not valid excuses. Windows also has Go and it also have multiple libcs and the NT kernel isn't forced to maintain decades of userland-facing crud.

As for static linking: you can statically link the world if you'd like, but you can do that and still call through a vdso or load the platform libc.

[1] https://news.ycombinator.com/item?id=20854857


How exactly does an ntdll/vdso approach make it easier to preserve binary compatibility? It obviously moves the place at which binary compatibility needs to be preserved, but I'm not seeing how that move makes maintenance easier, unless you're implying that the userspace portion would not be the responsibility of the kernel developers.


User space has a lot more flexibility than the kernel when it comes to implementing ABIs. For example, Linux has to maintain a relatively large table of system calls for legacy reasons. (Consider socketcall(2)). Legacy "system calls" provided by a VDSO or ntdll.dll-like mechanism are just tiny regular functions with no special security or performance implications.
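
Concretely: socketcall(2) is the old i386 multiplexer that funnels socket/bind/connect/etc. through a single syscall number, and it has to stay in the 32-bit table forever, while x86-64 carries per-call entries instead. A rough way to see both, assuming kernel headers are installed (header paths vary by distro):

  # the legacy multiplexer is baked into the 32-bit ABI (syscall 102 on i386)
  grep -rw __NR_socketcall /usr/include --include=unistd_32.h
  # the 64-bit table has individual entries instead
  grep -rwE '__NR_(socket|bind|connect)' /usr/include --include=unistd_64.h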


>Legacy "system calls" provided by a VDSO or ntdll.dll-like mechanism are just tiny regular functions with no special security or performance implications.

Are you saying that wrapper code implementing an old interface by wrapping around a new interface is easier to do correctly in userspace than in the kernel itself?

I think I need you to be more concrete in pointing to a specific performance penalty or avoidable maintenance burden.


> Are you saying that wrapper code implementing an old interface by wrapping around a new interface is easier to do correctly in userspace than in the kernel itself?

Yes.

> I think I need you to be more concrete in pointing to a specific performance penalty or avoidable maintenance burden.

I already did: socketcall. And NT does all 32-bit compatibility in userspace.


> I already did: socketcall

Got anything that applies to modern platforms or recent kernel versions?

Also, does socketcall actually have higher overhead than doing a similar dispatch in userspace, or does it just mean you get a cache miss slightly later in the process?

(You could try elucidating an actual argument rather than just naming a syscall and assuming that the rest of your argument is obvious.)

> And NT does all 32-bit compatibility in userspace.

So? Is it faster to do that in userspace than in the kernel?


Doing things in userspace prevents a context switch, presumably.


Not when it's just prep work before doing an unavoidable syscall.


Another place where ifconfig doesn't work properly is that it assumes network interfaces have 48-bit hardware addresses. That works fine on Ethernet but not on InfiniBand.
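
For reference, IPoIB link-layer addresses are 20 bytes rather than 6, which is what trips the fixed-size field up. A quick comparison, assuming an InfiniBand interface named ib0:

  # iproute2 prints the full 20-byte link/infiniband address
  ip link show ib0
  # net-tools ifconfig reportedly shows only a mangled 6-byte value here
  ifconfig ib0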


So, if a kernel bug exists in some interface, the only way to avoid being affected by it is to decide not to use that interface at all. Bugs become features, and "do not ever break userspace" causes cruft to accumulate.


Somewhere, the OS has to provide a stable interface. Why does it matter whether that's in userspace or in the kernel? If you have to break the interface, it seems like applications have to be adjusted and/or recompiled either way, no?


It matters if userspace and the kernel are developed by separate teams, as is the case with Linux. If the kernel interface is allowed to break and only a userspace library is stable, then you can't support static linking (which fell out of favor for a decade but is now back in style).


Right, and the article seems to have some sort of axe to grind with Linux, so the author paints "developed by separate teams" as somehow an inferior choice deserving of scorn.

I generally found the article to be light on technical accuracy and heavy on pointless "my OS is better than yours" prattle.


I don't think this is accurate. Other people may know better, but from my understanding the maintainers of ifconfig realized that updating it to use the new behavior would mean changing its output. The output of ifconfig is parsed by a lot of scripts, so changing it would break a million users, so they decided to write a new tool rather than add a flag or something to opt into the new behavior.

So it's userspace compatibility, not Linux kernel compatibility, that led to a new tool.

(And Linux will drop old interfaces if it can be proven that no one uses them, or if they're insecure.)


The indirection allows for versioned APIs and compatibility shims, and lengthens the migration window, without tying the kernel down or keeping it from fixing misdesigns or adapting to changing circumstances.


> We should look at the whole project [Linux] as a cautionary tale of the kind of leveraged destruction that some programmers of modest ability but extreme confidence can wreak on our industry.

https://news.ycombinator.com/item?id=21168895


(Generalized cliché) There are two kinds of systems: ones that everyone complains about, and ones that no one uses.

I'm also reminded of the "expert beginner" series of blog posts about jealously guarded fiefdoms and the folks that dominate them.


The people replying to that comment make a much better argument against the point it was trying to make than it does for it.



