Some of us are very concerned about licensing at all times. I'd wager such steadfast concern is ultimately what helped Linux in SCO v. IBM, in the dispute over IBM's JFS file system.
ZFS is very much in a grey area, and in turn, a huge turn off for many.
And if so, what about all those proprietary GPU-driver kernel-modules from AMD and Nvidia which definitely are not compatible with the GPL? Are they a gray area too?
And if they're not a gray area... Why is it suddenly a problem that the ZFS kernel-module is not licensed in a GPL-compatible manner?
I'm not trying to sound facetious, but I honestly can't see the distinction here. And Linux distros left and right have been distributing closed-source kernel-modules for GPUs for a long time now. So what's the problem? What am I missing?
Yes, the proprietary drivers have always been an ethical and legal grey area. However, the way a proprietary driver is distributed is different to how ZFS is distributed. Proprietary drivers are distributed as object code that is then built into a kernel module on the user's machine. This means that at no point does Nvidia or AMD distribute a Linux kernel module with proprietary code. There are arguments, however, that distributions which ship this auto-build scheme by default may also be in violation of the GPL. ZFS is distributed by Canonical as a fully functional kernel module.
There are a lot of gray areas because there have not been many legal cases on derived works and the GPL. The GPL itself has held up in court on copyleft grounds, of course.
Also, you've got the fact that in the case of proprietary graphics drivers, the threat is that the Linux kernel community would sue Nvidia or AMD. The threat with ZFS is that Oracle (who is a member of the Linux kernel community) would sue people using ZFS (they could also then sue for patent infringement).
I know in the past NixOS did the same for the ZFS kernel module: prior to NixOS installation, it was able to download the ZFS source code, build it and then load it from inside the installer Live CD/USB while it was running, and this would require just a couple of commands. I don't know if this is still true nowadays or if they just distribute the ZFS kernel module directly in the Live installer.
If combining GPL and CDDL code in a redistributed file is a violation of the GPL but not the CDDL, then what would the holder of the copyright of the CDDL code be able to sue about? I think the concern is that a contributor to Linux might sue, just like in the Nvidia/AMD case.
Oracle is a contributor to Linux, so they could sue from the GPL side. While Nvidia and AMD also are Linux contributors, the fact they ship proprietary modules would make it hard to argue that they aren't implicitly permitting users to redistribute it (and thus they would be forced to license their drivers under GPLv2).
Not to mention that ZFS is covered by Oracle patents. CDDL provides a patent license, but it might be possible for Oracle to sue you for patent infringement if you're distributing code in a way that complicates the licensing. Not that I'm saying that's likely, but Oracle has enough money to ruin you if they want to.
The crucial difference is in distribution, not in usage.
AMD and Nvidia distribute a binary module with a thin shim. It is the user who builds this shim and inserts it into their kernel. Neither AMD, Nvidia nor any Linux vendor[1] distributes any binary that links into the GPL-ed kernel. The combination is done by the user.
And so does the ZoL project. They have the same trade-offs as AMD and Nvidia, though it is much more difficult to install Linux on a filesystem the installer does not support than to go without working 3D acceleration at setup time.
[1] the distributions take care to have them in separate, third party repos and if you look more closely, you will find they are being built on the user's machine using dkms, akmod or a similar mechanism. That also throws a wrench into things if you want to use Secure Boot.
Publish the source code files only under GPL and Linux kernel developers will be happy.
You might get into trouble with the ZFS side however, since the CDDL license says: "must also be made available in Source Code form and that Source Code form must be distributed only under the terms of this License".
If the Source Code form is only under the terms of the GPL, then the above condition is not met. The lawsuit would thus come from the ZFS side.
This is resolvable if you think of a driver "shim" that's a derivative work of both a GPL'd codebase (e.g. Linux) and a non-GPL'd codebase (GPU drivers). It's possible for the GPU driver's proprietary license to permit linking without license restrictions, meaning that a "shim" kernel module which loads the binary blob can be under the kernel's copyleft license without problem (since while it's a derivative work of both the kernel and the blob, only one license is copyleft).
(I don't know if this is what AMD/Nvidia actually do ToS-wise, but it's at least feasible).
Doing this for ZFS is harder, since the shim would have to be a derivative work of both a GPL'd codebase and a CDDL'd codebase, whose licenses are not compatible with one another. Dual-licensing is probably illegal, as is just picking one over the other.
IANAL, though, and I might be grossly misunderstanding the problem.
Using a shim as a legal workaround seems to me like a trick that is unlikely to ever get past a judge. In every other area where people try a workaround to turn something illegal into something legal, the law doesn't generally take kindly to it and either explicitly forbids it or just treats the trick as inconsequential to the end result.
For example, we have shims in the form of dummy organizations, straw owners, money launderers and so on. None of those kinds of shims works to turn something illegal into something legal. Why would some dummy code that sits between two incompatible pieces of software work any better in the legal system?
"None of those kinds of shims works to turn something illegal into something legal."
There's nothing illegal here to try to turn legal, though. If the blob's license permits other software to interface with it without restriction, then the shim is free to be under the GPL per the terms of the other work from which it's derived (Linux).
The only way there'd be anything illegal here would be if the blob itself is a derivative work of Linux, and if that's the case, then the blob itself is illegal (since it would be subject to the GPL), regardless of the shim and regardless of whether or not it's distributed as part of a Linux distribution.
Take a dummy organization that is used to avoid taxes. There is nothing illegal about paying someone to form a company located in a different country. A company can also sell assets to another for whatever price they want, and there is nothing illegal about declaring that, as a result, one of the companies now has zero profits and zero assets.
But whether something is legal or illegal depends on more than each individual piece. If combining a blob with the kernel creates a derivative, then just adding a shim to the mix won't turn it legal. Judges are trained to look at the big picture rather than the individual parts, and this is especially true in civil law. Court systems are generally interested in:
What the kernel authors' intention is, as expressed by their copyright license.
What the blob authors' intention is, as expressed by their copyright license.
What the distributor's intention is when combining the two works, and what understanding the distributor had about the other copyright authors and their wishes in regard to derivatives.
The existence of a shim that has no other purpose is a sign that something shady is going on, similar to dummy organizations. The question a judge is likely to ask is why such obfuscation was used, since that can help establish intention. If the end result is a de-facto derivative, and the intention was to create a single work out of two separate works, and the creator knows that they are not allowed to create a derivative work, then a shim is not going to save the day.
"Take a dummy organization that is used to avoid taxes."
See, that right there is where the analogy falls apart. Tax evasion is (usually) illegal. Writing software to translate between two independently-developed programs is not, last I checked.
"If the end result is a de-facto derivative"
You'd need to prove the blob is derived from Linux. If it's not, then there's literally nothing illegal happening here. If it is, then - again - the shim is not the illegal thing; the blob is, and the shim is entirely irrelevant in that illegality.
The usual reason for the shims, by the way, has very little to do with licensing terms (at least not directly) and very much to do with the fact that Linux does not provide a stable API (let alone ABI) for kernel modules. The intent of the shim is therefore almost always technical rather than legal in nature.
>ZFS is very much in a grey area, and in turn, a huge turn off for many.
There is a grey area in distribution; there isn't one in using ZFS. Even the FSF, the group who believe the CDDL and GPL can't be distributed together, say "Privately, You Can Do As You Like."
Who deploys production systems on a large scale with a "private" build of the filesystem code? I want my production systems to run on code that is being used by as many people as possible; I don't want patched kernels, I don't want privately built kernel packages, I don't want a unique system that only I've ever seen. I want a system that is as boring as possible (while still providing the functionality I need to effectively do my job). I want a system with a bunch of people complaining about it and asking questions about it online, so that when problems arise, I can find answers.
Now that ZFS is available on Ubuntu and seems to have some adoption, I guess it's a reasonable choice for some. I'm still a bit iffy on it. I don't really want to add license concerns to my list of worries.
The CDDL also has patent clauses and so it's conceivable that a user of OpenZFS which received it in a way that violates the OpenZFS license could be liable for patent infringement of an Oracle patent. And there have been many cases of companies suing users of software over patents.
Another issue is that you should always get software like your filesystem from your distribution. We do a lot of work making sure that your systems can be safely updated, and making sure that upstream bugs are fixed for our distribution. Even community distributions put a lot of effort into that work. As someone who works on maintaining a distribution (I work for SUSE and contribute to openSUSE), I would guess that most people underestimate how much work you need to devote to maintaining the software in a distribution.
>The CDDL also has patent clauses and so it's conceivable that a user of OpenZFS which received it in a way that violates the OpenZFS license could be liable for patent infringement of an Oracle patent. And there have been many cases of companies suing users of software over patents.
Again, with usage this isn't a problem: the license could only possibly be broken by distribution together with GPL code. Even then, I believe it is the GPL that is broken, so the patent clause would remain.
As for the rest of your argument, the OpenZFS team does a lot of work maintaining the filesystem. Why does that work need to come from you?
> As for the rest of your argument, the OpenZFS team does a lot of work maintaining the filesystem. Why does that work need to come from you?
Integration into our tools, backporting fixes, doing release engineering, tracking upstream changes, triaging and resolving distribution bug reports, documenting usage and troubleshooting, configuring defaults and best practices, a whole lot of testing, etc.
As I said, there's a lot of work that goes into a distribution (I probably haven't covered most of it) that most people don't think about. And that's assuming that a distribution is going to be passive about something as core as a filesystem -- which we wouldn't be. So we'd be working with upstream on development as well, which is more work. So saying something like "it's supported on distribution X" when that distribution doesn't even provide official packages for it is a massive stretch. It might work on distribution X, and you might provide independent ISV-style support for it, but it's not supported by us.
I appreciate that the sort of work distributions do isn't well-publicised (mostly because stability is hardly a sexy thing to blog about, and we don't rewrite things in JavaScript every weekend). But there is an incredible amount of work that goes into making distributions work well for users, and there's a reason that many distributions have lasted for so many years (there's a need for someone to do the boring work of packaging for you).
If I know that Canonical can't legally distribute ZFS in whatever format to me, and yet I use Canonical's distribution of ZFS, isn't there a legal risk there? After all, it would turn out that I have no license to use said distribution of ZFS as such a license was never conferred to me by someone with the legal right to do so.
Generally speaking, courts would probably give me the benefit of the doubt if I had no reason to believe that they couldn't distribute it to me - but as I knew they couldn't (the issues with ZFS and the Linux kernel are well-documented), and I knew I'm using it, they'd probably hold me in violation of copyright.
>If I know that Canonical can't legally distribute ZFS in whatever format to me, and yet I use Canonical's distribution of ZFS, isn't there a legal risk there?
If you somehow knew that any form of distribution was illegal, that would be the case. I haven't heard anyone say that, though; the claim is only that distributing it bundled with GPL software potentially breaks the GPL.
Well, two things. He didn't say that, he said it's been PORTED to Linux for quite some time. Which is accurate, the first "stable" release was in 2013 - 4 years qualifies as "quite some time" in pretty much any tech circle that isn't VMS.
As for it not being widely available... you're saying the only way for something to be considered widely available is to be included in the distro directly? I'd argue an easy installation/setup and solid documentation are FAR more important than being included in a distro. If the setup is arcane or the documentation is horrible, it doesn't matter if a tool is in every distro on the planet, nobody is going to use it.
I've been using ZFS on Linux for 5+ years. I know a lot of people wrote off zfs+fuse, but I used it very successfully to store tens of TB of backup data for years with no data loss events, and with performance that, as far as I could tell, was no worse than ZFSonLinux. And I've been using ZoL for years at another job.
It has "official support" from the ZoL folks. And yes, openSUSE has ZFS packages in OBS. But we sure as hell don't ship them by default, or in our official repos. The same applies for Arch, Debian, Fedora, Gentoo, RHEL and CentOS.
Ubuntu is the only distribution that officially supports ZoL, and actually ships it in its official repositories (and by default). What that means is that Canonical is effectively saying "we trust there's no legal reason why we cannot do this." No other distribution has made that claim.
EDIT: Actually NixOS also supports it, but the point stands.
They're referring to ZoL providing support for distributions (which actually just means "it works, and if you send a bug we'll work on it"). Only Canonical provides support from the distribution side. See https://news.ycombinator.com/item?id=15088761.
BTRFS has much more flexible snapshots and clones than ZFS. You can create rewritable snapshots and create new snapshots based on those. In addition you can create COW copies of files with "cp --reflink" which ZFS doesn't support.
BTRFS feels much more like a native filesystem on Linux too. It doesn't have that big ARC cache.
We've been using BTRFS for 5+ yrs on a Linux MD RAID setup, with no problems at all.
> You can create rewritable snapshots and create new snapshots based on those.
You can do this with ZFS as well.
First, there is no such thing as a "rewritable snapshot"; the term is an oxymoron. Please do not use it and please do not promulgate it. The term "snapshot" should be reserved for read-only states of a (file) system at a particular point in time.
As long as the clone exists though, the snapshot cannot be deleted. However you can "promote" the clone, after which the snapshot can be removed. This feature has been around since at least 2010.
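To make the clone/promote dependency described above concrete, here is a toy in-memory sketch in Python. It is not a ZFS API: the `Pool` class, its method names and the dataset names are all made up for illustration. It assumes the behaviour described in the `zfs promote` documentation, where promotion reverses the dependency so that the promoted clone owns the snapshot and the original filesystem becomes the dependent one.

```python
class DependencyError(Exception):
    """Raised when something still has dependents and cannot be destroyed."""


class Pool:
    """Toy model of snapshot/clone dependencies (hypothetical, not a ZFS API)."""

    def __init__(self):
        self.filesystems = {}  # filesystem name -> origin snapshot name, or None
        self.snapshots = {}    # snapshot name -> name of the owning filesystem

    def create(self, fs):
        self.filesystems[fs] = None

    def snapshot(self, fs, snap):
        self.snapshots[snap] = fs

    def clone(self, snap, new_fs):
        # A clone records the snapshot it was created from as its origin.
        self.filesystems[new_fs] = snap

    def clones_of(self, snap):
        return [fs for fs, origin in self.filesystems.items() if origin == snap]

    def destroy_snapshot(self, snap):
        if self.clones_of(snap):
            raise DependencyError(f"{snap} still has clones: {self.clones_of(snap)}")
        del self.snapshots[snap]

    def destroy_filesystem(self, fs):
        owned = [s for s, owner in self.snapshots.items() if owner == fs]
        if any(self.clones_of(s) for s in owned):
            raise DependencyError(f"{fs} owns snapshots that still have clones")
        for s in owned:
            del self.snapshots[s]
        del self.filesystems[fs]

    def promote(self, clone_fs):
        snap = self.filesystems[clone_fs]
        if snap is None:
            raise ValueError(f"{clone_fs} is not a clone")
        old_owner = self.snapshots[snap]
        # Reverse the dependency: the clone takes over the snapshot (and the
        # old owner's origin), and the old owner becomes a clone of the snapshot.
        self.snapshots[snap] = clone_fs
        self.filesystems[clone_fs] = self.filesystems[old_owner]
        self.filesystems[old_owner] = snap


pool = Pool()
pool.create("data")
pool.snapshot("data", "monday")
pool.clone("monday", "work")

try:
    pool.destroy_snapshot("monday")   # refused: "work" depends on it
except DependencyError as err:
    print("refused:", err)

pool.promote("work")                  # "work" now owns "monday"
pool.destroy_filesystem("data")       # the old original is now the dependent one
pool.destroy_snapshot("monday")       # no clones left, so this succeeds
```

Note that in this model the snapshot only becomes removable once the old original filesystem is gone, since after promotion it is the one that depends on the snapshot.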
I was aware of the promotion feature. I was creating new clones/snapshots in a chain hierarchy in zfs, copying old backup sets progressively onto the back of the chain, but keeping the head as the current copy. This was a breeze in btrfs, but basically impossible in zfs as it refused to promote the old clones/snapshots.
As to the nomenclature, it doesn't seem to make sense to differentiate snapshots and clones. With the flexibility of btrfs they're just the same thing with a R/O flag.
> As to the nomenclature, it doesn't seem to make sense to differentiate snapshots and clones. With the flexibility of btrfs they're just the same thing with a R/O flag.
It does make sense: one is writable and the other is not. When someone is talking about (e.g.) mitigation mechanisms against ransomware, saying you have "snapshots" is meaningless if they're R/W, as the ransomware can go in and overwrite files. But if you use the term "snapshot" correctly--meaning R/O--everyone involved knows you have mitigated the risk since the data is safe from being altered and reverting is possible.
It's not the "same thing" if there is a difference between the two--which there is, the R/O flag setting. If two things are different then they are not the "same": this may seem tautological but it's not. Call different things differently.
The btrfs CLI is really badly designed in this regard, in that a "snapshot" is not R/O by default, which violates decades of expectations and POLA.
The term "clone" has been used in the storage industry for several decades now. It has a very specific meaning, as does the term "snapshot". I understand it might not make sense to you personally, but you're just going to confuse anyone you're talking to referring to it by another name. You might as well start calling containers "light virtual machines". You'll get an equal number of blank stares and confusion from whomever you're talking to.
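The snapshot/clone distinction argued for above (a snapshot is read-only, a clone is a writable copy seeded from it) can be sketched with a small Python toy. This is only an illustration of the terminology, not how any filesystem implements it; `take_snapshot`, `make_clone` and the file names are hypothetical.

```python
from types import MappingProxyType


def take_snapshot(files):
    """Return a read-only, point-in-time copy of a name -> contents mapping."""
    return MappingProxyType(dict(files))


def make_clone(snapshot):
    """Return a writable copy seeded from a snapshot."""
    return dict(snapshot)


live = {"report.txt": "quarterly numbers"}
snap = take_snapshot(live)

# Something (e.g. ransomware) overwrites the live data...
live["report.txt"] = "ENCRYPTED"

# ...but it cannot do the same to the snapshot.
try:
    snap["report.txt"] = "ENCRYPTED"
except TypeError:
    print("snapshot is read-only")

# Reverting means starting again from a writable clone of the snapshot.
live = make_clone(snap)
```

If the "snapshot" were writable, the `try` block would silently succeed and there would be nothing safe left to revert to, which is the point being made about ransomware.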
OpenSolaris is effectively dead, but its successor is the illumos project -- which has multiple distributions such as SmartOS. So you could use SmartOS+ZFS.
I love ZFS, but I'll also say that it feels fairly slow to me. Of course, it probably doesn't help that most of my ZFS machines hold 10+TB of data, with lots of snapshots.
I don't feel like I'm going to be using btrfs anytime soon; I've given up on it. But there are days I wish I had an alternative for high reliability, snapshotting, and ideally deduplication (which usually tends to make my ZFS machines fall over).
Synology DSM is based on their own custom Linux. Nothing's changed there recently, except that with 6.x released last year, btrfs became the default file system (used to be ext4).