A data point of one, and I only have a few TB spread over a handful of disks, but I recently lost an entire 1TB btrfs volume because of a single bad disk sector. (Of course I have backups, but restoring 1TB is a pain, not to mention nerve-racking: will the backup drive fail under the sudden heavy use?)
It's the only FS I've ever been unable to recover data from at all (I've had quite a few disks develop errors over the past 25 years), and so it's no longer on my list of filesystems-I-trust.
I've experienced data loss twice in 15 years of using Linux and trying different file systems. Both were on btrfs. I didn't use any advanced features, though I had plans to before this happened. But at that point, I cut my losses and converted all my disks back to ext4.
The features are great, but data availability/integrity is the fundamental thing I want from a filesystem. If I write some bytes, they should be there tomorrow. Everything else is secondary.
If you're interested in a more featureful filesystem that isn't btrfs and has a very solid track record, I would recommend looking at XFS. It's a very old filesystem but has a lot of quite modern features (and it performs better than ext4 for several workloads).
Funnily enough, XFS is the only file system on Linux where I've lost data. Back in the day, the wisdom was to stay away from it if you were in an iffy power situation, because it would serve zeroes if a file was being written near a power loss (i.e. you wouldn't get the old file or the new file, but something else).
Having had that happen to me, I always used some extN and never lost any existing files.
Of course that was a decade or more ago and I may be misremembering, but a cursory Google search suggests other people encountered something like it too.
Afaik the problem is that XFS trusts the hardware to do the right thing during a power loss, i.e. to stop in-flight requests before the brownout turns your disks and their controllers into heavily biased random number generators. Lots of x86(-64) hardware lacks a proper power-loss interrupt triggered early enough to stop all I/O in time. ext3's journaling hides that problem to some extent by journaling more than just the minimal metadata required.
For maybe 3-6 months a decade ago I was running XFS on my laptop. The laptop had some sort of flakiness, ISTR it was graphics related, and for a while it required frequent reboots. I remember one particular instance where, after a power cycle, a file I had been working on an hour prior was trashed. That was a real "Oh FFS!" moment and I stopped using XFS.
But I will say I've had a lot of success using XFS to serve images, particularly tiles: map tiles that are generated once and then are basically static. There it has a lot less overhead when formatted with size=2048, so lots of small files are handled better. Of course, reiserfs was even better, but part of that was at a job where they just blindly rendered map tiles for the whole earth, so there were a lot of 50-byte solid blue tiles. Reiser did really well with that, though some deduplication or linking would have been a huge benefit.
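If anyone wants to try the same, here's a minimal sketch of the formatting step, assuming size=2048 refers to the XFS block size (device name is a placeholder; check mkfs.xfs(8) on your system):
$ # Use a 2 KiB block size to cut per-file slack for lots of tiny files
$ sudo mkfs.xfs -b size=2048 /dev/sdX1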
> I would recommend looking at XFS. It's a very old filesystem but has a lot of quite modern features (and it performs better than ext4 for several workloads).
I would second this. XFS is great. A few years ago I moved from JFS to XFS for bulk data storage because JFS is pretty abandoned these days. No issues at all with XFS, and never lost any data.
Just make sure it's right for you before you use it (e.g. you cannot shrink XFS volumes as you can with ext4).
XFS has the same problem as btrfs: extreme sensitivity to metadata corruption. I'm not sure XFS has the RAID1-for-metadata (DUP) feature that btrfs recently added.
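Roughly, with placeholder device and mount point; XFS only grows, while ext4 can shrink offline:
$ # XFS: grow a mounted filesystem to fill the underlying device
$ sudo xfs_growfs /srv/data
$ # ext4: shrinking requires unmounting and an fsck first
$ sudo umount /dev/sdX1
$ sudo e2fsck -f /dev/sdX1
$ sudo resize2fs /dev/sdX1 500G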
Which RAID level were you using across those multiple disks? I've been using btrfs and never even experienced a hiccup despite continuous sector failures, continuous drive failures, and even disk controllers disconnecting on the fly. And I've been running the whole thing on a single, fairly weak server with second-hand disks attached over the cheapest USB3 SATA adapters I could find. By all accounts I've been putting btrfs in a situation that is well outside the norm, and I have 0 complaints. I also use it on my desktop (no RAID) with a similar experience (minus the shitty hardware).
I am continuously amazed at the number of people who have issues with btrfs. It's been absolutely rock solid over the time I've used it, and I have 0 complaints (apart from the ext4 -> btrfs converter producing a corrupted btrfs filesystem; the actual kernel btrfs code itself has been flawless).
I too have lost an entire btrfs volume (single disk) because of a bad sector. It seems btrfs is very sensitive to bad sectors in metadata, compared to ext4.
Since then, I've reformatted my btrfs partition to duplicate metadata (DUP, the single-device analogue of RAID1 for metadata) and haven't had problems since:
$ sudo btrfs fi df /
Data, single
System, DUP
Metadata, DUP
Perhaps this is the secret - and should be made the default.
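For anyone who wants this on an existing filesystem, something like the following should do it (mount point and device are placeholders; a balance rewrites chunks, so expect some I/O):
$ # Convert existing metadata chunks to DUP in place
$ sudo btrfs balance start -mconvert=dup /mnt
$ # Or ask for DUP metadata at mkfs time
$ sudo mkfs.btrfs -m dup /dev/sdX1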
I don't use raid, precisely because I'm worried that failures will take down the entire set, and all of it will be unrecoverable.
I'm not at all suggesting that BTRFS is "buggy" or unreliable; just that, as you say, with certain kinds of disk errors (possibly rare, but one got me), the entire volume becomes unreadable, whereas I've always been able to recover files from extX or XFS volumes.
The default for single devices is DUP metadata, and for raid1 devices the default is raid1 metadata. These are the defaults; somehow they got bypassed in your case. I think you're right that you need to think a little more and know a little more about btrfs to use it properly, but I hope that will change in the future.
From the article:
'And against the demand of some partners, we are still refusing to support “Automatic Defragmentation”, “In-band Deduplication” and higher RAID levels, because the quality of these options is not where it ought to be.'
It's a relatively new feature (Jan-2016) and it's disabled by default for SSD devices (increased wear). Personally I prefer increased wear over unmountable volumes.
I have the same experience. btrfs has been great to me. I've disconnected disks on the fly as well on RAID 1, and the mount happily kept serving files off the remaining disks.
Some of us are very concerned about licensing at all times. I'd wager such steadfast concern is ultimately what helped Linux in SCO v. IBM over IBM's JFS file system.
ZFS is very much in a grey area, and in turn, a huge turn off for many.
And if so, what about all those proprietary GPU-driver kernel-modules from AMD and Nvidia which definitely are not compatible with the GPL? Are they a gray area too?
And if they're not a gray area... Why is it suddenly a problem that the ZFS kernel-module is not licensed in a GPL-compatible manner?
I'm not trying to sound facetious, but I honestly can't see the distinction here. And Linux distros left and right have been distributing closed-source kernel-modules for GPUs for a long time now. So what's the problem? What am I missing?
Yes, the proprietary drivers have always been an ethical and legal grey area. However, the way a proprietary driver is distributed is different to how ZFS is distributed. Proprietary drivers are distributed as object code that is then built to be a kernel module on the user's machine. This means that at no point does Nvidia or AMD distribute a Linux kernel module with proprietary code. There are arguments however that distributions which distribute this auto-build scheme by default may also be in violation of the GPL. ZFS is distributed by Canonical as a fully functional kernel module.
There's a lot of gray areas because there have not been many legal cases on derived works and the GPL. The GPL itself has held up in court on copyleft grounds, of course.
Also, you've got the fact that in the case of proprietary graphics drivers, the threat is that the Linux kernel community would sue Nvidia or AMD. The threat with ZFS is that Oracle (who is a member of the Linux kernel community) would sue people using ZFS (they could also then sue for patent infringement).
I know in the past NixOS did the same for the ZFS kernel module: prior to NixOS installation, it was able to download the ZFS source code, build it and then load it from inside the installer Live CD/USB while it was running, and this would require just a couple of commands. I don't know if this is still true nowadays or if they just distribute the ZFS kernel module directly in the Live installer.
If combining GPL and CDDL code in a redistributed file is a violation of the GPL but not the CDDL, then what would the holder of the copyright on the CDDL code be able to sue about? I think the concern is that a contributor to Linux might sue, just like in the Nvidia/AMD case.
Oracle is a contributor to Linux, so they could sue from the GPL side. While Nvidia and AMD are also Linux contributors, the fact that they ship proprietary modules would make it hard to argue that they aren't implicitly permitting users to redistribute them (and thus they would be forced to license their drivers under GPLv2).
Not to mention that ZFS is covered by Oracle patents. CDDL provides a patent license, but it might be possible for Oracle to sue you for patent infringement if you're distributing code in a way that complicates the licensing. Not that I'm saying that's likely, but Oracle has enough money to ruin you if they want to.
The crucial difference is in distribution, not in usage.
AMD and Nvidia distribute a binary module with a thin shim. It is the user who builds this shim and inserts it into their kernel. Neither AMD, Nvidia nor any Linux vendor[1] distributes any binary that links into the GPL'd kernel. The combination is done by the user.
And so does the ZoL project. They have the same trade-offs as AMD and Nvidia. It is just much more difficult to install Linux on a filesystem not supported by the installer than to go without working 3D acceleration at setup time, though.
[1] the distributions take care to keep them in separate, third-party repos, and if you look more closely, you will find they are built on the user's machine using dkms, akmod or a similar mechanism. That also throws a wrench into things if you want to use Secure Boot.
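For the curious, the dkms flow looks roughly like this (module name and version are made up for illustration):
$ # Register the source tree, then build and install against the running kernel
$ sudo dkms add -m nvidia -v 470.00
$ sudo dkms build -m nvidia -v 470.00
$ sudo dkms install -m nvidia -v 470.00
$ # Show what's built for which kernels
$ dkms status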
Publish the source code files only under the GPL, and Linux kernel developers will be happy.
You might get into trouble with the ZFS side, however, since the CDDL license says: "must also be made available in Source Code form and that Source Code form must be distributed only under the terms of this License".
If the Source Code form is only under the terms of the GPL, then the above condition is not met. The lawsuit would thus come from the ZFS side.
This is resolvable if you think of a driver "shim" that's a derivative work of both a GPL'd codebase (e.g. Linux) and a non-GPL'd codebase (GPU drivers). It's possible for the GPU driver's proprietary license to permit linking without license restrictions, meaning that a "shim" kernel module which loads the binary blob can be under the kernel's copyleft license without problem (since while it's a derivative work of both the module and the blob, only one license is copyleft).
(I don't know if this is what AMD/Nvidia actually do ToS-wise, but it's at least feasible).
Doing this for ZFS is harder, since the shim would have to be a derivative work of both a GPL'd codebase and a CDDL'd codebase, whose licenses are incompatible with one another. Dual-licensing is probably illegal, as is just picking one over the other.
IANAL, though, and I might be grossly misunderstanding the problem.
Using a shim as a legal workaround seems to me like a trick that is unlikely to ever get past a judge. In all the other forms in which people attempt legal workarounds to turn the illegal into the legal, the law doesn't generally take kindly to it, and either explicitly forbids the trick or just treats it as inconsequential to the end result.
For example, we have shims in the form of dummy organizations, straw owners, money launderers and so on. None of those kinds of shims work to turn something illegal into something legal. Why would some dummy code that sits between two incompatible pieces of software work any better in the legal system?
"Neither of those kind of shims work to turn something illegal to legal."
There's nothing illegal here to try to turn legal, though. If the blob's license permits other software to interface with it without restriction, then the shim is free to be under the GPL per the terms of the other work from which it's derived (Linux).
The only way there'd be anything illegal here would be if the blob itself is a derivative work of Linux, and if that's the case, then the blob itself is illegal (since it would be subject to the GPL), regardless of the shim and regardless of whether or not it's distributed as part of a Linux distribution.
Take a dummy organization that is used to avoid taxes. There is nothing illegal about paying someone to form a company located in a different country. A company can also sell assets to another for whatever price it wants, and there is nothing illegal about declaring that, as a result, one of the companies now has zero profits and zero assets.
But whether something is legal or illegal depends on more than just each piece in isolation. If combining a blob with the kernel creates a derivative work, then just adding a shim to the mix won't turn it legal. Judges are trained to look at the big picture rather than the individual parts, and this is especially true in civil-law systems. Court systems are generally interested in:
- What the kernel author's intention was with their copyright license.
- What the blob author's intention was with their copyright license.
- What the distributor's intention was when combining the two works, and what understanding the distributor had of the other copyright holders and their wishes in regard to derivatives.
The existence of a shim that has no other purpose is a sign that something shady is going on, similar to dummy organizations. The question a judge is likely to ask is why such obfuscation was used, since that can help establish intention. If the end result is a de-facto derivative, the intention was to create a single work out of two separate works, and the creator knew they were not allowed to create a derivative work, then a shim is not going to save the day.
"Take a dummy organization that is used to avoid taxes."
See, that right there is where the analogy falls apart. Tax evasion is (usually) illegal. Writing software to translate between two independently-developed programs is not, last I checked.
"If the end result is a de-facto derivative"
You'd need to prove the blob is derived from Linux. If it's not, then there's literally nothing illegal happening here. If it is, then - again - the shim is not the illegal thing; the blob is, and the shim is entirely irrelevant in that illegality.
The usual reason for the shims, by the way, has very little to do with licensing terms (at least not directly) and very much to do with the fact that Linux does not provide a stable API (let alone ABI) for kernel modules. The intent of the shim is therefore almost always technical rather than legal in nature.
>ZFS is very much in a grey area, and in turn, a huge turn off for many.
There is a grey area in distribution, there isn't one about using ZFS. Even the FSF, the group who believe the CDDL and GPL can't be distributed together, say "Privately, You Can Do As You Like."
Who deploys production systems on a large scale with a "private" build of the filesystem code? I want my production systems to run on code that is being used by as many people as possible; I don't want patched kernels, I don't want privately built kernel packages, I don't want a unique system that only I've ever seen. I want a system that is as boring as possible (while still providing the functionality I need to effectively do my job). I want a system with a bunch of people complaining about it and asking questions about it online, so that when problems arise, I can find answers.
Now that ZFS is available on Ubuntu and seems to have some adoption, I guess it's a reasonable choice for some. I'm still a bit iffy on it. I don't really want to add license concerns to my list of worries.
The CDDL also has patent clauses and so it's conceivable that a user of OpenZFS which received it in a way that violates the OpenZFS license could be liable for patent infringement of an Oracle patent. And there have been many cases of companies suing users of software over patents.
Another issue is that you should always get software like your filesystem from your distribution. We do a lot of work making sure that your systems can be safely updated, and making sure that upstream bugs are fixed for our distribution. Even community distributions put a lot of effort into that work. As someone who works on maintaining a distribution (I work for SUSE and contribute to openSUSE), I would guess that most people underestimate how much work you need to devote to maintaining the software in a distribution.
>The CDDL also has patent clauses and so it's conceivable that a user of OpenZFS which received it in a way that violates the OpenZFS license could be liable for patent infringement of an Oracle patent. And there have been many cases of companies suing users of software over patents.
Again, with usage this isn't a problem, the license could only possibly be broken by distribution with the GPL. Even then, I believe it is the GPL that is broken, so the patent clause would remain.
As for the rest of your argument, the OpenZFS team does a lot of work maintaining the filesystem. Why does that work need to come from you?
> As for the rest of your argument, the OpenZFS team does a lot of work maintaining the filesystem. Why does that work need to come from you?
Integration into our tools, backporting fixes, doing release engineering, tracking upstream changes, triaging and resolving distribution bug reports, documenting usage and troubleshooting, configuring defaults and best practices, a whole lot of testing, etc.
As I said, there's a lot of work that goes into a distribution (I probably haven't covered most of it) that most people don't think about. And that's assuming that a distribution is going to be passive about something as core as a filesystem -- which we wouldn't be. So we'd be working with upstream on development as well, which is more work. So saying something like "it's supported on distribution X" when that distribution doesn't even provide official packages for it is a massive stretch. It might work on distribution X, and you might provide independent ISV-style support for it, but it's not supported by us.
I appreciate that the sort of work distributions do isn't well-publicised (mostly because stability is hardly a sexy thing to blog about, and we don't rewrite things in JavaScript every weekend). But there is an incredible amount of work that goes into making distributions work well for users, and there's a reason that many distributions have lasted for so many years (there's a need for someone to do the boring work of packaging for you).
If I know that Canonical can't legally distribute ZFS in whatever format to me, and yet I use Canonical's distribution of ZFS, isn't there a legal risk there? After all, it would turn out that I have no license to use said distribution of ZFS as such a license was never conferred to me by someone with the legal right to do so.
Generally speaking, courts would probably give me the benefit of the doubt if I had no reason to believe that they couldn't distribute it to me - but as I knew they couldn't (the issues with ZFS and the Linux kernel are well-documented), and I knew I'm using it, they'd probably hold me in violation of copyright.
>If I know that Canonical can't legally distribute ZFS in whatever format to me, and yet I use Canonical's distribution of ZFS, isn't there a legal risk there?
If you somehow knew that any form of distribution was illegal, that would be the case. I haven't heard anyone saying that's so; distributing it bundled with GPL software potentially breaks the GPL.
Well, two things. He didn't say that; he said it's been PORTED to Linux for quite some time. Which is accurate: the first "stable" release was in 2013, and 4 years qualifies as "quite some time" in pretty much any tech circle that isn't VMS.
As for it not being widely available... you're saying the only way for something to be considered widely available is to be included in the distro directly? I'd argue an easy installation/setup and solid documentation are FAR more important than being included in a distro. If the setup is arcane or the documentation is horrible, it doesn't matter if a tool is in every distro on the planet, nobody is going to use it.
I've been using ZFS on Linux for 5+ years. I know a lot of people wrote off zfs+fuse, but I used it very successfully to store tens of TB of backup data for years, with no data-loss events and performance that, as far as I could tell, was no worse than ZFS on Linux. And I've been using ZoL for years at another job.
It has "official support" from the ZoL folks. And yes, openSUSE has ZFS packages in OBS. But we sure as hell don't ship them by default, or in our official repos. The same applies for Arch, Debian, Fedora, Gentoo, RHEL and CentOS.
Ubuntu is the only distribution that officially supports ZoL and actually ships it in its official repositories (and by default). What that means is that Canonical is effectively saying "we trust there's no legal reason why we cannot do this." No other distribution has made that claim.
EDIT: Actually NixOS also supports it, but the point stands.
They're referring to ZoL providing support for distributions (which actually just means "it works, and if you send a bug we'll work on it"). Only Canonical provides support from the distribution side. See https://news.ycombinator.com/item?id=15088761.
BTRFS has much more flexible snapshots and clones than ZFS. You can create rewritable snapshots and create new snapshots based on those. In addition you can create COW copies of files with "cp --reflink" which ZFS doesn't support.
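For example (paths are made up):
$ # COW copy: shares extents with the original until either file is modified
$ cp --reflink=always big.img big-copy.img
$ # Writable snapshot of a subvolume (pass -r for read-only)
$ sudo btrfs subvolume snapshot /data /data-snap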
BTRFS feels much more like a native filesystem on Linux too. It doesn't have that big ARC cache.
We've been using BTRFS for 5+ yrs on a Linux MD RAID setup, with no problems at all.
> You can create rewritable snapshots and create new snapshots based on those.
You can do this with ZFS as well.
First, there is no such thing as a "rewritable snapshot"; the term is an oxymoron. Please do not use it and please do not promulgate it. The term "snapshot" should be reserved for read-only states of a (file) system at a particular point in time.
As long as the clone exists, though, the snapshot cannot be deleted. However, you can "promote" the clone, after which the snapshot can be removed. This feature has been around since at least 2010.
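The usual sequence, with a made-up pool/dataset name:
$ # Snapshot, then clone it into a writable filesystem
$ sudo zfs snapshot tank/data@monday
$ sudo zfs clone tank/data@monday tank/data-work
$ # Promote the clone; the origin dependency is reversed
$ sudo zfs promote tank/data-work
$ # Now the original dataset can be destroyed if desired
$ sudo zfs destroy tank/data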
I was aware of the promotion feature. I was creating new clones/snapshots in a chain hierarchy in zfs, copying old backup sets progressively onto the back of the chain, but keeping the head as the current copy. This was a breeze in btrfs, but basically impossible in zfs as it refused to promote the old clones/snapshots.
As to the nomenclature, it doesn't seem to make sense to differentiate snapshots and clones. With the flexibility of btrfs they're just the same thing with a R/O flag.
> As to the nomenclature, it doesn't seem to make sense to differentiate snapshots and clones. With the flexibility of btrfs they're just the same thing with a R/O flag.
It does make sense: one is writable and the other is not. When someone is talking about (e.g.) mitigation mechanisms against ransomware, saying you have "snapshots" is meaningless if they're R/W, as the ransomware can go in and overwrite files. But if you use the term "snapshot" correctly--meaning R/O--everyone involved knows you have mitigated the risk, since the data is safe from being altered and reverting is possible.
It's not the "same thing" if there is a difference between the two--which there is, the R/O flag setting. If two things are different then they are not the "same": this may seem tautological, but it's not. Call different things by different names.
The btrfs CLI is really braindead in this regard: a "snapshot" is not R/O by default, which violates decades of expectations and POLA.
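For reference, you have to ask for read-only explicitly (subvolume paths are illustrative):
$ # Without -r this snapshot would be writable
$ sudo btrfs subvolume snapshot -r /data /snapshots/data-ro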
The term "clone" has been used in the storage industry for several decades now. It has a very specific meaning, as does the term "snapshot". I understand it might not make sense to you personally, but you're just going to confuse anyone you're talking to referring to it by another name. You might as well start calling containers "light virtual machines". You'll get an equal number of blank stares and confusion from whomever you're talking to.
OpenSolaris is effectively dead, but its successor is the illumos project -- which has multiple distributions such as SmartOS. So you could use SmartOS+ZFS.
I love ZFS, but I'll also say that I feel like ZFS is fairly slow. Of course, it probably doesn't help that most of my ZFS machines hold 10+TB of data, with lots of snapshots.
I don't feel like I'm going to be using btrfs anytime soon, I've given up on it. But there are days I wish I had an alternative for high reliability, snapshotting, and ideally deduplication (which usually tends to make my ZFS machines fall over).
Synology DSM is based on their own custom Linux. Nothing's changed there recently, except that with 6.x released last year, btrfs became the default file system (used to be ext4).
Same here. I lost a disk to btrfs and did not manage to recover any data. No longer on my list of filesystems-I-trust. I guess Suse is the biggest contributor since the others stopped? It doesn't give me any confidence that it can be trusted.
I had a similar experience around 5-6 years ago. One of my notebooks running Ubuntu was force-rebooted and btrfs got corrupted, with no way to boot the system again.