Well I used btrfs as a root filesystem for quite a while, until I realized it wa...

buster · on Aug 2, 2017

Over the recent years on every new laptop install i switched between filesystems, so i had ext3/4, btrfs and (currently) xfs on my system. I have to say, btrfs had the most glitches (a few years back), although it worked ok'ish (no data loss).

Nowadays, i must say that i very much prefer a stable filesystem with as little complicated logic as possible. I actually never use snapshots or subtrees! I never put another disk in my laptop (where would that go?!) so i don't need to do dynamic resizing (while online of course!). All this makes the filesystems a lot more complex then it has to be. I've also run into problems using ZFS on Solaris some years back which took ~2 weeks in dtracing what the hell is going on. Of course it was related to CoW.

My lessons learned: Check your requirements. Will you really need and use subtrees/snapshots/XYZ on your system? Will you really need to do online-resizing? If not, just use a stable, simple filesystem. There are perfect usecases for ZFS or btrfs. But not everyone needs the advanced features.

lorenzfx · on Aug 2, 2017

I'm using ZFS on my FreeBSD laptop. Snapshots not only make backups safer (by making sure the complete backup is taken at the same time, and by zfs sending and receiving them), the boot environments feature also make upgrading safer.

I also really like that I don't need to partition my disk, if it turns out that /tmp needs > 10% of the disk for whatever reason: no problem!

And as I like my data, I appreciate checksumming and copy-on-write.

I haven't noticed any bad slowdowns compared to ext4 on my Debian laptop I used before.

rsync · on Aug 2, 2017

"I'm using ZFS on my FreeBSD laptop. Snapshots not only make backups safer..."

Did you know you can 'zfs send' snapshots to rsync.net ?[1][2]

[1] https://arstechnica.com/information-technology/2015/12/rsync...

[2] http://www.rsync.net/products/zfsintro.html

viraptor · on Aug 2, 2017

> Will you really need and use subtrees/snapshots/XYZ on your system?

It's a valid question, but not the best one. Almost nobody needs snapshots. But they make things easier. You most likely don't need a journaled fs in your laptop either (battery level notification should take care of the issues). But it does make life better.

"Need" is not the threshold I'm interested in. Most features, I'd like. One feature I think I do need most is scrubbing, which is still absent from most filesystems :(

buster · on Aug 2, 2017

"Need" as in "will you use it?". I played around with snapshots once and never really used them. So i clearly don't have a need for them on my laptop. Journaling on the other hand helps data safety a lot and i think it's not overly complex. I've had data loss happing in the past before journaling, but never again since then. So, wouldn't i need CoW for even better "data safety"? Maybe, but since i've never experienced data loss for so many years, i don't feel like the added complexity is worth it. On my laptop, for my usecase.

But that's only me. Your experience may differ very much :)

lomnakkus · on Aug 2, 2017

> "Need" as in "will you use it?".

Well, many of us have experienced a botched system package upgrade or two. If the file system supports snapshots, then the package manager could automatically ensure fully atomic package upgrades.

That should be reason enough, I should think.

Re: The data loss issue: Yes, I've actually have XFS completely throw away a file system upon a hard power-off + boot-up cycle. (This was ages ago, I'm sure it's improved heaps since then.)

buster · on Aug 2, 2017

Good point. I'm using Debian as my Desktop for many years. I don't remember a "botched" system package upgrade in the last 5 years, but i've probably learned over years how to handle dpkg/apt.

The atomic updating is a very interesting topic and the reason why i find ostree/guix/nixos very appealing. Note that neither ostree nor guix or nixos make use of filesystem snapshots, afaik. OSTree even documents why it won't use filesystem snapshots: https://ostree.readthedocs.io/en/latest/manual/related-proje... Debians dpkg does not use snapshots as well.

So, it's a definitely a nice-to-have, but not something i need, because i can handle dpkg/apt much better then i could handle filesystem internals.

That's sort of the point: I don't want the complexity in the filesystem, but i am fine with it in userspace. I can use snapshots on filesystem level. Or i can use other backup tools in userspace. While it's certainly neat that the filesystem can do that, i'm perfectly fine with handling backups on another level.

Another example: It's certainly neat that there are a bunch of distributed filesystems (which by the way have A LOT of complexity and often can't handle all workloads you would expect from a filesystem). But i'd rather use either an S3-like network storage or build a system that scales well without relying on Ceph/Gluster/Quobyte/etc.

For example, in a hypothetical distributed system i'd rather use Cassandra and distribute data over commodity hardware then use Ceph. I'd rather handle problems with data persistence/replication on the cassandra level then debugging on file system level. Especially, when Cassandra has a problem i'll most likely be able to access all data atleast on the filesystem level. When my filesystem is borked, i'm in a much worse situation.

lomnakkus · on Aug 2, 2017

> So, it's a definitely a nice-to-have, but not something i need, because i can handle dpkg/apt much better then i could handle filesystem internals.

I don't think you're seeing my point. You wouldn't have do anything -- it would all be done automatically as long as your file system supports snapshots.

BTW, to your "I know how to use dpkg/apt": It's not about knowledge. I could well be said to be at an "advanced" level of expertise in system maintenance, but "system upgrade" fuckups had nothing to do with me, but everything to with bad packaging and/or weird circumstances such as a dist-upgrade failing midway through because some idiot cut a cable somewhere in my neighborhood.

While Nix and the like are nice and all, they're currently suffering from a distinct lack of manpower relative to the major distributions. They also don't quite fully solve the "atomic update" problem, but that's a tangent. Then, OTOH, some of them have other advantages such as the easy of maintaining your full system config in e.g. Git. Swings and roundabouts on that front. FS support for snapshots would help everybody.

buster · on Aug 3, 2017

You're absolutely right. Still, dpkg doesn't support snapshots out of the box. I could fiddle around with it and i suppose i could make a snapshot before running "apt upgrade", but since that never failed for me, i would touch something very stable for little apparent benefit. Let's say Debian 10 will support btrfs snapshots on updates, i'll consider using btrfs for the next installation, but not before.

Did you read the link from the ostree people? Let's pretend Debian 10 offers to choose between OStree-like updates and btrfs snapshots: I'd probably choose OStree and stick to ext4/xfs.

lomnakkus · on Aug 5, 2017

> Still, dpkg doesn't support snapshots out of the box.

Yes, but it SHOULD, just because ALL REASONABLE FILE SYSTEMS SHOULD SUPPORT SNAPSHOTS. Therefore dpkg should assume that such support is avaiable, or at the very least take advantage of it, when available.

Just to reiterate: You (impersonal!), the "ignorant user", shouldn't have to even have to think about it.

Does this make my point clear?

(I'm only being this obtuse because you're saying "you're absolutely right", but apparently not seeing my point. I'm assuming it's some form of miscommunication, but it's difficult to tell.)

EDIT: Hehe, I'm sorry, that sounded much more aggressive than I intended. I just think that us software developers could and should(!) do much better by our users than we(!) currently do. My excuse is that most of my stuff is web-only, so at least I can't do the accidental equivalent of "rm -rf /", but...

buster · on Aug 6, 2017

Still, i think that my filesystem shouldn't accumulate that much complexity. Maybe a layered approach similar to Redhats Stratis is a better way.

cyphar · on Aug 2, 2017

> If the file system supports snapshots, then the package manager could automatically ensure fully atomic package upgrades.

That's exactly what openSUSE / SLE do with snapper. Every upgrade or package install with YaST/zypper creates two snapshots (before/after) and you can easily rollback to an older snapshot (even doing so from GRUB). This has been enabled by default for years.

_lqaf · on Aug 2, 2017

I have a wrapper script for apt-get to do that on several machines that run Debian unstable.

Another use for snapshots is backups. I love `zfs send` - it makes backup braindead-simple.

exikyut · on Aug 2, 2017

If you're on an SSD/MTD/NVMe, you have TRIM, and scrubbing is a no-op no matter what approach you try. You need a spinning HDD for scrubbing to be useful.

Here is one way to do simple, secure scrubbing on Linux without any intrusive system changes. It is mildly restrictive, but works.

First, you need a small, dedicated partition, but it only needs to be around 16MB or so. Resizing an existing partition down (tune2fs will happily resize a mounted ext4 filesystem, but you'll probably still need to reboot to reload the partition table once you've resized that too) will give you a bit of space.

Now you have a small area of the disk that occupies a known range of sectors, and because you have no TRIM, writes to this area will be properly deterministic. Good.

Create and mount a new filesystem without a journal on the new partition. ext2 could work here (:D), you could `mkfs.ext{3,4} -O ^has_journal`, or you could use filesystem defaults and simply overwrite the entire partition with /dev/urandom later.

Make a sparse file with fallocate (make sure the file system you create the file on can handle sparse files) that is big enough to handle the biggest file.

Create a LUKS volume with a detached header inside the new sparse file, and store the detached header metadata into a file in the new journal-less partition.

Create an ordinary filesystem inside the LUKS volume.

Now you have a Rube Goldberg sparse file. You've moved the deterministic-writing/journal-less stage into a tiny key, which is a lot easier to manage than a whole gigabytes+-large partition.

As an alternative you could drop the key onto a flash drive, and nuke the flash drive when you wanted to kill the data. That's kind of wasteful though (and it carries the same flash-drive-quality risks as copying the only copy of the data itself onto the flash drive).

LUKS was designed such that if you lose the key(s) or the detached header, all that's left is statistically random garbage.

dsr_ · on Aug 2, 2017

You seem to be using a different definition of scrubbing than people talking about filesystems usually use.

Scrubbing means to read all the data off a filesystem and compare it against its checksums, so that you are confident nothing has happened to the data (hardware failures, cosmic rays, whatever).

ZFS and btrfs have specific scrub commands that do that.

There's no scrubbing available for a system which does not keep some form of checksum/crc/hash of the data.

I think that you are talking about secure delete procedures.

exikyut · on Aug 3, 2017

OH. You're right. I got the terminology confused with, uh... shredding. Heh.

I actually tried to delete this comment for unrelated reasons shortly after posting it, but was unable to. Now I feel doubly stupid.

Maakuth · on Aug 2, 2017

It's a good practice to use LVM there between disk partitions and volumes. It has negligible performance implications but makes things very flexible when you need to resize volumes or add space. You also gain reliable snapshotting from device mapper, although that does have some performance effects.

pmlnr · on Aug 2, 2017

My requirements: transparent compression. I'm left with: btrfs or, if possible, ZFS.

eternauta3k · on Aug 2, 2017

It's ok to play with the OS in order to learn.

StreamBright · on Aug 2, 2017

We are using XFS for most of our production workloads, it turned out to be an excellent choice for most data heavy use cases. Brtfs was never an option, it is a bad idea to gamble with beta technology for data storage that a production system relies on. ext4 vs xfs is a much interesting argument, I haven't had time to follow up on this.

jerven · on Aug 2, 2017

We use XFS for sparql.uniprot.org (basically columnar database with semantic graph) there we recently retested it by accident. We use 2*4 TB consumer SSD ,ok rich consumer ;) With XFS we have 10-13% faster linear write of one big file (1.3TB) and about 20% more reads serviced per minute than EXT4. EXT4 was selected by accident instead of XFS for 1 of the 2 otherwise equal machines when upgrading these with the new 4TB SSDs instead of 1TB ones which they had before.

In general I feel that the more you go into "enterprise" storage levels, the more XFS pulls ahead from the EXT family. i.e. laptops and small servers are not where the difference lies.

joosters · on Aug 2, 2017

apt-get upgrade syncs so often that 'eatmydata' gives a noticeable speedup pretty much everywhere (I got into the habit of using it for ext3/4)

I don't get why apt syncs so often - isn't the main point of log-structured file systems their ability to recover after a crash or powerloss? If so, why should you need to sync more than one every ten seconds or so?

dsr_ · on Aug 2, 2017

apt doesn't assume that you have a reliable filesystem. It assumes that you might crash at any moment, and it would be really important for you to have a consistent view of what packages are installed when you reboot.

joosters · on Aug 2, 2017

But ext3 and more advanced filesystems have been around for almost twenty years now... it seems an odd assumption that your filesystem is unreliable on any machine that isn't completely ancient (is anyone still using ext2, for instance?)

rleigh · on Aug 2, 2017

It's not about the filesystem being "unreliable". It's about having the package manager's state checkpointed so that it can recover and resume if there is e.g. power loss or any other form of interruption at any point during package installation, upgrade, removal etc. This means having all of the updated files synched on disc plus the database state which describes it.

When you move to a more advanced setup such as ZFS clones, you could do the full upgrade with a cloned snapshot, and swap it with the original once the changes were complete. This would avoid the need for all intermediate syncs--if there's a problem, you can simply restart from the starting point and throw all the intermediate state away.

yjftsjthsd-h · on Aug 2, 2017

Debian calls itself "the universal operating system" and officially supports not only multiple init systems, but multiple kernels (kfreebsd); somehow I don't see it relying on specific filesystems.

lathiat · on Aug 2, 2017

Though true I have literally never had my system get corrupt or inconsistent during failed dpkg/apt from power loss, hang, filesystem going ro, etc. It's very reliable.

I've had older rpm & yum/dnf failures multiple times leaving me in weird inconsistent states from crashes or power losses etc. not conclusive but anecdotal experience - It's also possible it's been improved.

Meanwhile you can disable the file syncing with the apt preference dpkg::unsafe-io (google will be required for the exact syntax and file in /etc/apt - fairly sure you can cmdline it also)