There are plenty of instances of this bug being brought up on the mailing list. One of them is already linked elsewhere in this discussion, and the btrfs status page (also linked from this discussion) has further mailing list links.
Basically, btrfs doesn't want to allow a writeable mount when it might be missing some data. If there's some data on the FS that isn't stored with the RAID1 profile, then the kernel can't safely assume that the missing drive didn't have more chunks like that, holding data that wasn't mirrored on one of the surviving drives. But it's currently not possible to convert from RAID1 to non-RAID or to rebuild the array with a replacement without mounting the degraded array as writeable, which leads to non-RAID data being written. That puts the FS in a state that cannot be automatically judged safe at mount time, and the FS remains in that state until the recovery is complete (either converting from RAID1 to non-RAID, or replacing the failed drive).
There's no easy way to require the user to specify at the time of the `mount -o degraded,rw` whether they intend to resolve the situation by ceasing to use RAID1 or by replacing the failed drive. That leaves users with the opportunity to do neither and instead make the situation worse.
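For concreteness, the two recovery paths look roughly like this; device names, the devid, and the mount point are placeholders, and exact flags can vary with kernel and btrfs-progs versions:

```
# mount the degraded filesystem writeable (surviving drive assumed to be /dev/sdb)
mount -o degraded,rw /dev/sdb /mnt

# path 1: stop using RAID1 -- convert data to single and metadata to dup
btrfs balance start -dconvert=single -mconvert=dup /mnt

# path 2: replace the failed drive (assumed to be devid 1) with a new one
btrfs replace start 1 /dev/sdc /mnt
```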
Thanks for the explanation. I was hoping for a GitHub issue number (or Bugzilla or whatever) to easily track this bug, but perhaps the Btrfs dev team doesn't work with issue numbers?
At least for RAID1, it seems that implementing N-way mirroring would make it easier to recover from a failed drive.
In case of drive failure, we could use the remaining drive in read-only mode to copy the data to a new drive, creating a RAID1 array with two working drives and one failed drive.
The OS should then allow booting in rw mode, and from there it would be easy to remove the failed drive from the RAID1 array.
However it seems that RAID1 N-way mirroring (with N > 2) is not even on the roadmap at this moment.
Have I misunderstood something, or does this approach make sense?
You can do RAID1 with more than two drives, but you'll only get two copies of each chunk of data. In this scenario, when one drive dies you can still write new data in RAID1 to the remaining space on the surviving drives, so mounting the FS writeable in degraded mode doesn't risk leaving the FS in a state where the safety is hard to determine on the next mount. If space permits, you can also rebalance before even shutting down to remove the failed drive, also avoiding the corner case.
Being able to do N-way mirroring with three or more copies of the data would be nice, but it's not necessary; 2-way mirroring across 3 or more drives is sufficient, and the hot spare feature will be more widely useful.
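For what it's worth, 2-way mirroring across three drives is just the ordinary raid1 profile created on (or grown onto) three devices. A minimal sketch, with placeholder device names and mount point:

```
# create a filesystem with two copies of everything spread across three drives
mkfs.btrfs -d raid1 -m raid1 /dev/sda /dev/sdb /dev/sdc

# or grow an existing two-drive RAID1 by adding a third drive and rebalancing
btrfs device add /dev/sdc /mnt
btrfs balance start /mnt
```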
I was referring to this sequence of events:
1) 2-way mirroring across 2 drives
2) one drive fails
3) buy and plug a new drive
4) rebalance to have 3-way mirroring across 3 drives (with one being out): this is currently not possible
5) remove the failed drive, ending with 2-way mirroring across 2 drives
But it seems that you are referring to:
1) 2-way mirroring across 3 drives
2) one drive fails
3) rebalance to have 2-way mirroring across the 2 working drives
4) remove the failed drive, ending with 2-way mirroring across 2 drives
I assume that people don't/won't start the initial RAID1 with 3 drives.
Anyway, I would find 3-way mirroring across 3 drives very useful, as it gives a simple, identical, foolproof process for replacing a faulty hard drive, whether it has just some corrupted (but still readable) data or has completely failed: just plug in a new drive, rebalance, reboot, and remove the defective drive.
> rebalance to have 3-way mirroring across 3 drives (with one being out): this is currently not possible
I'm not sure this even has meaning. But anyway, it's probably pointless to try to kick off a rebalance when the FS is still trying to use a dead drive. Either use the device replace command (which isn't stable yet), or tell btrfs to delete the dead drive and then add the replacement drive. If the problem drive is failing but not completely dead yet, the device replace command is supposed to move data over with a minimum of excess changes to drives other than the ones being removed and added. But the device replace command doesn't properly handle drives with bad sectors yet, so the separate remove and add actions are more reliable, albeit slower, and put more work on the other drives in the array.
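Roughly, with placeholder device names, devid, and mount point (and assuming the FS is already mounted degraded), the two approaches look like:

```
# approach 1: device replace (missing/failing drive assumed to be devid 2)
btrfs replace start 2 /dev/sdd /mnt

# approach 2: separate add and delete
# (on a two-device RAID1 the add usually has to come before the delete,
#  because raid1 can't go below two devices)
btrfs device add /dev/sdd /mnt
btrfs device delete missing /mnt
btrfs balance start /mnt   # optional: spread chunks evenly across the drives
```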