> it's ZFS backed, screwing up the RAID is going to be pretty close to impossible
Even if it's ZFS, it can get quite messed up if you set it up using sda, sdb, and similar naming conventions; in that case entire zpools can fail to import after a failure.
Therefore the best practice is to use /dev/disk/by-id paths, so that if a device like sda drops out, the rest of your array still resolves correctly.
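To make that concrete, here's a minimal sketch of the by-id approach on Linux. The pool name ("tank") and device IDs are placeholders; substitute whatever `ls -l /dev/disk/by-id/` shows for your drives:

```sh
# Hypothetical pool name and device IDs.
# Create the pool against stable by-id paths rather than sdX names:
zpool create tank raidz2 \
  /dev/disk/by-id/ata-WDC_WD40EFRX_SERIAL1 \
  /dev/disk/by-id/ata-WDC_WD40EFRX_SERIAL2 \
  /dev/disk/by-id/ata-WDC_WD40EFRX_SERIAL3 \
  /dev/disk/by-id/ata-WDC_WD40EFRX_SERIAL4

# An existing pool that was created with sdX names can be re-imported
# so that it records the by-id paths instead:
zpool export tank
zpool import -d /dev/disk/by-id tank
```

The export/import with `-d /dev/disk/by-id` is the usual way to get an existing sdX-based pool to remember the stable paths going forward.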
And how the array is created is entirely a FreeNAS thing. Hopefully one which has, as OP says, improved.
As soon as I get home, I'm going to swap some of my HDD cables and see what happens... I've basically been assuming that NAS4Free uses device-by-id like any sane OS should.
NAS4Free and FreeNAS are just relabeled FreeBSD with GUI stuff pasted on. FreeBSD doesn't have /dev/disk/by-{id,label,partuuid,path,uuid}. It's its biggest failing IMO. Their /dev/diskid is NOT what it sounds like.
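For what it's worth, the usual workaround on FreeBSD is to put GPT labels on the disks with gpart and build the pool from /dev/gpt/&lt;label&gt;. A rough sketch, with the device name (ada0), the labels, and the pool name all hypothetical:

```sh
# Hypothetical device and label names. Label each disk once,
# then build the pool from the stable /dev/gpt/<label> paths.
gpart create -s gpt ada0
gpart add -t freebsd-zfs -a 1m -l bay0 ada0   # appears as /dev/gpt/bay0

# Repeat for the other disks, then:
zpool create tank raidz2 /dev/gpt/bay0 /dev/gpt/bay1 /dev/gpt/bay2 /dev/gpt/bay3
```

The labels follow the disks no matter which port or controller they end up on, which is roughly what by-id gives you on Linux.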
That said, ZFS pools under FreeBSD and derivatives, just as under Linux, are imported from info stored on the drives themselves. They should survive a reboot after scrambling the cables. It would take more nerve, or craziness, than I've got to PROVE that using any pools I care about.
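If someone does want to prove it on a pool they don't care about, the test would look roughly like this (the pool name "tank" is a placeholder):

```sh
zpool export tank   # cleanly export before powering off
# ...power down, swap the drive cables around, power up...
zpool import        # with no arguments, scans attached disks for pool labels
zpool import tank   # imports using the metadata found on the drives
zpool status tank   # vdevs should all be ONLINE, whatever names they got this boot
```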
However, I have completely screwed up a zpool replace command in FreeBSD following a bad drive in a RAID-Z3 pool. It dutifully replaced a perfectly good drive with my new drive. But no harm was done to the pool. After it finished, I ran another zpool replace, replacing the bad drive with the "mistake" old drive. It all worked!
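For anyone following along, the command in question is zpool replace, and the lesson is to double-check the device mapping first. A sketch with hypothetical pool and device names:

```sh
# Double-check which physical disk is which before replacing, since
# ZFS will happily replace whatever device you point it at:
zpool status -v tank              # note the name/GUID of the FAULTED disk
zpool replace tank da3 da8        # replace the (hypothetical) bad da3 with new da8
zpool status tank                 # watch the resilver progress
```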
ZFS is so brilliant, be careful or it will put your eyes out :-)
I used ZFS in two production deployments a few years ago on Solaris, and twice hit a bug that totally corrupted the superblock on each drive, meaning that all data was gone with no way to recover.
Now I'm back to HW RAID, LVM, and ext4, which works a treat, and I'll be using that until I have more faith in ZFS.
At least I know what my options are to recover from that. When I tried to resolve my ZFS issues, no tools existed to help recover, because apparently it wasn't a failure mode that had been considered.
You had incredibly bad luck then unless you weren't running Solaris on Sun hardware, in which case you may have shot yourself in the foot with a poor disk controller and/or no ECC.
Is it not obvious that you can't pull just any hardware out of the bin and expect ZFS to be OK with it? It has requirements, just like many modern OS features. Run virtualization without any CPU extensions to accelerate it and then tell me you're upset when your VMs kill your host with excessive resource consumption.
You can lose the system disk (USB key) that holds your settings, but the data you actually care about will be safe.
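Roughly speaking (assuming a data pool named "tank"), recovery after reinstalling to a fresh USB stick is just a re-import, since everything ZFS needs lives in the labels on the data disks:

```sh
zpool import          # list pools found on the attached drives
zpool import -f tank  # -f because the pool was last used by the old install
zfs list -r tank      # datasets and data are all still there
```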