I don't know if it's possible with FreeNAS, but on an ordinary Debian box nothing stops you running ZFS on LUKS encrypted devices. I've been using this configuration without issues for about a year on a home file server.
Storage is a really weak area of mine, but it is important, and I do not see much technical discussion here anyway, so I am going to expose myself:
How does integrity protection work in practice? I know that the bits on the HDD itself do not map one to one to bits you can actually use, and that the drive uses the extra bits to protect us from flipped bits.
I know that beyond that RAID/RAIDZ is supposed to help. I guess redundancy alone does not help: two copies would not help to determine which bit is the correct one, you would need three. Once you have these, I guess you would be able to detect corruption. Am I guessing correctly that duplication beyond 3 copies would just help with read speed and in the case when a drive fails while one is already dead (e.g. when you are rebuilding)?
What then? Does RAID/ZFS automatically fix it? I imagine at some point the drive would have to be replaced, do you have to check that manually or does it detect that by itself? I imagine after that you would have to run a rebuild.
So I guess my question would be: could I put a FreeNAS box in the corner of a room and leave it there? Would it just blink when I need to put a new drive in, or would it need more maintenance? Of course I'm talking about worst case scenarios here, so say I want this to live for 10 years sitting in the corner.
RAID generally does not protect against bit-rot (i.e. undetected errors); it only protects against detectable errors / catastrophic disk failure.
ZFS does protect against undetected (at the disk level) errors. To a first approximation, it does this by keeping block checksums alongside pointers so that it can verify that a block has not been changed since it was referenced, by keeping multiple checksummed root blocks, and by having enough redundancy to reconstruct blocks that fail their checksums. Naturally, there are information-theoretical limits to the number of corruptions that may occur for detection/correction to be guaranteed.
You should refer to the relevant wikipedia pages if you want more detail than that :p
edit: To answer your actual question, yes, you should be able to leave a ZFS box in the corner unattended for years with reasonable confidence that your data is safe and uncorrupted (tune redundancy settings to your taste on the space efficiency vs. error tolerance trade-off). Two caveats: 1) the machine must have an effective means of communicating a catastrophic disk failure for you to resolve (this should hold for any made-for-purpose NAS device, but you'll need to do some work if you're DIY). 2) ZFS does not actively patrol for corruptions, it fixes problems when it encounters them. If your data is not being periodically accessed, there will need to be some provision for walking the filesystem and correcting any accumulated errors (the ZFS scrub util exists for this purpose, but it has to be used)
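For what it's worth, scrubbing is easy to automate; the pool name "tank" below is just a placeholder:

    zpool scrub tank                  # walk the whole pool, verify checksums, repair from redundancy
    zpool status tank                 # shows scrub progress and results
    # e.g. a monthly cron entry:
    0 3 1 * * /sbin/zpool scrub tank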
One thing I didn't consider until just now (sigh) is that if you're using ZFS as a backup medium (I am), unless your source drive is also ZFS (mine isn't), you're still exposed to the same bit-rot your source drive is, since that change would then get backed up to ZFS.
> Naturally, there are information-theoretical limits to the number of corruptions that may occur for detection/correction to be guaranteed.
To my practical point: Does it tell me, when it approaches that limit or do I have to put in more maintenance? Can it be fixed by swapping one of the drives?
If you are using ZFS, the flipped bit (or "bit rot") problem is completely solved. You need never give it another thought.
You still need to worry about failing drives and about the integrity of your raidz arrays (or whatever), but that has nothing to do with the flipping bits.
That being said, you can see statistics about error corrections (which should typically be near-zero), and if you see a lot of them it might be advance warning of a drive dying. But the actual bit errors themselves would not be a problem and you would not need to take any action specifically related to them.
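If you're curious, the counters are right there in the status output (pool name is an example):

    zpool status -v tank    # per-device READ/WRITE/CKSUM error counters
    zpool status -x         # prints "all pools are healthy" when there's nothing to worry about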
The limits are on the number of simultaneous failures. Data is safe provided that too many errors do not accumulate before they can be detected and corrected. This is a fundamental fact: no real storage system can tolerate an unbounded number of simultaneous errors without an unbounded amount of space for replicas. You can control the number of allowable simultaneous errors by tuning redundancy settings (the trade-off is space efficiency vs. probability of data loss). It is straightforward to put in place an automated process to guarantee that errors are detected within some finite period of time.
> Two copies would not help to determine which bit is the correct one, you would need three. Once you have these, I guess you would be able to detect corruption. Am I guessing correct that duplication beyond 3 times would just help with read speed and in the case when a drive fails while one is already dead (e.g. when you are rebuilding)?
You're confusing two topics: integrity vs. availability. 3x copies, RAID4 XOR, RAIDZ2 (ZFS's RAID6 analogue), etc. are about increasing the availability of your data in the face of device failure/partitions.
The integrity of your data is computed in different ways in different systems, but you can safely think of it as a pile of checksums stacked on top of each other. There is a half-decent blog entry[0] on Oracle's website, but you can mentally model it as having a checksum on disk for each block, which is then combined with other blocks into some larger object that is also checksummed, and on and on until you have "end-to-end integrity". It is a feature of all serious storage systems.
I'd love it if someone who is more familiar with ZFS could respond with how ZFS's internals handle this. I worked at a direct competitor of Sun's, and we sued each other over this stuff, so I actively tried not to learn about ZFS. I regret this.
It seems that you might have edited your penultimate sentence, but just to make sure that the record is unequivocally clear on this: NetApp and Sun didn't simply "sue each other", NetApp initiated patent litigation against Sun in East Texas, and Sun countersued as a defensive maneuver.[1]
As for your ZFS question, Bonwick's blog is certainly a good source -- though if one is looking for a thorough treatment, I might recommend "Reliability Analysis of ZFS" by Asim Kadav and Abhishek Rajimwale.[2]
I did not edit that part IIRC, I try not to edit anything substantially that I post online.
As far as NetApp v. Sun, it was very unfortunate. I could rant for days about how crappy the business reasons were for NetApp going after Sun (zomg coraid!).
I use your quote about Oracle being a lawn mower from your usenix presentation all the time. Thank you for your reply.
So many levels to this. Drives themselves remap sectors to handle bad sectors, and some have integrated ECC bits... there are also things at the sensor level to help recover from errors. Once you are at the block level, you have simple parity checking with some redundancy models, but the good ones use ECC bits. Usually this is done at the block device layer, but ZFS and some other filesystems do this at the filesystem layer (sometimes you have multiple layers of this). You're right that beyond a few copies, the main win tends to be read speed, but there is one other factor: surviving rebuilds. As drives get larger, the failure rate is such that the chance of failure during a rebuild gets disturbingly high, so you need additional redundancy to avoid catastrophic failure during recovery.
As for "automatically fix it", the short answer is there is a lot of stuff that automatically fixes problems, but it is a leaky abstraction. RAID-5 rebuilds are notoriously terrible for performance, and often it is easier to have logic for dealing with failures & redundancy at the application layer.
FreeNAS and similar projects are definitely intended to be turnkey storage solutions. They have their strengths & weaknesses, but the notion that you just plug it in and go isn't too far off. Usually you don't go with the blinky light, but with an alerting mechanism (e-mail, SMS, whatever) that you integrate with it for notifications about problems. In principle, it is a fire & forget kind of thing.
An HDD/SSD without ECC will not work, and hasn't worked for a long time now. Many of the reads performed will require some ECC use.
ECC is always written alongside the actual block, and the overhead for ECC is the reason for the move from 512-byte sectors to 4K sectors in HDDs. For SSDs the data is already written in different block sizes depending on the NAND and the internal representation, and ECC is done for units larger than 512 bytes.
The probability of failure during rebuild is not really directly linked to drive size, the usual interpretation of the drive BER is wrong (media BER is stated across a large population of drives rather than just one drive).
> An HDD/SSD without ECC will not work, and wouldn't have worked for a long time now. Many of the reads performed will require some ECC use.
Yeah, I expressed that badly. I always forget about the ECC bytes that are in the firmware.
> The probability of failure during rebuild is not really directly linked to drive size, the usual interpretation of the drive BER is wrong (media BER is stated across a large population of drives rather than just one drive).
Regardless of interpretations of BER, I can't agree about drive sizes. The phenomenon of failures during rebuild is well documented and the driving principle behind double-parity RAID. Adam Leventhal (who ought to know this stuff better than either of us) wrote a paper several years back on the need for triple-parity RAID, and it was entirely driven by increased drive densities: http://queue.acm.org/detail.cfm?id=1670144
The reality is the higher drive densities mean you lose more bytes at a time when you have a drive failure, and that means more bytes you want to have "recovered".
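For reference, the naive spec-sheet arithmetic behind the double-parity pitch is roughly this (it leans on the per-drive reading of the BER that you're disputing):

    # chance of at least one URE while reading an 8 TB drive end to end at <1 error per 1e14 bits
    awk 'BEGIN { bits = 8e12 * 8; ber = 1e-14; printf "%.0f%%\n", 100 * (1 - exp(bits * log(1 - ber))) }'

That comes out to roughly 47%, which is the scary headline number; whether that's a fair reading of the spec is exactly what's in dispute here.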
Adam Leventhal uses the wrong interpretation of the BER value, so I wouldn't take his words at face value. Adam's whole argument rests on the disk BER, not on density.
I do assume, though, that the RAID array does a periodic media scrub (BMS); if you don't, you are at risk anyway, and I'd call that negligence in maintaining your RAID.
If you do use scrubbing, the risk that another drive has a bad media spot is low, since it must have developed in the time since the last scan, and that is a bounded time (a week, two, four), so the risk of two drives having a bad sector is now even lower (though never zero; backups are still a thing). If you couple that with TLER and proper media handling, instead of dropping the disk on a media error, the risk to the data becomes very low, since it isn't very likely that two disks will have a bad sector in the same stripe.
I've been working with HDDs and SSDs and developing software for enterprise storage systems for a number of years now. I worked at XIV, and across all the thousands of systems and hundreds of thousands of disks of many models, I've never seen two disks fail to read the same stripe; RAID recovery was always possible. Other problems are more likely to bite you sooner than an actual RAID failure (a technician shutting down the system by pressing the UPS buttons, or a software bug).
I did learn of several failure modes that can increase the risk but they depend on specific workloads that are not generally applicable. One of those is that if you write to one track all too often you may affect nearby tracks and if the workload is high enough you don't give the disk the time to fix this (the disk tracks this failure mode and will workaround it in idle time). In such a case same stripe can be affected in multiple disks and the time to develop may (or may not) be shorter than the media scrub time. And even then a thin provisioned raid structure would reduce the risk of this failure mode and giving disks some rest time (even just a few seconds) would allow the drive to fix this and other problems it knows about.
So why would one ever bother with RAID-6 (or its moral equivalent with RAID-Z)? I've yet to hear someone justify it from a requirement being to actually survive two drives failing simultaneously.
The main problem is not two drives failing simultaneously but rather one failing shortly after the other. The more drives you have, the more that likelihood increases; the rebuild time also increases with larger drives, so you are exposed to a second disk failure for longer. If you look at the MTTF of the drive (or the mostly equivalent probability of failure), the risk increases because MTTF does not increase with size at all; it is roughly constant, at least as specified by the vendors (HDDs are usually rated at 1.2M hours). The more drives you have, the more likely one of them is to fail, and once one fails, since the rebuild time increases with size, your chance of a failure during the rebuild increases as well.
Some systems mitigate that by having a more evenly distributed RAID such that the rebuild time doesn't increase that much as the drive size increases and is actually rather low. XIV systems are like that.
> The more drives you have the more likely one of them to have a failure and once one fails and since the rebuild time increases with size your chance of failure during rebuild increases as well.
That was exactly what I was saying. Given the same sustained transfer rate, the bigger the drive, the longer the rebuild time, hence the greater the chance you'll have a failure during the rebuild. While you might think it would, throughput has not grown to match the increases in bit density & storage capacity.
SSDs have helped a bit in this area because their failure profiles are different and under the covers they are kind of like a giant array of mini devices, but AFAIK they still present challenges during RAID rebuilds.
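To put a rough number on the rebuild-time point, assuming a drive that rebuilds at a sustained ~150 MB/s (just an assumption, not a spec):

    awk 'BEGIN { printf "%.1f hours\n", 8e12 / 150e6 / 3600 }'   # 8 TB at ~150 MB/s

That's about 15 hours of degraded operation for a single 8 TB drive, and the window only grows with capacity.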
Yes, the two key things: it's a population of drives (per model/version possibly, but ultimately the sample size isn't public), and the rate is generally a maximum, i.e. the spec uses a "less than" symbol with the rate. Therefore it's wrong to say that every x bytes you should expect an unrecoverable read error. We have no idea how good the drives actually are; the reporting is in orders of magnitude anyway, so a drive could be 8 or 9 times more reliable than the reported rate, or even multiple orders of magnitude better. It's not untrue to say < 1 URE in 10^14 bits while the population actually experiences 1 URE in 10^16 bits. Why not advertise that? Well, maybe there's a product that costs a little more that advertises 1 URE in 10^15 bits, and other product classes that promise better.
Give your computer ECC memory. Configure ZFS to scrub data regularly. It can be configured to email you of impending (SMART) or actual failure. You can label drives by ID/SN, or if your chassis or controller supports it, it can blink to tell you what drive to replace.
It can be configured with a hot spare so the "resilver" starts immediately on failure detection, not whenever you happen to get around to it.
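Adding the spare is a one-liner; pool and device names here are just examples:

    zpool add tank spare da4    # da4 sits idle until a disk in "tank" faults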
Let me add that you really want to have a hot spare, and if you can't do that (e.g. no extra bay in the chassis), then you should not delay on replacing a failed disk. Have a spare disk on hand, and take the time to replace it ASAP.
We lost everything on a FreeNAS system once because we were alerted to a failed-disk error, but decided to wait until the next week to replace it. But then a second disk failed, and we lost the storage pool. Lesson learned!
But this was an operator error, not a system problem. FreeNAS has been tremendously stable and reliable for us.
Everyone makes that mistake once. It is an understated problem in building fault tolerant storage arrays.
We buy a batch of 20 drives all at the same time, and they are all the same manufacturer, model, size etc... Possibly even from the same batch or date of manufacture. Then we put them in continuous use in the same room, at the same temperature, in the same chassis. Finally they have an almost identical amount of reads/writes.
Then we act shocked that two drives fail within a short interval of each other :)
Just a couple of weeks ago, I upgraded my home storage system from 9 disks (4x3 raidz) from the same manufacturer with partially consecutive serials.
Now it's 18 disks (3x6 raidz2) from 3 different manufacturers and every vdev has 2 of each. And the vdevs are physically evenly spread throughout the case.
I sleep so much better. It was kind of a miracle the first setup survived the 4.5 years it did.
I've had a few drives fail on my ZFS for Linux fileserver and wondered why my hot spares weren't automatically kicking in, and this is why.
On Linux, if you don't use the zed script that's referenced in that Github issue above and just replace a failing drive manually, a hot spare is worse than useless, because you need to remove the hot spare from the array before you can use it with a manual replace operation.
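Concretely, without zed the manual dance looks something like this (pool/device names are examples):

    zpool remove tank sdd        # free the unused hot spare first
    zpool replace tank sdb sdd   # then use it to replace the failing disk
    zpool status tank            # watch the resilver progress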
> How does integrity protection work in practice? I know that the bits on the HDD itself do not map one to one to bits you can actually use and that it uses that to protect us from flipped bits.
Now that you have this "error-correcting checksum", where do you put it? RAID5 means you spread the error-correcting checksum across the disks.
If you have three disks, A, B, and C, you'll put the data on A & B and the checksum on C. This is RAID4 (which is almost never used).
RAID5 is much like RAID4, except you also cycle between the drives, so the checksum information is stored on A, B, or C. Sometimes the data is on A&B, sometimes it's on B&C, and sometimes it's on A&C.
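A toy version of the parity math, with single bytes standing in for whole blocks (bash arithmetic):

    A=0xA5; B=0x3C
    P=$(( A ^ B ))                                # what the parity disk stores
    printf 'recovered B = 0x%02X\n' $(( A ^ P ))  # prints 0x3C, the lost data

The same XOR trick works across any number of data disks, which is why one parity disk can cover a whole stripe but can only survive one failure at a time.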
-----------------------
> Could I put a FreeNas in a corner a room and leave it there? Would it just blink when I need to put a new drive in or would it need more maintenance? Of course talking about worst case scenarios here, so say I want this to live for 10 years sitting in the corner.
Maybe, maybe not. If all the drives fail in those 10 years, of course not (Hard Drive arms may lose lubricant. If the arms stop moving, you won't be able to read the data).
"Good practice" means that you want to boot up the ZFS box and run a "scrub" on it every few months, to ensure all the hard drives are actually working. If one fails, you replace it and rebuild all the checksums (or the data from the checksums).
RAID / ZFS isn't a magic bullet. They just buy you additional time: time where your rig begins to break but is still in a repairable position.
ZFS has more checksums everywhere to check for a few more cases than simple RAID5. But otherwise, the fundamentals remain the same. You need to regularly check for broken hard drives and then replace them before too many hard drives break.
---------
This also means that no single box can protect you from a natural disaster: fires, floods, earthquakes... these can destroy your backup all at once. If all hard drives fail at the same time, you lose your data.
If you intend to try this, I would recommend getting SpinRite at grc.com and running it a couple of times every year if you're going to be doing a lot of IO on it.
I run it on all my HDDs and it's kept them alive and running smoothly for years at a time.
When I built my home NAS there wasn't an off the shelf FreeNAS option and it was definitely a "research all the things, build your own system" with the huge caveat of "Did you put enough RAM in that?".
The 8-bay one is particularly good value, rivalling similar systems by QNAP, and I personally do have a QNAP and if I were to buy again today it would definitely be one of these FreeNAS boxes instead.
I think the generation 8 "HP Microserver" is a good system.
One limitation is the 16GB max ram, but given that it's for home use with (presumably) large files and a small number of simultaneous clients, it shouldn't be a problem - even if you use 4x 8 TB disks.
The key is to populate it with ECC ram which should be an absolute requirement for any ZFS system.
Just bought one of these for exactly this! Going with 4 3TB disks in raidz2 because it was the best price/TB I could find without taking some external 5TB drives out of their enclosures.
I was just about to post this. I run OMV at home, and haven't had any problems with it.
I've used both FreeNAS and OMV, and find the OMV community more welcoming and helpful than the FreeNAS one. The FreeNAS community has organized itself with the intent of focusing exclusively on a very specific, very corporate use case. If you don't fit their use case, you are bad and should feel bad. OMV, on the other hand, is much more open and willing to help people adapt it to their own needs.
With the notable exception of ZFS vs EXT4, the underlying software appears to be the same, which makes the community distinction even more cogent.
FreeNAS has worked well for me for two years on a dedicated home NAS server I built with a mini-ITX motherboard. I chose DIY NAS over a NAS appliance like Synology's, because I can readily obtain a replacement for any hardware that fails into the indefinite future.
I keep a copy of the FreeNAS's storage on a second DIY NAS running NAS4free, which is the original base from which FreeNAS was forked some years ago.
One popular thing I don't do with my NAS boxes is run any non-NAS services e.g. media streaming, etc. For me it feels like an unnecessary risk to important data.
Modern virtualization stuff like VT-d means you can pass through your hard drive controllers to a FreeNAS VM and have it work just the same as if it were running natively. In fact, you can just keep a backup of your FreeNAS configuration on a flash drive install, so if your Hypervisor or VM fall apart, then you just boot to the flash drive without a hitch.
Not to mention, the jails system in FreeNAS has been in flux in the past. It never quite worked well for me, so offloading the "app" stuff to a workhorse is what I did.
Another open source option is Synology's Disk Station Manager which includes a bunch of open source software they've skinned and created mobile apps for so you can localize stuff like music streaming, dropbox, notes etc - https://www.synology.com/en-global/dsm/live_demo
I cannot find the source code of DSM. Are you sure it is Open Source?
If it isn't Open Source this is off-topic, since there are a large number of other commercial NAS OSes, e.g. Thecus, SoftNAS, QNAP, Napp-It... Most of them tend to have an "app store" and some people building more or less working apps out of Open Source and free projects.
Your link is people complaining that the source isn't being released fast enough... and Synology has told them it will be put on SourceForge when they're ready. There seems to be a lot of precedent for that to go how they said it will.
There's also an independent community that facilitates installing and using it on your own hardware, including the latest version that's still waiting on the source dump.
Sure, no one needs that source code anyway because they are not adding anything of value. But if they are not complying with the GPL, they are legally not allowed to use the GPL'd projects.
Yes, on x86 systems it mimics a DS3615. I've seen it installed on cheap dell desktops, like an optiplex 755.
Works well, with some limitations. I don't think their automatic port forwarding service (QuickConnect) works well, since that service checks for a valid Synology MAC address.
Additionally - you can't do the automatic system updates as it might break certain portions that XPEnology overrides.
The code base isn't open source. They modify the Linux kernel, so that part is available (or will be soon). XPEnology simply allows you to run their closed-source code on any computer.
Although I don't think I'd ever DIY another NAS box, I would suggest the following: use a common-chipset controller that exposes each drive/device separately with all its information (some don't). Also, prefer a system with ECC; Asus supports ECC on most AMD FX motherboards for the less expensive option there. Also, have 2 spare drives on hand.
I went back to Synology after my uper-nas crashed and burned (really bad run of 3TB Seagate drives, 9 out of 12 dead in under 3 years)... Now debating between 4x8TB or 5x6TB in a future NAS box.
Those 3TB Seagate drives are literally the worst in the business. The 2TB and 4TB are pretty good (roughly on par or better than competitors according to the backblaze reports) but those 3TB drives were utter shit for some reason, ~20% year over year failure rates.
Some retailers have even started to sell the 3TB Seagate drives at almost the same price as the 2TB ones. Probably the only way to get rid of them.
Please remember that Backblaze was cracking open external hard drives in that case. Also, at the time, one of the large HDD manufacturers lost their main build plants. So who knows where Seagate got their components from in that 1-2 year period. Given that Seagate's 4TB hard drives are considered by Backblaze to be the best HDDs, I find it a little odd that everyone assumes that manufacturing issues 4 years ago affect drives being built today.
Also, Backblaze statistics only apply if you're operating at Backblaze's scale. If you're making a home NAS you would have to be _incredibly_ unlucky to hit a bad drive as quickly as Backblaze did.
I've anecdotally experienced very high failure rates on them too, had 3/6 fail in about 5 years. And many other anecdotal reports indicate similarly bad results. Maybe not as bad as Backblaze but pretty damn bad.
I absolutely agree though, the 4TB are a great improvement, you can't beat them for reliability unless you're really willing to pay for it. I've chosen them for my personal NAS and some servers even with great results so far.
Well that's reassuring... I just built my home NAS. 3 3TB WD Reds and 3 3TB Seagate NAS drives. Two of the Seagates were DOA so I got them replaced. I ran full hard drive tests on them without problems but now I'm extra paranoid.
That's what RAID is for, just make sure you scrub it often enough to catch any potential failure. If you're ever looking for replacements, those 4 TB Seagates are hard to beat without paying enterprise-level prices, despite how much of a turn-off the high failure rate on their 3TB drives may be.
I wanted to do something like this, but after reading about the failure rates of drives and raid, once one disk fails, it seems like you're fucked. Anybody got some stories or numbers that prove that while repairing my raid for one disk failure I won't just destroy the rest of my drives?
Most raid failures are the fault of the user not understanding how raid works and leaving their raid improperly managed.
First, buy good drives. Buying desktop drives that suck is a great way to end up with a corrupt SAN. In general you want drives that support TLER, so if something goes wrong you hear about it right away, rather than the drive trying to fix the problem silently while causing long delays. You don't have to go super-expensive enterprise, but buying from the bottom of the barrel, or a 'green'/efficiency-focused drive, is a great way to lose data.
Second, no matter if you are using hardware or software raid, set up the monitoring utilities properly. You need an immediate alert if a drive is going bad or has failed. SMART should be enabled and when it alerts, replace the drive. The software should also do a verify, patrol read, or scrub (different terms for the same thing) that occasionally check the surface media of the disk and the validity of the data. If anything comes back with an error, you replace the disk right then.
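With smartmontools, for instance, a single smartd.conf line along these lines gets you the email alerts (device and address are placeholders):

    /dev/ada0 -a -m admin@example.com    # run all SMART checks on this drive, mail on any trouble

FreeNAS exposes much the same thing through its S.M.A.R.T. service and alert settings in the GUI.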
Lastly, one of the big issues why people lose raids all at once is they go out and buy a huge stack of the same kind of disk, made on the same day, with the same firmware, and possibly the same inattentive QA person. Sometimes when problems crop up, it can be a systemic problem with that model, and when all you have is one model, bye bye everything.
> I wanted to do something like this, but after reading about the failure rates of drives and raid, once one disk fails, it seems like you're fucked. Anybody got some stories or numbers that prove that while repairing my raid for one disk failure I won't just destroy the rest of my drives?
Yeah, I've recovered several failed drives without issue, even on LVM systems where some partitions failed because they were striped but others worked because they were mirrored. No real issues overall.
The only case where you might run into problems is when you get conflicting data between 2 drives, but I've never seen that happen in the real world. Some people will scare you into thinking that happens often but it hasn't been my experience. Often one drive just goes kaput completely or has an entire section which is completely unreadable or extremely slow, for which recovery is just a matter of marking that drive offline, swapping a new drive in and telling it to rebuild. The only case in which you lose everything is if you use RAID0. If you can do RAID10 that's the way to go IMO.
RAID5 can be a little riskier because if one drive is damaged and one fails then you can wind up losing the bunch, but I use it on my personal NAS because it's mostly just TV and the important stuff is syncthing'd to my other machines. Regular scrubbing prevents these issues usually.
RAID6 is pretty safe. If you're using ZFS that's a RAIDZ2, you can lose 2 drives in a volume.
RAID5 can be pretty risky with large drives that are the same type.
>conflicting data between 2 drives, but I've never seen that happen in the real world
Heh, run enough servers and you'll see everything eventually. Had a really fun one where a hardware RAID card was telling us that writes were successful, but the data was coming back corrupt. It was writing bad data to two of the drives in the array.
I've got a FreeNAS box which has endured two disk failures no problem. I have scheduled weekly scrubs, which I'm hoping find bad blocks ahead of a failure-driven rebuild.
I don't know if this has been mentioned yet in the conversation, but it's very important to know that raid is _not_ a backup. You still need some kind of backup solution even with a RAID system.
if you want to learn more about this topic, come on over to www.reddit.com/r/datahoarder and read and ask about it. We're friendly.
If you have a way to deal with the noise then getting a 2nd gen rack server off ebay is a good way to go. At the least you'll get more ram and hard drive bays for your money.
I had a similar bad run with Seagate 3TB drives: 3 out of 4 in 2 years. I was really thinking it was the NAS, but switched to WD Reds and have had no issues in 18 months.
That isn't to say that reds/greens don't have issues... I've seen one of each fail, but they have been very solid compared to my experience with Seagate drives... the shame is that may very well be different today, but brand (dis)loyalty tends to carry for longer terms.
Correct me if I'm wrong (please!) but FreeNAS still can't (easily?) do what Synology's 'Hybrid RAID' (basically sharded RAID across drives) facilitates; growing your (redundant, e.g. 1 or more parity drives) volume by adding more drives, and also growing your volume by replacing drives with larger ones.
I've always thought that was essential; it bemuses me that there aren't many solutions out there that help you do this. I guess everyone must have loads of disposable income to buy 8 drives at once.
I'm not sure about FreeNAS, but I do want to point out this is possible in the btrfs filesystem (sharded raid across drives). I'm not aware of any FreeNAS equivalents though utilizing it, but rolling your own is certainly an option.
I've been using Unraid for a few months. It has a different - arguably inferior - parity strategy but has amazing support for VMs. I've got my gaming machine, my GPU dev box and a bunch of smaller services all in the same box and can do some really sweet online configuration.
How is it for gaming? I'd imagine it's using a VM solution (Xen HVM or KVM so it can do IOMMU?). I've been avoiding it because I figure that VM solution would get detected by anti-cheat in many games (particularly the nasty kernel level ones like EasyAntiCheat, GameGuard, etc).
I have 8x 2.8 GHz Xeon cores, 32 GB of RAM, and a GTX 970 in my gaming VM. Works perfectly for DOOM, Witcher 3, all at 1920x1080 on high settings (maybe not full but damn pretty). Consensus on the forum seems to be you lose low single-digit performance.
It's a KVM hypervisor and I haven't played online, why would they ban you for using a VM?
A VM makes it possible to arbitrarily read and modify memory from outside of the operating system in a way that the anti-cheat can't otherwise detect (heck, some VMs have full out integrated debuggers), so anti-cheat and DRM schemes used by games may disallow VMs.
Interesting... Just last week I started building a much simpler/smaller setup based also on ZFS for personal use:
- rpi 3 (running debian testing)
- 2x 1TB disks (with USB docking)
- zfs (raidz) on the disks
- ssh through tor for tunneling purposes
I can backup my personal computers and run some extra stuff on the rpi. Yes, it's only USB2, but I don't care. I mostly send incremental snapshots from my computer and it's fast enough. (e.g. zfs send -i yesterday home/user@today | ssh rpi zfs recv home/user)
I'm tempted on building a small recv server that can allow partial transfers, but so far it works for me. Also, as soon as encryption is enabled on zfs-linux, I'm turning that on :)
Total budget: ~$180 (the most expensive part were obviously the HDDs)
These offerings work well as a pure NAS - your data is backed up and available.
Where they fall down for me as a home user is in their inability to offer other services. For example, photo sharing. I have a fast home internet connection and all my photos are already on the NAS. But the photo sharing capabilities are very, very weak; so inevitably I have to use another service (ie Google Photos, Dropbox, Flickr) if I want to make those photos available to myself and others while away.
The same goes for other forms of media sharing, self-hosted email, etc.. The boxes offer these things - and the appeal is obvious to the end-user - but the potency is so weak as to be essentially unusable.
I run FreeNAS on a HP Microserver, got 4 3TB HGST drives, and a stick of ECC memory to bump the box to 10GB.
Given that the OS needs to run somewhere, and one would rather use the 4 bays for drives in the zpool, I used the recommended method of running the OS from a USB drive. Sounded strange to me, but it's worked a treat. I saw that there are kits out there for replacing the empty CD slot in the Microserver with an SSD for the same purpose, but I'd rather not touch something that's working now.
I went for RAIDZ1, despite a lot of internet protestations about it, as I basically only store movies/music/Time Machine backup there anyway.
Been thinking about backup solutions for everything, but it all seems fairly expensive for what is essentially just backing up media.
To people recommending Plex for media streaming, I would actually recommend Serviio over it. It took a little extra setup in a jail, requiring some extra port downloads, but it works extremely well in its DLNA streaming to my PS3. Where it shone brighter than Plex for me was its support for transcoding and subtitles: it transcodes and shows all of my H264 video files with ASS subtitles with all of the style intact. It also supported PGS subtitles, but my piddling processor didn't quite match up to it.
It's funny that we've come so far in computing and yet reliable data storage is still such an unsolved problem for 99.9% of the world's consumers.
All the major solutions have their drawbacks. For example, btrfs' present-day RAID code is in such bad shape that apparently it needs a total rewrite. ZFS volumes can't grow without a resilver, which is not something a home user should ever have to do.
From a design perspective, the internals of both btrfs and zfs are only accessible to experts and totally opaque to their users. I've long thought someone could come along and write a FUSE-based distributed storage layer that sat on top of existing file system abstractions like ext4, etc. And that a system built that way could just crush both of those solutions in almost every conceivable way. I've toyed with writing it myself.
This approach might not be the fastest, but it would be the most flexible, possibly very reliable, and as a bonus it would be really easy for average users to grok and maximize their data recovery from all but the most catastrophic events.
"Reliable data storage" is easy. Engrave the data into a metal plate, or chisel it into rocks. It's when you start adding things like "huge amounts of data" and "readable by computer" and "able to stream in real-time" and "interoperate with multiple devices" and "fits in your hand" and and and...
Hyperbole, yes, but the point is that the devil is in the details, and there's a hell of a lot of caveats to the term "reliable data storage". For example, in the modern era, if someone had a way to make ye olde floppy disks 100% reliable for 10 decades, they still wouldn't be used by that 99.9%, simply due to the low data capacity.
One thing that's been unclear to me with ZFS: How should I design a ZFS-based storage system for extensibility? From what I've read, it seems like you can't add devices to a pool once it's been created. I'd love to create a small-ish NAS right now to store, say, the top 10% of my blu-ray collection, and then incrementally upgrade it over time as I get funds.
The easiest way to do this is to use mirror vdevs, that way you can add disks two at a time. Of course, you miss out on RAIDZ functionality that way, but that's the simplest setup.
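In practice that looks like this (pool/device names are examples):

    zpool create tank mirror da0 da1    # start with a single mirror
    zpool add tank mirror da2 da3       # later, grow the pool by adding another mirror vdev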
I've gone ahead and actually created a new array using btrfs raid10 instead (with an eye towards a RAIDZ2/3 like configuration whenever the btrfs guys get around to fixing it); btrfs allows for online reshaping of the mirroring configuration, so it doesn't have the limitations of ZFS in this instance.
I'm still not sure what setup I have. I really should have written down every decision I've made while building my NAS (I think I've now found articles that disagree with every single choice I made)
$ zpool status
pool: BoxODisks
state: ONLINE
scan: scrub repaired 0 in 9h20m with 0 errors on Sun Jul 31 01:31:31 2016
config:
NAME                                            STATE     READ WRITE CKSUM
BoxODisks                                       ONLINE       0     0     0
  mirror-0                                      ONLINE       0     0     0
    gptid/18059e22-b6a4-11e5-9cca-0cc47a6bbf34  ONLINE       0     0     0
    gptid/194649b1-b6a4-11e5-9cca-0cc47a6bbf34  ONLINE       0     0     0
  mirror-1                                      ONLINE       0     0     0
    gptid/1a86b3cc-b6a4-11e5-9cca-0cc47a6bbf34  ONLINE       0     0     0
    gptid/1bcd3ad6-b6a4-11e5-9cca-0cc47a6bbf34  ONLINE       0     0     0
cache
  diskid/DISK-S24ZNWAG903847Lp1                 ONLINE       0     0     0
I think I have mirrored vdevs.
What does that mean for replacing failed disk/s? What does that mean when I want to upgrade to 8TB drives? The biggest "mistake" I made was buying a very expensive 4 bay enclosure - 4 drives is not enough.
When you want to upgrade, you can offline and replace each disk, then resilver, in turn. If the pool is set to autoexpand, you'll eventually get a total capacity of 16TB. You can do this for all the disks, or for each mirror separately in two groups of two.
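Roughly, with example device names:

    zpool set autoexpand=on tank
    zpool offline tank da1
    # swap in the bigger disk, then:
    zpool replace tank da1 da5   # or just "zpool replace tank da1" if the new disk takes the same slot
    zpool status tank            # wait for the resilver before touching the mirror's other half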
Unless you want to reshape your array. Then md-raid is superior. Btrfs's raid implementation has had some pretty bad issues recently and has been declared "experimental" as a result.
Apologies, I only saw the recent mailing list post which basically said "we're removing the ability to even compile support for raid56". I didn't mean to say that btrfs was overly-zealous in their trust of new code (I personally am super excited for the future of btrfs).
Yeah, I saw that before, but that isn't quite what I wanted, if I'm reading it right. That tells me how to replace all the drives in my array with new ones; I'd prefer to (for instance) start with 4x 4TB drives, and add new ones as my storage needs build up.
You can expand a pool one RAID array at a time, but growing one drive at a time isn't really possible except by not using RAID; ZFS doesn't have a true equivalent to the BTRFS rebalance operation.
It synchronizes directory trees on arbitrary filesystems, so I think the answer to your question would be "asynchronous".
Most individuals don't have TBs of working set continually being updated and requiring a central authoritative copy. They have a small working set and a long-term archive they'd just never want to lose.
Unison's model isn't perfect (eg having to create a star topology, lack of built-in inotify). But it's been around forever, is written in a sane language, and is rock solid.
I have bad memories of FreeNAS (before the split/new version) many years ago when I had a disk and power failure... hdb went, so hdc/hdd moved up a letter and that automatically screwed up the RAID...
... I was able to recover, but, I remember how many hours I spent on it.
FreeNAS is a rewrite compared to before the split. iXSystems got involved and helped move things along at a significantly better pace. They've contributed a lot.
A split caused by Olivier waking up one morning and announcing that freenas is dead because he wants to rewrite 0.8 on Linux because reasons. I switched back across to freebsd right about then as I lost all trust in the project (now NAS4Free).
Currently running FreeNAS for the last 2 years since last machine upgrade.
> it's ZFS backed, screwing up the RAID is going to be pretty close to impossible
Even if it's ZFS it can get quite messed up if you set it up based on sda, sdb, etc like naming conventions. In that case entire zpools can fail to import in case of failure.
Therefore the best practice is to use device-by-id, so that if a device like sda falls out, the rest of your array should still resolve correctly.
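On Linux that means creating or importing the pool against the stable names, e.g. (paths here are made up):

    zpool create tank raidz \
        /dev/disk/by-id/ata-EXAMPLE_DRIVE_SERIAL1 \
        /dev/disk/by-id/ata-EXAMPLE_DRIVE_SERIAL2 \
        /dev/disk/by-id/ata-EXAMPLE_DRIVE_SERIAL3
    # or re-import an existing pool so it resolves devices by id from now on:
    zpool export tank
    zpool import -d /dev/disk/by-id tank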
And how the array is created is entirely a FreeNAS thing. Hopefully one which has, like OP says, improved.
As soon as I get home, I'm going to swap some of my HDD cables and see what happens... I've basically been assuming that NAS4Free uses device-by-id like any sane OS should.
NAS4Free and FreeNAS are just relabeled FreeBSD with GUI stuff pasted on. FreeBSD doesn't have /dev/disk/by-{id,label,partuuid,path,uuid}. It's its biggest failing IMO. Their /dev/diskid is NOT what it sounds like.
That said, ZFS pools under FreeBSD and derivatives, just as under linux, are imported from info stored on the drives themselves. They should be able to reboot after scrambling the cables. It would take more nerve, or craziness, than I've got to PROVE that using any pools I care about.
However, I have completely screwed up a zpool replace command on FreeBSD following a bad drive in a RAID-Z3 pool. It dutifully replaced a perfectly good drive with my new drive. But no harm was done to the pool. After it finished, I ran another zpool replace, replacing the bad drive with the "mistake" old drive. It all worked!
ZFS is so brilliant, be careful or it will put your eyes out :-)
I used ZFS in two production instances a few years ago on Solaris, and had two instances of a bug that totally corrupted the superblock on each drive, meaning that all data was gone with no way to recover.
Now I'm back to HW RAID, LVM, and ext4, which works a treat, and I'll be using that until I have more faith in ZFS.
At least I know what my options are to recover from that. When I tried to resolve my ZFS issues, no tools existed to try help recover because it wasn't a failure mode that had been considered as an option apparently.
You had incredibly bad luck then unless you weren't running Solaris on Sun hardware, in which case you may have shot yourself in the foot with a poor disk controller and/or no ECC.
is it not obvious that you can't pull any hardware out of the bin and expect ZFS to be OK with it? it has requirements just like many modern OS features. Run virtualization without any CPU extensions to accelerate it and then tell me you're upset when your VMs kill your host with excessive resource consumption.
I have been using FreeNAS for a few years now. It has been great for me, both as a backup and a Plex server. I am really excited for 10.0 when it will have docker support instead of the current jails. It will be much better for some random tinkering too.
This is probably a really stupid question, but for what kinds of applications or scenarios are people using FreeNAS? I'm interested in hearing about regular users, not business or enterprises.
Upon seeing FreeNAS Mini, I have to say that I'm really interested. I've become increasingly uncomfortable with the idea of keeping all of my data "in the cloud". Being able to keep a local version of all my media would help give me peace of mind.
Is maintaining your own "personal cloud", if you will, a reasonable use-case for this? What are the proper expectations from a maintenance perspective? i.e. How often should I expect things to break or require tuning / fixing?
I use FreeBSD+ZFS rather than FreeNAS, but it's the same idea. I store all of my photos and videos, all of my music, and all of my movies. The music and movies are served via Plex and Sonos to various parts of the house, and I use Lightroom to browse the photos.
This is not the only copy of any of this stuff, though. The photos get synced to S3 and backed up on a second local hard drive. All of the movies I have are rips of DVDs and Blu-Rays I still own, and can use if needed. And my music is all in Apple Music/iTunes Match (and on my computer, backed up via both Time Machine and Crashplan).
So, I'm using the server really as convenient local storage for stuff, not as an alternative to the cloud.
I do love FreeNAS, at home and at the office too (for development purposes - I'd trust it for production but with our client base it's better to be more "corporate").
It has a good cross section of features from 'home' to 'pro', AFP/NFS/CIFS/iSCSI, directory integration, plugins (more aimed at home, with things like news downloaders, BT clients).
I never really understood the system requirements for FreeNAS. It seems like something that would work great on a small old system, but at 8GB of recommended RAM, the requirements are basically the opposite of a small old system.
Can anyone explain the reason for this? What makes the specs so hardcore?
It's just a recommendation. Remember, some people use FreeNAS professionally, so they're going to err on the side of recommending too much (= enterprise). Less will probably work fine, as long as you aren't running dedupe, compression, or heaps of jails and services. E.g. if you have 4 drives and just use the SMB service to connect to one or two machines, 4GiB is probably okay.
Having said that, RAM is pretty cheap. 8GiB isn't a huge ask for any machine in 2016. Any system with less than 4GiB, is it really worth bothering with redundancy/ZFS? UFS (which is far less RAM hungry) + nightly backups might be good enough. (Edit: of course, you miss out on the awesome ZFS features with UFS. It's never easy, is it?)
Simple: ZFS has memory requirements to operate efficiently. If you have less than 8GB of RAM, which is by the way about as small as it gets, then you probably don't have a lot of drives, and therefore won't benefit from ZFS or FreeNAS.
Kind of like when you don't need a full blown CRM to manage two customers.
I think NASes should become the things of the past. They are too fragile, too maintenance-greedy and don't scale enough for the modern world. I'd much rather see a tiny distro with ready to use Swift, Gluster, Ceph, whatever, but not ZFS (sorry ZFS fans).
EDIT: Let me explain a bit more: once you start running a distributed system you stop caring about OS, filesystem and disk level issues, because you have redundancy on another level. And it makes all the difference. You don't worry anymore, you can always just reboot, hard-reset or take a node out to investigate. Suddenly you realize that it's not a big deal even if some node starts freezing or some process starts OOMing and crashing, you don't care, you just let them.
If you don't want to care for the OS, who do you expect to keep it in check? Someone has to worry about it: you can try to abstract it away and minimize it, but it's never gone.
I see this line of thought that somehow self-hosted "clouds"/clusters look after themselves, but that's usually not the case.
You'd come close by buying QNAP or Synology hardware; they provide software updates (including the OS). Even buying software solutions like Unraid. I don't know how maintainable FreeNAS is for someone who doesn't want to worry about the OS, though.
> I see this line of thought that somehow self-hosted "clouds"/clusters look after themselves, but that's usually not the case.
They don't really look after themselves, but do handle failures on the highest possible level. Which makes it unnecessary to keep each OS in check. What's critical for a single NAS box is critical for a distributed storage only if all boxes of some replica have the same problem at the exact same time, otherwise you just reboot and move on and it doesn't matter if it happens again, it doesn't cause any downtime.
I actually speak from my own experience. I run and maintain a distributed key-value storage for many years. Although I designed and implemented it myself (and redesigned a bunch of times), I don't see how experience with those other distributed storages, like Swift, would be any different.
I use NAS4Free which is similar, on an HP Microserver N54L (it has ECC RAM) in a RAID-Z1 configuration and a cron job set up to zfs-scrub the drives monthly. If you're going to DIY a NAS, I can heartily recommend both NAS4Free and the HP Microserver.
Thank you for that - it was interesting at least, even if I've probably done it all wrong (TLDR - compression on ZFS definitely varies with the data you try to compress)
Fun fact: while researching this I found yet another article that contradicts the information I had when I built my zpool, and apparently I've done it all wrong (again).
In most modern systems the rate at which the CPU can compress data is many, many times faster than the disk can write data. In general compression greatly increases write speeds because less disk IO is needed. The vast majority of benchmarks show that to be the case too.
TL;DR: if you notice a performance decrease after enabling compression, something is wrong with your NAS/SAN.
That said, if you are just storing large encrypted files on your ZFS then compression isn't needed.
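Turning it on and checking what it buys you is just (dataset name is an example):

    zfs set compression=lz4 tank/data            # only newly written data gets compressed
    zfs get compression,compressratio tank/data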
Let's not get ahead of ourselves. This is only indicating that they aren't exposing any query filters in the GUI and searching for all objects from the root is overwhelming the GUI if you have a large AD environment.
May as well ignore the "RAID is dead" screams. They are only FUD. Do make sure to run the scrub periodically though, that's the real defense against rebuild failures.
I actually have only used Nas4Free. I know that FreeNAS is technically the fork (despite having the original name), but the technology IMO is quite solid.
It's important to know all of the competition, however. Here's a basic overview of the technologies:
* Nas4Free -- ZFS-based. Free as in beer and Free as in OSS. More barebones and simple than FreeNAS. As it is based on FreeBSD, you need to be somewhat careful about hardware choices, although in my experience FreeBSD seems to support hardware that I'm interested in.
* FreeNAS -- ZFS-based. Forked from Nas4Free and name-shenanigans happened. Newer web-gui and more plugins. Can't speak too much about it, since I haven't played with it.
-------
* Windows Storage Spaces -- Windows8 and up have ReFS + Storage Spaces as their ZFS-competitor. Runs a daemon in the background to automatically check for bitrot (unlike NAS4Free / FreeNAS where you need to schedule a cronjob). Comes as part of Windows, if building a dedicated system you need to pay the $100 Windows Tax. Best hardware compatibility available. No head-scratching about random AMD A10 / FM2+ motherboards with obscure drivers (FM2+ compatibility is not listed on FreeBSD yet)... you know everything has Windows compatibility.
Windows Storage Spaces are superior technologically to ZFS IMO. You can extend a storage space after building it, while ZFS volumes are locked to a specific size. (You can add more drives to a ZFS mirror, but this only increases reliability). Start with 2-hard drives and then extend the storage space to 6-drives later.
You can stripe data to increase a ZFS pool size, but this doesn't keep the same level of reliability. The Windows Storage Space methodology where you overpromise on storage size (and then later build out capacity) just seems to be an easier methodology to work with in the long term.
--------
ZFS has more features, but nobody uses them. ZFS supports dedup, but all documentation I've seen says its not worth it.
I guess one important feature ZFS has is that it supports L2ARC / ARC caching for SSD Acceleration.
Windows ReFS does not. ReFS is also not a complete solution. Parity is implemented at the "Storage Space" level, not at the filesystem level. I don't think this is a major downside, but it is important to note that ReFS + Storage Spaces is the complete solution. (Whereas ZFS stands alone)
Note: Snapshots (Called 'Shadow Copies' in Windows land) exist on NTFS.
--------
* Synology -- Out-of-the-box systems, usually built on Intel Atom. I find that the 2-disk options are cheap, but the 4-bay or 6-bay options are outrageously expensive. I can definitely build a cheaper WINDOWS system than most 4-bay Synology boxes.
Synology is mostly a soft-RAID setup. I don't see much on bitrot or other storage issues. I hope they handle it? But I'm not 100% sure.
>I guess one important feature ZFS has is that it supports L2ARC / ARC caching for SSD Acceleration.
Which can massively accelerate small file load times. Using this on a VM pool greatly speeds things up.
>Also ZFS has compression which can drastically reduce data usage.
Which in some virtual environments decreased our data usage by 50 times or more.
Windows Storage Spaces suck balls on speed. Using it in the Microsoft-recommended configurations to avoid data loss or corruption makes it even slower.
Oh, and just throwing files on ReFS and using it directly with services and such is a great way to get weird issues if you don't understand how the filesystem is different.
>while ZFS volumes are locked to a specific size
Specific number of disks. You increase the disk sizes and you can grow your RAID.
Both FreeNAS and NAS4Free are nothing more than FreeBSD with some GUI stuff pasted on. I am comfortable using ssh and [ba]sh, so I couldn't care less about the GUI. I just use the real thing: FreeBSD.
Warning: there is a well documented bug with the FreeNAS boot loader's interaction with Dell PowerEdge servers. Be prepared to spend hours getting it to install if that's your hardware.
Always remember to ask about the "HN Readers Discount".
[1] http://www.rsync.net/products/zfsintro.html
[2] http://arstechnica.com/information-technology/2015/12/rsync-...