APFS in Detail (dtrace.org)
237 points by knweiss on June 19, 2016 | 94 comments



What a great and valuable post, especially since this info is the result of talking to the APFS team at WWDC, and has not been published anywhere else yet.

Of particular interest (to me) was the "Checksums" section:

    Notably absent from the APFS intro talk was any mention of
    checksums....APFS checksums its own metadata but not user data.

    ...The APFS engineers I talked to cited strong ECC protection
    within Apple storage devices. Both flash SSDs and magnetic media
    HDDs use redundant data to detect and correct errors. The
    engineers contend that Apple devices basically don’t return
    bogus data. 
That is utterly disappointing. SSDs have internal checksums, sure, but there are so many different ways and different points at which a bit can be flipped.

It's hard for me to imagine a worse starting point to conceive a new filesystem than "let's assume our data storage devices are perfect, and never have any faulty components or firmware bugs".

ZFS has a lot of features, but data integrity is the feature.

I get that maybe a checksumming filesystem could conceivably be too computationally expensive for the little jewelry-computers Apple is into these days, but it's a terrible omission on something that is supposed to be the new filesystem for macOS.


I agree; no checksumming of user data is very disappointing. If there were performance issues, they could build checksumming into the filesystem, but make it a volume-specific option. No checksumming on the watch, strong integrity guarantees on the Mac Pro.

Their filesystem goals are in some ways consistent with Apple's (marketing) vision: users would never have terabyte libraries of anything, as the various iServices would (should) be hosting that stuff in the cloud (where one presumes it is stored on a filesystem that actually includes data integrity). Since users won't be storing much of anything locally, Apple needn't care too much about data integrity. This is, of course, nonsense.

The idea that Apple's storage devices are error-free is arrogant--but even assuming that were true, there can still be bit errors in the SATA/PCI bus, errors in memory, race conditions, gamma rays, etc. Apple uses ECC memory on their Mac Pro, so obviously someone still believes that sort of thing is possible.


I don't see why Apple couldn't just recommend that their pro users who have need of this sort of data integrity locally run their own server with FreeBSD + ZFS. Apple has really backed off on their attempts to market OS X Server to this crowd. Heck, they're probably using FreeBSD already if they need that much data integrity.


Here's the thing: everybody needs this sort of data integrity.

Literally nobody wants their files to be silently corrupted. ZFS made it much easier for (nerds like us) to attain very high levels of data integrity.

APFS was (and maybe still is?) a chance to make that the default for regular people.


Do checksums actually need to be in the filesystem, though? It does seem like an important feature, but couldn't they be done at a higher level, like the way Spotlight indexing works on the Mac today?


It isn't just pro users.

With TB file systems, assuming you haven't outsourced everything to iCloud, data integrity matters. If you have, now you're trusting them not to screw up, ever.

From the movie or mp3 that mysteriously no longer plays, through to more important things like business data or family photos. I suspect many people have experienced bit rot, even if they don't recognise it as such. We've even reached the point where, going by quoted drive error-rate figures, copying 2 TB from one drive to another will likely result in a bit flip (source: Ars' ZFS+btrfs article from a couple of years back).
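
For a sense of scale, a back-of-the-envelope sketch (the 1-in-10^14 unrecoverable-read-error spec below is the commonly quoted consumer-drive figure, an assumption rather than a measured value):

    # Back-of-the-envelope: expected unrecoverable read errors when copying 2 TB,
    # assuming the commonly quoted consumer-drive spec of 1 error per 1e14 bits read.
    URE_PER_BIT = 1e-14            # assumed spec; real drives vary
    bytes_copied = 2e12            # 2 TB
    bits_copied = bytes_copied * 8

    expected_errors = bits_copied * URE_PER_BIT
    print(f"expected unrecoverable read errors: {expected_errors:.2f}")  # ~0.16

Not a certainty on every copy, but far from negligible once you account for a drive's whole lifetime of reads and writes.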

Heck, most people have had some level of data loss from an HDD or flash drive failure, sometimes even when they tried to do all the right things. The only question is whether it was backed up; in the case of personal users, unlikely. Self-healing could have been quite a selling point!


I have experienced many bit-rotted mp3s in my day. Thankfully I've been able to replace them online. As for other files? I can't recall any that now fail to open for mysterious reasons.

I also happen to run a home file server on FreeBSD + ZFS, though I don't think that machine has ECC memory so it is still technically vulnerable to corruption.


I hear they use RHEL nowadays.


Does it not matter anyway though? If the file lives locally for a while, and it rots there, the corrupt version will be synced back into the cloud and the corruption will spread. I admit the window of corruption will be smaller, but it will still be there, no?


Talking to the Apple engineers it really didn't seem to be an issue of computation. They seemed genuine in their belief that they could solve data integrity with device qualification. While I asked them 100 questions they asked me 2: had I ever actually seen bit rot (yes), and what kind of drives did we ship with the ZFS Storage Appliance (mostly 7200 nearline drives).


That's dumbfounding. I know firsthand that a certain monthly-fee movie streaming service and the CDN I work for can tell anyone who wants to listen about handling silent corruption and bit rot, and we have relatively small fleets. At home, ZFS saved me from a faulty power supply on my old workstation.

And the red herring here is that Apple users will want to plug in third-party storage. There's just no way to control what someone will plug into USB and Thunderbolt, and it's insane to think APFS would not be ready to help there.


That would suggest that APFS is only relevant for internal storage procured by Apple. Do they not intend for it to be used on external storage?


They mentioned that it would be used on removable media as well.


If the crypto layer has proper MACs then presumably checksums at lower layers aren't so important. Did they give you much indication that they thought disk encryption would become standard?


I've had an Intel S3500 brick within 4 weeks and a SanDisk Extreme Pro start to show occasional I/O errors after a few months. The latter doesn't just lead to bit rot, but to unreadable files. With ZFS I was able to identify those with a quick zpool scrub, which shows how valuable checksumming is even in the absence of ECC memory. At least in my anecdotal experience, flash is much flakier than conventional hard disks, so the assumption that this stuff just doesn't happen seems ludicrous.


A lot of the CoW patents, WAFL, snapshot patents that Network Appliance filed in the late 1990s have expired, or are expiring this year.

For example, https://www.google.com/patents/US6289356 was filed in 1998, so I presume it's expiring fairly soon. Given that some of the original lawsuits were Network Appliance suing Sun/Oracle, I'm wondering how much of a role this played in the timing of the release of these features? After all, Apple could pretty much pick a window to release a new file system - nothing special about 2016, that they couldn't have done this in 2015 or 2017...

Which makes me wonder if there are data integrity patents that will expire, and at such time, Apple can now drop the functionality into APFS. After all, they did say during their presentation, that the flexibility of the data format is one of the key design features of APFS.


Woo hoo!!!! I love your theory.

No idea if you're right, but it makes Apple's otherwise baffling stance plausible.


6289356 also lists a "priority date" on the google page of 1993. If that is an actual "priority date" rather than a google metadata "add on" then this patent expired (absent any term extensions) on June 3, 2013.


> The engineers contend that Apple devices basically don’t return bogus data.

It's much easier to pretend that this is the case when the file system isn't verifying it.

Checksumming would probably expose problems that would otherwise go unnoticed by users or be blamed on computer gremlins. It's hard to say if doing the "correct" thing here would improve the subjective user experience. Maybe putting on airs of infallibility is the more profitable route.


Good point. Going by the tone of Apple engineers' response, it sure does sound like they are going for plausible deniability.


Agreed. What an arrogant attitude, especially considering their MacBook Pro (2015) recently had a corruption bug which necessitated a firmware update: https://support.apple.com/kb/DL1830?locale=en_GB

Good checksumming to detect bit rot is exactly what is needed since as an owner of said laptop I have NO idea whether any of my data was affected.

If Apple wants to say 'the majority of our devices are mobile and checksumming imposes a large performance overhead', that's one thing. But to claim it's not needed is just plain wrong, and it makes me worry that Apple's product managers sit in an echo chamber hearing only what they want to hear.


> The engineers contend that Apple devices basically don’t return bogus data.

Holy shit.

I guess that explains why my Mac recently had a bunch of daemons burning all CPU, crashing repeatedly in a tight loop when hitting sqlite errors on a db in ~/Library. Because disk corruption never happens.


Hmm, I guess just as there is the "sufficiently smart compiler" fallacy, there is now the "sufficiently reliable hardware" fallacy.

But on another level, I guess if hardware fails, then well, you buy more hardware, which is good for Apple. Presumably people who bought in the past from Apple won't turn around and buy an Acer or HP laptop. They'll still buy Apple.


If hardware fails silently, you won't buy more hardware. You'll just come across something odd and say, hmm, typical Apple-bugginess.

It would be much nicer if your computer said, “I've detected a bit flip, please restore this file from backup”


Even more fun. Data gets corrupted, and backups pick it up, and start overwriting good backups with it eventually.


If your backup system involves overwriting old backups, it's not a backup system. It's a data loss system.


Unless you have infinite storage, you'll have to overwrite some backups at some point in the future.


Storage is not that cheap yet.


Making backups more granular means you remove sets of backups (or you collapse incremental backups). If a new backup causes corruption to back-propagate, then it's not a backup.


What does this mean? How do you store an arbitrarily long sequence of changes on a medium of fixed size without overwriting? Eventually you will run out of disk and old data will have to be overwritten, which might have contained the only good copy of the corrupted file.


> If a new backup causes corruption to back-propagate then it's not a backup

I'll go further and say that a backup that forward-propagates corruption is not a backup either: all the incremental backups from the moment of corruption are worthless. Bottom line: if your backup cannot be restored with its integrity intact, it's not a backup!


It is, if you outsource the storage. Backblaze is $5 a month per computer for (virtually) unlimited storage. They keep old copies of files for 30 days.


To paraphrase:

>>>> If your data gets silently corrupted and you keep backing up that corrupted data, eventually there won't be any backups left that have the original uncorrupted data.

>>> You shouldn't delete old backups!

>> Storage is too expensive to keep old backups forever.

> Backblaze! ... will delete backups after 30 days.

Yes. So 30 days after your file was corrupted, you will only have corrupted copies left.


…that's why I use both Backblaze and Time Machine.


Even better: automatically restore the file from backup.


The failure mode of hardware is not all or nothing. A single sector in an HDD or a single page in an SSD can fail and the rest can be fine for years; both HDDs and SSDs have many spares for this expected condition.


When all you have is Apple's Disk Utility.app, all storage media are perfect. That was irony. The truth is that hard drives can have more than 30 bad blocks and still show a verified S.M.A.R.T. status in that app.

I recommend sending every file system engineer on a year-long journey as a traveling system integrator.


If the storage is 100% reliable, why do they checksum the metadata?


I think it makes a lot of sense. Turning on these kinds of checks can be scary. The current situation is that mostly no one is affected by bit rot, probably because when it rarely happens it flips some bits that don't really matter anyway. But as soon as you turn on checksumming in software without any automatic error correction, people are going to start freaking out when their files become inaccessible, or when they have to jump through hoops to access a 'corrupted' file that looks entirely fine to them anyway.

It's the same deal with some heap protections. Say you are running a kernel which doesn't use byte patterns to detect heap overflows or use-after-free. Maybe you have some heap overflows which, because of their nature, never cause any corruption, but now you turn on heap protections and people's kernels are getting more panics :/


What is the user experience for when a checksumming filesystem detects an error?

If the fs detects a bit error does it flag the file as entirely unreadable? Move it to lost+found? Force me to restore the file from a backup? All these options seem more scary for an end user than blissful ignorance.

Don't misunderstand me, I've lost a few family photos over the years due to bit rot. So, I appreciate a fs that offers more protections. But, I honestly don't know offhand how an end user would recover from an error in /System or even an error in a family photo, or for that matter a word doc.


If the fs detects a bit error does it flag the file as entirely unreadable? Move it to lost+found? Force me to restore the file from a backup?

For files stored in iCloud Drive, if that version of the file exists in the cloud, the OS could automatically re-fetch the file. But, yeah, for lots of circumstances there's not going to be a "good" option to give the user.

EDIT: Same applies to Time Machine (or whatever Apple's backup solution will be called in the APFS era).


It was a stealthy feature addition that went totally unannounced, but as of 10.11, Time Machine stores file checksums in the backup. See 'tmutil verifyChecksums'.


Perhaps you (or Apple) would still be able to achieve the checksum feature by a smart choice of encryption algorithm?

APFS has file-level encryption, so you would in theory be able to detect a flip by selecting an encryption algorithm that errors out when decrypting modified data. I could see this being worked into an APFS fsck at some point.

A similar case could be made for adding it to the compression layer, which the OP thinks will be coming to APFS later; popular algorithms such as deflate already have checksums built in.
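
To illustrate the idea (a minimal sketch using AES-GCM from the third-party Python "cryptography" package; this is not a description of APFS's actual crypto design), an authenticated cipher refuses to decrypt data that has been flipped, which gives you integrity detection as a side effect of encryption:

    # Sketch: authenticated encryption (AES-GCM) detects a flipped bit at decrypt time.
    # Purely illustrative; requires the "cryptography" package.
    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM
    from cryptography.exceptions import InvalidTag

    key = AESGCM.generate_key(bit_length=256)
    nonce = os.urandom(12)
    plaintext = b"file extent contents"

    ciphertext = AESGCM(key).encrypt(nonce, plaintext, None)

    # Simulate bit rot: flip one bit in the stored ciphertext.
    rotted = bytearray(ciphertext)
    rotted[5] ^= 0x01

    try:
        AESGCM(key).decrypt(nonce, bytes(rotted), None)
    except InvalidTag:
        print("corruption detected at decrypt time")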


Same thoughts here as well, but how does encryption correct data?


Correcting data is much harder and would require a significant amount of additional storage to provide enough redundancy to reconstruct the original data. But detection is good enough for many uses: you would be able to restore the files from Time Machine before the backups get silently corrupted as well.


Checksums are usually very fast to compute, since the ARM CPUs in any modern phone have crypto engines and Apple's laptops do as well. I think trading away data protection for performance would be pretty irrational.
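
For a rough sense of the cost, here's an unscientific sketch you can run yourself (crc32 and sha256 are just stand-ins available in the Python standard library; ZFS, for comparison, defaults to fletcher4):

    # Unscientific micro-benchmark: checksumming 64 MiB in userspace, no hardware offload.
    import os, time, zlib, hashlib

    data = os.urandom(64 * 1024 * 1024)   # 64 MiB of random data

    t0 = time.perf_counter()
    zlib.crc32(data)
    t1 = time.perf_counter()
    hashlib.sha256(data)
    t2 = time.perf_counter()

    print(f"crc32:  {64 / (t1 - t0):.0f} MiB/s")
    print(f"sha256: {64 / (t2 - t1):.0f} MiB/s")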


Except that with NVMe drives, and the parallel operations you can run on them, the performance cost of checksums becomes important again. See the recent experiments with HAMMER2: https://www.dragonflydigest.com/2016/06/15/18281.html


Yeah, the ever-present march of storage toward memory speeds has really put a strain on our current compute architectures. Thanks for the note; will be reading more about it.


> ZFS has a lot of features, but data integrity is the feature.

And in the Sun era you might be prepared to bet your business on not being on the wrong end of a lawsuit from the owner of the various patents and copyrights around Sun IP.

Only a completely insane person would argue that's a good idea now.


External hard drives / SSDs that are using APFS? I thought that would be a pretty obvious use case.


Either the engineer is young, or hasn't been doing systems programming for long enough.


Perhaps they assume that you will sync everything to the iCloud anyway (?)


How are you expected to know you need to restore from the backup if the damage is silent?

How can you have confidence in your backup if damaged data can be silently written to it?


> I get that maybe a checksumming filesystem could conceivably be too computationally expensive for the little jewelry-computers Apple is into these days, but it's a terrible omission on something that is supposed to be the new filesystem for macOS.

Checksumming has another cost that isn't immediately obvious. Suppose you write to a file and the writes are cached. Then the filesystem starts to flush to disk. On a conventional filesystem, you can keep writing to the dirty page while the disk DMAs data out of it. On a checksumming filesystem, you can't: you have to compute the checksum and then write out data consistent with the checksum. This means you either have to delay user code that tries to write, or you have to copy the page, or you need hardware support for checksumming while writing.

On Linux, this type of delay is called "stable pages", and it destroys performance on some workloads on btrfs.
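
A toy sketch of the trade-off described above (the function names are invented for illustration; real kernels do this with page flags and writeback machinery, not Python):

    # Toy illustration of the "stable pages" problem: the checksum written alongside
    # a block must match the bytes actually sent to disk.
    import zlib

    def flush_blocking_writers(page: bytearray) -> tuple[int, bytes]:
        checksum = zlib.crc32(page)
        # If user code were allowed to mutate `page` between here and the DMA,
        # the on-disk data would no longer match the checksum; hence "stable pages",
        # which stall writers until writeback completes.
        return checksum, bytes(page)

    def flush_with_private_copy(page: bytearray) -> tuple[int, bytes]:
        snapshot = bytes(page)        # copy first; writers may keep scribbling on `page`
        return zlib.crc32(snapshot), snapshot

The first approach is the one that hurts btrfs on some workloads, as noted above; the second trades extra copies and memory bandwidth for not stalling writers.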


For desktop computing, I'll take data integrity over good 'performance' any day. The use cases for iDevices might be different, coloring Apple's perspective.


Slightly worried by the vibe that comes off this. "I asked him about looking for inspiration in other modern file systems ... he was aware of them, but didn’t delve too deeply for fear, he said, of tainting himself". And (to paraphrase): 'bit-rot? What's that?'.

I would have hoped that a new filesystem with such wide future adoption would have come from a roomful of smart people with lots of experience of (for example) contributing to various modern filesystems, understanding their strengths and weaknesses, and dealing with data corruption issues in the field. This doesn't come across that way at all.


Given Dominic's other output, I'm going to believe there's more to the story because he didn't strike me as someone who would actively ignore past innovations. I know it's a popular concept to NIH stuff when devs believe they know enough to build it, but so much stuff is just built poorly without consideration for existing designs and it shows in the poor software we have to live with.


Not looking at HAMMER/HAMMER2 is just NIH too, no 'taint' from BSD as Apple should well know on their UNIX side.


I'm extremely confused by this:

> With APFS, if you copy a file within the same file system (or possibly the same container; more on this later), no data is actually duplicated. [...] I haven’t see this offered in other file systems [...]

To my knowledge, this is what cp --reflink does on GNU/Linux on a supporting filesystem, most notably btrfs, and has been doing so by default in newer combinations of the kernel and GNU coreutils.
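
For reference, this is roughly what a reflink clone looks like from userspace on Linux (a sketch issuing the FICLONE ioctl that cp --reflink uses; it only succeeds on filesystems with reflink support, such as btrfs):

    # Sketch: clone a file without duplicating its data blocks, via the Linux
    # FICLONE ioctl. Fails with an error on filesystems without reflink support.
    import fcntl

    FICLONE = 0x40049409   # _IOW(0x94, 9, int)

    def reflink_copy(src_path: str, dst_path: str) -> None:
        with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())

    reflink_copy("bigfile", "bigfile.clone")   # near-instant; extents are shared until modified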

This guy seems too well-informed and experienced in the domain to miss something so obvious, though. So what am I missing?

Also interesting to me is the paragraph about prioritizing certain I/O requests to optimize interactive latency: on Linux this is done by the I/O scheduler, which is exchangeable and agnostic to the filesystem. Perhaps greater insight into the filesystem could aid I/O scheduling (this has been the argument for moving RAID code into filesystems as well, though, which APFS opts against); hearing a well-informed opinion on this point would be interesting. Unless this post gets it wrong and I/O scheduling isn't technically implemented in APFS either.

It seems like this perspective might be one written from within a Solaris/ZFS bubble and further hamstrung by macOS' closed-source development model. Which is interesting in light of the Giampaolo quote about intentionally not looking closely at the competition, either.


This guy (i.e. me) wasn't aware of this functionality in btrfs. Are reflinks commonly used? Yes, I know more about ZFS than the other filesystems mentioned.


Note that mv tries reflink by default (when moving files across btrfs subvolumes) since it doesn't need a separate copy of the data. As storage systems evolve there are fewer guarantees that one actually gets multiple copies, with deduplication at lower layers etc., and therefore cp may change at some stage to reflinking by default, especially as clone_file_range() moves to the VFS level. Actual data redundancy would then be achieved at a higher level with separate file systems, devices, data centers, ... where arguably it needs to happen now anyway.


I wasn't aware of --reflink either, apparently support is coming to XFS as well (at some point): https://pkalever.wordpress.com/2016/01/22/xfs-reflinks-tutor...


In my opinion, APFS does not seem to improve upon ZFS in several key areas (compression, sending/receiving snapshots, dedup, etc.). Apple is reimplementing many features already implemented in OpenZFS, btrfs (which itself reimplemented a lot of ZFS features), BSD HAMMER, etc.

Maybe extending one of these existing filesystems to add any functionality Apple needs on top of its existing features (and, hopefully, contributing that back to the open source implementation) would cost more person-hours than implementing APFS from scratch. Maybe not.

Either way, we will now have yet another filesystem to contend with, implement in non-Darwin kernels (maybe), and this adds to the overall support overhead of all operating systems that want to be compatible with Apple devices. Since the older versions of macOS (OSX) don't support APFS, only HFS+, this means Apple and others will also have to continue supporting HFS+. It just seems wasteful of everyone's time to me.

Also: https://xkcd.com/927/


>and this adds to the overall support overhead of all operating systems that want to be compatible with Apple devices.

What operating systems describe themselves as being "compatible with Apple devices"?

> Since the older versions of macOS (OSX) don't support APFS, only HFS+, this means Apple and others will also have to continue supporting HFS+.

Who else actually "supports" HFS+ ? Sure there are linux "ports" based on the spec but nobody claims them as being "supported". Apple would have had to continue supporting HFS+ whether they chose to implement ZFS, btrfs or HAMMER.

>It just seems wasteful of everyone's time to me.

I don't know how Apple writing their own filesystem is wasteful of anybody else's time (except possibly Apple's, and/or that of disk utility software vendors for OS X).

>Also: https://xkcd.com/927/

The standard is the interface ( POSIX / SUS ) and unless APFS breaks that how is this applicable ?


> What operating systems describe tremselves as being "compatible with Apple devices"

I was referring to the Linux kernel modules implementing HFS+ and other Apple FSes.

> Who else actually "supports" HFS+ ? Sure there are linux "ports" based on the spec but nobody claims them as being "supported".

Yes, by support I meant other developers who want to be able to read and write to devices in APFS format.

> Apple would have had to continue supporting HFS+ whether they chose to implement ZFS, btrfs or HAMMER.

Yes, Apple would have to continue supporting HFS+, but other kernel developers would not have to port yet another filesystem (APFS) with all of its own quirks; and, who knows, maybe it would be less work for Apple to inherit ZFS/btrfs/HAMMER/some other filesystem's solutions to some of the same problems they're trying to solve from scratch here. My point was more that by reinventing the wheel to implement some of these features, they've created not just more work for themselves potentially, but more for the open source kernel development community as well in the long run.

> I don't know how Apple writing their own filesystem is wasteful of anybody else's time ( except possibly Apple's and/or Disk utility software for vendors for OS X)

APFS will find its way to external HDDs/SSDs/flash drives, etc., then in order to read those filesystems someone else will have to port it to any other devices/readers of that device/FS.

> The standard is the interface ( POSIX / SUS ) and unless APFS breaks that how is this applicable ?

I didn't mention POSIX, VFS, or filesystem _interfaces_. The analogy to the XKCD strip was that we already had N filesystems that have a large subset of (or in some cases superset of) the features of APFS as of right now, now we have N+1 complex filesystems to contend with and port and interoperate with in other kernels/OSes (mainly Linux + non-Darwin BSDs).

This may just be the price of progress, which is fine. I think it'll be fantastic if Apple makes progress in this area and improves upon the work of others. The developer seemed to be ignoring history so as not to "taint" himself (did he mean IP/legally tainted?), which is slightly worrying to me.

I hope Apple open sources their implementation under a BSD/GPL dual license to make it easier for others to port it directly into other kernels, rather than having to reimplement it themselves.


Classic comic, but I don't think it applies. APFS looks intended to solve Apple's product problems really well, and it doesn't even try to be a filesystem for everyone.

Apple has said from time to time that they're all about owning and controlling the key technologies that go into their products. APFS makes a lot of sense from that perspective, and this seems one of those cases where going their own way is better than importing someone else's constraints. ZFS on an Apple Watch? LOL.


I would not be surprised if one could write a ZFS implementation optimized for more constrained devices. If you already know you are going to have flash storage, you can probably ditch some of the N layers of cache you see in common ZFS implementations. Not that ZFS is one-size-fits-all, but the file system specification could be implemented in more than one way.


If you assume Apple cares about having a disk format in common with other platforms, sure, I'd agree that's probably possible. But I don't think they do; they seem to care a lot more about things like a unified codebase across their platforms, the energy-efficiency initiatives they've been pushing for a few years, owning the key tech in the products, etc.

One slide in the WWDC talk deck showed a bunch of divergent Apple storage technologies across all their platforms that are being replaced by APFS. If ZFS has to fork into weird variants to run well on the phone or watch, that seems less appealing than a single codebase optimized for just the stuff Apple products do.


Should have said s/standards/filesystems/g... :-)

I was reacting to the idea of APFS for macOS, as well as having yet another filesystem to deal with on external media that interacts with multiple computers (HDDs/SSDs/USB flash drives/etc.).


Is moving data between computers that way a thing that non-technical people do often? FAT-formatted USB sticks seem to be good enough for that, but e-mail/Dropbox/file sharing/cloud sharing/AirDrop have much better UX for the average person.


Yes, it is a thing people do. The problem is, those non-technical people do not understand on-disk formats. (Nowadays, most USB sticks come preformatted as ExFAT.) There are also offline and low bandwidth situations. In healthcare it's common too thanks to HIPAA and nervous hospitals: let's say a patient wants to transfer a set of MRI or CT scan images/videos (typically provided on CD or DVD, in which case it's ISO9660, but sometimes USB stick - hopefully ExFAT but sometimes worse).


APFS is targeted to run on low-powered devices like the Apple Watch. It may be that the alternatives cannot be made suitable for such devices: ZFS is famously memory hungry, HAMMER says it's not designed for < 50GB devices.


ZFS requires a lot of RAM to enable online deduplication.

From FreeBSD Mastery: ZFS, p. 135:

"For a rough-and-dirty approximation, you can assume that 1 TB of deduplicated data uses about 5 GB of RAM. You can more closely approximate memory needs for your particular data by looking at your data pool and doing some math. We recommend always doing the math and computing how much RAM your data needs, then using the most pessimistic result. If the math gives you a number above 5 GB, use your math. If not, assume 5 GB per terabyte."

Otherwise I think this is like the myth that ZFS requires expensive ECC RAM, whereas ECC RAM is recommended for any filesystem and ZFS needs it no more and no less.
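
The usual back-of-the-envelope behind that rule of thumb looks something like this (the per-entry size and average block size below are commonly cited approximations, not exact figures, which is exactly why the book says to do the math for your own data):

    # Rough dedup-table (DDT) memory math behind "about 5 GB of RAM per TB of deduped data".
    # Both constants are approximations; real pools vary with record size and dedup ratio.
    DDT_ENTRY_BYTES = 320            # approximate in-core size of one DDT entry
    AVG_BLOCK_SIZE  = 64 * 1024      # pessimistic average block size (64 KiB)
    POOL_BYTES      = 1e12           # 1 TB of deduplicated data

    blocks = POOL_BYTES / AVG_BLOCK_SIZE
    ram_needed = blocks * DDT_ENTRY_BYTES
    print(f"~{ram_needed / 1e9:.1f} GB of RAM")   # ~4.9 GB, i.e. roughly the 5 GB/TB rule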


From the 2nd edition of "The Design and Implementation of the FreeBSD Operating System"

ZFS, Pg 549

" However it is not designed for or well suited to run on resource constrained systems using 32 bit CPUs with less than 8 Gbyte of memory and one small nearly full disk, which is typical of many embedded systems "


Even mobile CPUs will be all 64-bit before long, and it seemed to run great with 4 GB; others reported success on systems with 2 GB. Meanwhile, the RAM available in all sorts of devices keeps increasing.

What filesystem a laptop/workstation uses shouldn't be determined by what's suitable for a watch in 2016.


Is ZFS lightweight enough to run on the watch? Their ultimate goal was to have a file system efficient enough and flexible enough to run across all of their OSes: macOS, iOS, tvOS, and watchOS.


The Apple Watch has 8 GB of storage and 512 MB of RAM; I don't think that's an unreasonable ratio. Most people discussing ZFS memory use have large arrays or deduplication enabled, and I'm not sure why dedup would be very useful on a watch. Maybe CPU for checksums is an issue? Apple could probably add acceleration for the checksum algorithm, use a faster algorithm, or skip checksums on read and do a sweep when plugged in and fully charged. An 8 GB SSD only takes minutes to do a complete read, as opposed to a large rotational drive, so you would likely be able to do a full scan in a reasonable amount of time.
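
Ballpark numbers for that full-scan claim (the sustained read rate below is a guess for small embedded flash, not a measured figure):

    # Rough time for a full-device scrub on watch-class storage.
    DEVICE_BYTES = 8 * 10**9        # 8 GB of flash
    READ_RATE    = 100 * 10**6      # assumed ~100 MB/s sustained sequential read

    seconds = DEVICE_BYTES / READ_RATE
    print(f"full scan in ~{seconds / 60:.1f} minutes")   # ~1.3 minutes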


Unclear, but a member of the ZFS development team at Apple told me that Giampaolo complained that ZFS would never work on the phone. So the team demonstrated it working on the phone. I know of no obstacles to making it work on the watch other than engineering.


So they still have a team working on ZFS? Could the anonymous "ilovezfs" involved with OpenZFSonOSX and coming from a California IP address be part of this team?


> Either way, we will now have yet another filesystem to contend with, implement in non-Darwin kernels (maybe), and this adds to the overall support overhead of all operating systems that want to be compatible with Apple devices.

On the other hand, if Apple decides to open source the APFS implementation (hard to tell what their plans are from current statements, but I'm holding out hope), it'll probably be under a permissive license that allows porting to Linux. The implementation is in C (not C++) so porting is probably generally feasible. Compare to ZFS, which, even if some distros have finally started shipping it, will never quite be free of licensing issues unless Oracle does a 180.


Agreed. A few years ago Apple was on the verge of using ZFS but cancelled the project due to licensing issues.

See http://arstechnica.com/apple/2009/10/apple-abandons-zfs-on-m...


A better/more up to date article on this from the same blog as this item: http://dtrace.org/blogs/ahl/2016/06/15/apple_and_zfs/


Breaking changes sometimes are unavoidable, progress has to be made.


For example, my 1TB SSD includes 1TB (2^30 = 1024^3) bytes of flash but only reports 931GB of available space, sneakily matching the storage industry’s self-serving definition of 1TB (1000^3 = 1 trillion bytes).

Great article, but a couple of nitpicking corrections (which seem appropriate for a storage article). Per https://en.wikipedia.org/wiki/Terabyte, a terabyte is 1000^4 bytes, not 1000^3.

Also, it's been 6+ years since we all agreed that TiB means 2^40 (1024^4) and TB means 10^12. Indeed, only in the case of memory does "T" ever mean 2^40 anyway; in both data rates and storage, T has always meant 10^12. This convention is strong enough that most of us have just thrown up our hands and agreed that, when referring to DRAM, terabyte will mean 1024^4, and 1000^4 everywhere else.

Indeed, in the rare case where someone uses TiB to refer to a data rate, they are almost without exception incorrectly using it, and, they actually mean TB.
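
Spelling out the unit bookkeeping the article stumbles over (assuming the drive really does hold 10^12 bytes):

    # A "1 TB" drive (decimal, 10^12 bytes) expressed in binary units.
    TB  = 10**12      # terabyte (SI, what drive vendors sell)
    TiB = 2**40       # tebibyte
    GiB = 2**30       # gibibyte

    drive_bytes = 1 * TB
    print(f"{drive_bytes / GiB:.0f} GiB")     # ~931, the familiar "931 GB" figure
    print(f"{drive_bytes / TiB:.3f} TiB")     # ~0.909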


> Also, APFS removes the most common way of a user achieving local data redundancy: copying files. A copied file in APFS actually creates a lightweight clone with no duplicated data.

No, it doesn't. APFS supports copying files, if you want that. It's just that the default in Finder is to make a “clone” (copy-on-write).


Fair enough; and right now cp doesn't use the fast clone functionality, but it assuredly will. I'm not sure 'cat <file >file.dup' is reasonable for most users.


Just teach everyone to use dd instead - dd if=file of=file.dup :-)


If they patched it to use a different syscall, maybe.


I'm still looking for a widely supported (at least FreeBSD and Linux kernels) filesystem for external drives to carry around that doesn't have the FAT32 limitations. There's exFAT, but no stable and supported implementation. Then there's NTFS, but that's also not 100% reliable in my experience when used through FUSE (NTFS-3G). I've considered UFS, but that also was a no-go. I'm hopeful for lklfuse[1], which also runs on FreeBSD and gives access to ext4, XFS, etc. in a Rump-like way, allowing you to use the same drivers on FreeBSD. I'm cautious, though, given that I don't want corrupted data I might notice too late. Let's see if lklfuse provides LUKS as well; otherwise Dragonfly's LUKS implementation might need to be ported to FreeBSD or something like that. External drives one might lose need to be encrypted.

[1] https://www.freshports.org/sysutils/fusefs-lkl/


Thanks for this! I didn't realize (or had forgotten) LUKS had been ported to Dragonfly. Also you touch upon my #1 frustration with APFS without really knowing anything about it: simple portability.


Yeah, I believe it's by the same Dragonfly developer who also wrote tcplay[1] for TrueCrypt volumes.

[1] https://leaf.dragonflybsd.org/cgi/web-man?command=tcplay&sec...


The file-level deduplication [1] is interesting. Not being a filesystem expert, this sounds like it fulfills a similar usecase to snapshots [2]. Or am I reading this wrong?

Is NTFS's shadow copy like Snapshots?

[1] http://dtrace.org/blogs/ahl/2016/06/19/apfs-part3/#apfs-clon...

[2] http://dtrace.org/blogs/ahl/2016/06/19/apfs-part2/#apfs-snap...


NTFS Shadow Copies are more like LVM/ZFS snapshots than APFS's file-level CoW snapshots, in that they both operate on the entire volume at block-level, rather than having a per-file level of granularity.

There are other FSes that allow the behavior that APFS is demonstrating - look at OCFS2 and Btrfs, both of which allow you to do cp --reflink.


I think the value of this new proprietary filesystem is limited, since you can't run it on servers (Apple does not make servers anymore). Also, compatibility/porting issues may become a problem if you build your software for it.



