What every programmer should know about solid-state drives

wmf · on Feb 14, 2015

I wonder if it's counterproductive to tell every programmer they should know an entire book about SSDs (or DRAM, cf. Drepper), especially when the later chapters tell you that you have no control over the stuff you learned in the earlier chapters due to the FTL and filesystem.

oofabz · on Feb 14, 2015

"What every programmer should know" is only the title of the last chapter, a summary. The book as a whole is called "Coding for SSDs".

greggarious · on Feb 15, 2015

If we had always said "why should a programmer read about X" we'd still be stuck with text based OSes. A lot of the innovations PARC came up with (GUI etc) came from bringing in anthropologists and psychologists. Stepping outside your comfort zone is how cool things happen.

rubbingalcohol · on Feb 14, 2015

It's a stretch to say every programmer should know about this stuff when most apps aren't I/O limited. For my purposes, I care that SSDs are fast, not about the underlying architecture of the NAND, page sizes, or drive controller. I also know absolutely zilch about the inner-workings of the Linux kernel and don't plan on reading a book about it.

The nice thing about so much division of interest in the engineering field is that you don't always need to know everything about what's going on under the hood to use a technology to build other cool stuff. Only in specific situations is that knowledge really necessary.

proveanegative · on Feb 14, 2015

>most apps aren't I/O limited.

Is that the case? You hear time and time again that most modern web applications in particular are I/O-bound. I suppose that depending on the particular case the I/O device in question may not be a local disk but that doesn't categorically change it.

phil21 · on Feb 15, 2015

As a web host that focuses on "scale" type sites (alexa top 100), you are absolutely correct.

Out of the 3,000+ servers we've managed, a very small handful are CPU bound. I/O (disk access) is almost universally the issue we work with client developers to fix, and it's extremely rare to run into anything else.

That said, I/O bound apps manifest themselves in many ways. One app could be doing something stupid with serving files, another app may have ridiculous SQL queries, another may be using the wrong technology for a given problem, etc. The universal truth though is that the CPU you toss into a server is almost inconsequential, but your RAM (read: caching) and disk configurations matter quite a bit.

SCdF · on Feb 14, 2015

Not to speak for everyone, but when people say that they mean the network specifically, not the disk.

vidarh · on Feb 15, 2015

I can't remember the last time I worked on an app where it was the network, and not disk IO that was our main limitation.

That is despite using arrays of SSDs on most of our newer servers, and extensively caching things in RAM when possible.

moe · on Feb 14, 2015

No they don't. Network-bound apps are rare, unless your app is e.g. file transfer or media streaming.

The overwhelming majority of apps is disk I/O bound.

nitrogen · on Feb 14, 2015

Apps that make requests to other services are network bound.

patsplat · on Feb 15, 2015

and the reason other services can't keep up is generally io/bound, not network or cpu. sigh.

mathgeek · on Feb 15, 2015

If network-bound apps are rare, then you believe that mobile apps and web apps are rare. Both are, combined, likely to make most other app types seem rare.

moe · on Feb 15, 2015

You seem to be confusing client and server. When people say I/O-bound or CPU-bound they usually refer to servers.

And most clients aren't network-bound either. They are server-bound (i.e. they spend more time waiting for the server to create a response than for the network to transfer it).

mathgeek · on Feb 15, 2015

No, I'm not confusing anything. I can see why you would believe that, based on what I assume you made as inferences.

There are plenty of mobile and web apps that are network bound in this day and age. Many games, for example. That's just the obvious choice of example.

TickleSteve · on Feb 16, 2015

games are more sensitive to network latency, not throughput. The phrase "I/O bound" is typically used in the context of throughput limitations, not latency. The issue of network latency is more complex than simply the network level as it involves the whole software stack in the equation.

moe · on Feb 15, 2015

Games are not normally network bound either.

Your typical FPS shooter, RTS game or MMORPG uses bandwidth in the kbit/s range per player.

mathgeek · on Feb 18, 2015

> Games are not normally network bound either.

Games with a live multiplayer element certainly are. That's my meaning in saying many games (rather than most games).

aftbit · on Feb 14, 2015

I feel like the "summary of the summary" from the comments is the best.

    Thank you for detailing this stuff. I had a quick read.

    For the sake of other readers, if you don’t have time to read Part 6, here is the summary of the summary:

    If you are using a decent OS like linux, as opposed to direct hardware access (who other than google ever did that?), all you need to worry about is how to co-locate your data. By “using”, I don’t mean using O_DIRECT, which basically says, “OS, please get out of my way.”

    If you do use O_DIRECT, do “man 2 open” and search the man page for “monkey”. If that’s not enough, google “Linus O_DIRECT” and look at the war with Oracle and the like. You could probably also google “Linus Braindead” and you are likely to find a post about O_DIRECT.

    If you are one of the few people actually working on the kernel’s disk layer, you probably know all this and is unlikely you will ever be reading this.

    As for co-locating data, there is no way to do it without knowing your app inside out. So know your app. For some apps, it can make orders of magnitude difference. That’s your domain. Leave the disk to the kernel. Let it use it’s cache and it’s scheduler. They will be better than yours and will benefit the whole system, in case other apps need to get at the data too. You can try to help the kernel by doing vectorized read/writes although that’s a bigger deal with spinning disks.

    Ata Roboubi

sliverstorm · on Feb 14, 2015

It might be nice if you italicized that rather than flagging it as code. For regular text, line-wrapping is a good thing.

I realize that's just my opinion though.

arthurfm · on Feb 15, 2015

> I realize that's just my opinion though.

I agree. I also found the text very hard to read.

baruch · on Feb 15, 2015

This guide needs a part on "How an SSD fails" and explanations on what to expect from SSD failures and especially the soft-errors that may occur.

I also can't stress enough that SSDs are born for multi-threaded. If you aren't issuing multiple reads and multiple writes you are doing it wrong. If your data is sequential it is ok to send it in a single operation. If you can however issue multiple commands, do that. You'll have a better chance of getting all of them fulfilled at the same time.

justincormack · on Feb 15, 2015

The main thing is to make sure that there are a lot of commands in the queue, then the drive can take care of most things. SATA has a queue depth of around 32, NVMe much more. If you benchmark (eg with fio) you will see that you get nothing like the rated iops at low queue depths. Getting enough stuff queued is quite hard - fio is useful to test as it gives multiple strategies but you need quite a lot of threads in a real application, or use Linux aio with O_DIRECT.

pmorici · on Feb 15, 2015

Does anyone actually do #27, "27. Over-provisioning is useful for wear leveling and performance" It's been my experience mucking with the innards of flash devices that they already have ~10% more NAND in them than their labeled capacity exactly for this purpose. Seems like over provisioning is bad advice unless you have a very special situation.

wtallis · on Feb 15, 2015

Overprovisioning definitely makes a big impact on the performance of all but the most recent SSD controller architectures out there, but only once the disk is mostly full.

Check out any of Anandtech's benchmarks from the past year or so. They now include graphs showing the consistency of I/O completion times; reserving more than the default ~7.5% makes the GC pauses a lot less severe and makes a huge improvement in worst-case performance. Under sustained load, having more spare area often makes the difference between always being stuck at worst-case performance and always being near best-case.

For example, under a solid random write workload a full Samsung 850 Pro will complete arount 7-8k IOPS, but with 25% spare area it will hover around 40k IOPS. That's a very enticing space/speed tradeoff, especially if you've already decided that SSDs are to be preferred over hard drives for your workload.

The default amount of overprovisioning in most drives is chosen to be roughly enough to allow for reasonable performance and lifespan (affected by write amplification), and in MLC drives usually corresponds exactly to the discrepancy between a binary/memory gigabyte and a decimal/hard drive gigabyte, which simplifies marketing. Drives intended for high lifespan often have odd sizes due to their higher default overprovisioning.

userbinator · on Feb 15, 2015

but only once the disk is mostly full

...because it's basically changing what the definition of "full" is.

xoa · on Feb 15, 2015

Sure in a few cases, it's well known and a straight forward cost/performance tradeoff. Manufacturer internal specified over-provisioning is mainly a matter of economics. Within the technical limits of their particular NAND and controller, they pick a number that will yield the desired performance and longevity for their general target audience of the drive within the right price budget, but there's nothing wrong with tweaking that a bit if someone has different needs. More space can improve performance in some areas, in particular performance consistency and IOPSs, and increase overall drive longevity, but that extra NAND is of course not available for user use. One of the general differences in "enterprise" drives (beyond features like power loss caps) is just plain much higher factory over-provisioning.

It's not bad to know about, it's just another tradeoff case in storage. For scenarios that can benefit from IOPS/consistency or have huge, random loads it may be a very simple way to get a nice bump, particularly out of a "consumer" drive. For simpler loads it's a total waste versus more available storage, or even a negative if it would result in data getting pushed off of SSD onto something slower. The value also can vary from drive to drive too, so it should always get tested.

I agree with you though that the article should have mentioned that, like all tuning, there are no universals (or the manufacturer would have done it already), and that in general for modern SSDs the defaults are just fine unless you've got a specific reason otherwise (and can quantify the result). I suppose many programmers will be generating loads far higher then the typical consumer, but even so I suspect default will usually be the right choice.

bigbugbag · on Feb 15, 2015

The manufacturer may have his own reason[1] to do some over-provisioning, as a user I intentionally leave some ssd space unpartitioned to avoid the terrible performance drop when the SSD gets full.

[1] Remember this microSD article ? http://www.bunniestudios.com/blog/?p=918 I wouldn't be surprised to learn that manufacturers combine controller and over-provisioning as a cheaper and more flexible alternative to a full test of the SSD.

justincormack · on Feb 15, 2015

I used to overprovision hard drives, by not formatting the slower areas on the outside, which had slower streaming rates, when doing video streaming years ago. If I was going to do serious high write loads on consumer SSDs I would consider leaving 10% blank.

dollaaron · on Feb 15, 2015

It's my understanding that most manufacturers recommend overprovisioning of 8% for consumer drives, and 28% for drives used in enterprise.

qprofyeh · on Feb 14, 2015

Also, make backups regularly.

comrade1 · on Feb 14, 2015

I don't think any programmer needs to know more than what this information means practically to a programmer. I don't need the details - I have other things to worry about.

SSD is great because when we compile we are opening potentially hundreds of files at once. SSD is great for parallel access and blows away spinning drives.

Also, as a programmer I replace my computer every 5 years. More often than that is too much of a time sink. I may replace some parts in those 5 years. For example I upped the RAM and replaced the harddrive with an ssd on my 2009 MacBook pro. But in 2015 I purchased a MacBook pro with 16gb ram and a 1tb ssd.

as a programmer I want to focus on the programming.

Edit: sorry, I read the article but didn't quickly understand that this is a summary chapter of a book written for programmers that actually do have to worry about ssd access vs normal hard drives. It makes much more sense in this context.