Hacker News

Perhaps they mean it must erase an entire block before writing any data, unlike a disk that can write a single sector at a time?



The issue is that DDR4 is like that too. Not only are transfers done in 64-byte chunks (a cache line), but DDR4 requires a transfer of the whole row to the sense amplifiers (a RAS, row address strobe) before you can read or write anything in it.

The RAS command reads out, and thereby drains, the entire row, like 1024 bytes or so. This is because the DDR4 cells only have enough charge for one reliable read; after that, the capacitors don't have enough electrons left to tell whether a 0 or 1 was stored.

A row close (precharge) command returns the data from the sense amps back to the capacitors. Refresh commands periodically rewrite the 0 or 1, since a capacitor can only hold its charge for a few milliseconds.
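To make the activate/read/precharge cycle concrete, here's a minimal sketch (class and method names are mine, and no real timing is modeled) of a DRAM bank where a RAS destructively moves a row into the sense amps and a precharge writes it back:

```python
# Toy model of one DDR4 bank: ACT (RAS) is a destructive read into the
# sense amplifiers; PRE (row close) restores the capacitors.
class DramBank:
    def __init__(self, rows=4, row_bytes=1024):
        self.cells = [bytearray(row_bytes) for _ in range(rows)]
        self.row_buffer = None      # the sense amplifiers
        self.open_row = None

    def activate(self, row):
        # RAS: row contents move to the sense amps; the cells are drained
        assert self.open_row is None, "must precharge the open row first"
        self.row_buffer = self.cells[row]
        self.cells[row] = None      # charge gone: cells hold nothing now
        self.open_row = row

    def read(self, col, n=64):
        # CAS: a 64-byte burst served from the sense amps, not the cells
        return bytes(self.row_buffer[col:col + n])

    def precharge(self):
        # row close: write the sense-amp contents back to the capacitors
        self.cells[self.open_row] = self.row_buffer
        self.row_buffer = None
        self.open_row = None
```

The point of the sketch is that between `activate` and `precharge`, the row exists only in the sense amps; a refresh is essentially an activate+precharge pair done for you on a timer.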

------

The CAS latency statistic assumes the row was already open. It measures reading from the sense amplifiers, not from the actual DRAM cells.


It's vaguely similar, but there's a huge difference: flash must be erased before you can write it again, and that erase is much slower and only possible at much larger granularity. DDR4 doesn't care; you can always write. Only the read is destructive, and it just needs to be followed by a write-back.

I think this makes the comparison unhelpful since the characteristics are still very different.


The difference is that DDR has effectively unlimited write endurance, and the write-back happens in parallel as part of normal operation.

If flash were the same way, able to rewrite an entire erase block with no consequences, then you could ignore erase blocks. But it's nowhere near that level, so the performance impact is very large.


That's a good point.

A flash cell typically survives only on the order of 10,000 erase cycles. So a lot of the controller's algorithms are about minimizing those erases.
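The simplest version of "minimize and spread out erases" is wear leveling: always erase the least-worn block. A hedged sketch (real flash translation layers are far more complex, with hot/cold data separation and mapping tables; the names and the 10,000 figure are illustrative):

```python
# Toy wear leveler: pick the block with the fewest erases so far,
# and refuse to use a block once it hits its endurance limit.
ERASE_LIMIT = 10_000  # order-of-magnitude endurance figure from above

class WearLeveler:
    def __init__(self, n_blocks):
        self.erase_counts = [0] * n_blocks

    def pick_block(self):
        # choose the least-worn block as the next erase target
        return min(range(len(self.erase_counts)),
                   key=self.erase_counts.__getitem__)

    def erase(self, block):
        self.erase_counts[block] += 1
        # False once the block has exhausted its erase budget
        return self.erase_counts[block] < ERASE_LIMIT
```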


What does DDR have to do with NVMe?


You can't write a byte, or a word, either.

The "fact" that you can do it in your program without disturbing bytes around it is a convenient fiction that the hardware fabricates for you.


DDR4 is effectively a block device and not 'random access'.

Pretty much only cache is RAM proper these days, i.e. all locations have equal access time, so you can access it in any order with little performance loss.


I’m confused. What’s the difference between a cache line and a row in RAM? They’re both multiples of bytes. You have data sharing per chunk in either case.

The distinction seems to be how big the chunk is, not uniformity of access time (is a disk with symmetrical read times not a block device?).


Hard disk sectors are classically 512 bytes, smaller than the DDR4 row of 1024 bytes!

So yes, DDR4 has surprising similarities to a 512-byte-sector hard drive (modern hard drives use 4k sectors).

>> What’s the difference between a cache line and a row in RAM?

Well DDR4 doesn't have a cache line. It has a burst length of 8, so the smallest data transfer is 64 bytes. This happens to coincide with L1 cache lines.

The row is 1024 bytes long. It's pretty much an L1 cache on the other side, so to speak. When your CPU talks to DDR4, it needs to load a row (RAS, all 1024 bytes) before it can CAS-read a 64-byte burst-length-8 chunk.
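One way to see the row/burst split is how a controller might decompose a physical address. A sketch under the sizes quoted above (1024-byte rows, 64-byte bursts); the field layout is illustrative, since real controllers also interleave banks and channels:

```python
# Split an address into (row, column, byte-within-burst) for
# 1024-byte rows and 64-byte BL8 bursts. Illustrative layout only.
ROW_BYTES = 1024
BURST_BYTES = 64

def split_address(addr):
    offset = addr % BURST_BYTES                          # byte in the burst
    column = (addr // BURST_BYTES) % (ROW_BYTES // BURST_BYTES)
    row = addr // ROW_BYTES                              # which row to RAS
    return row, column, offset
```

Two addresses in the same 1024-byte row share one RAS; crossing a row boundary forces a precharge and a fresh activate, which is where the extra latency comes from.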

-----------

DDR4, hard drives, and Flash are all block devices.

The main issue for flash is that the erase size is even larger than the read/write block size. That's why we TRIM NVMe devices.
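A back-of-the-envelope sketch of why that matters (page and block counts are illustrative, not from any particular drive): overwriting one page inside an erase block means copying out every still-live page, erasing the block, and writing everything back. TRIM tells the drive which pages are dead, so they need not be copied.

```python
# Write amplification for rewriting one logical page in place, given how
# many of the other pages in its erase block still hold live data.
PAGES_PER_BLOCK = 64  # illustrative

def overwrite_cost(live_pages):
    """Pages physically written to replace one logical page."""
    assert 0 <= live_pages < PAGES_PER_BLOCK
    return 1 + live_pages  # the new page, plus every live page copied back
```

With a full block of live data that's 64 physical writes for 1 logical write; once TRIM has marked the neighbors dead, it drops to 1.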


Thanks, I see what you mean at the interface level.

In terms of the performance analogy, though, hard drives don't get random access to blocks, but RAM does. Due to seek times, the practical unit of hard drive access is a sequential read of 100 KiB or more.



