Hacker News
RAMCloud puts everything in DRAM (zdnet.com)
46 points by ukdm on Oct 5, 2011 | 37 comments



"Imagine a world where data layout doesn’t matter, where apps are optimized for sub-millisecond storage, where 100 byte I/Os are faster and just as efficient as 8KB I/Os. The architectural implications are huge and would take a decade or more to get our heads around."

Umm.. I've worked with in-memory data structures (who hasn't?) and yeah, layout DOES matter. Especially if your data structure is larger than a cache line.


>"Imagine a world where data layout doesn’t matter, where apps are optimized for sub-millisecond storage, where 100 byte I/Os are faster and just as efficient as 8KB I/Os.

With latency not being absolute zero, 100-byte I/Os will always be less efficient than 8 KB I/Os. For example, with 300 MB/s bandwidth and 0.01 ms latency, the throughput would be ~10 MB/s vs ~220 MB/s.
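As a sanity check, the parent's figures fall out of a simple latency-plus-transfer model (assuming the stated 300 MB/s of bandwidth and 0.01 ms of per-operation latency; the helper function is mine, not from any cited system):

```python
def effective_throughput(io_bytes, bandwidth_bps, latency_s):
    """Bytes/s delivered when every I/O pays a fixed latency plus transfer time."""
    time_per_io = latency_s + io_bytes / bandwidth_bps
    return io_bytes / time_per_io

BW = 300e6    # assumed 300 MB/s link bandwidth
LAT = 1e-5    # assumed 0.01 ms per-operation latency

small = effective_throughput(100, BW, LAT)    # ~9.7 MB/s
large = effective_throughput(8192, BW, LAT)   # ~220 MB/s
print(f"100 B I/Os: {small/1e6:.1f} MB/s, 8 KB I/Os: {large/1e6:.1f} MB/s")
```

The fixed latency dominates the small I/O's cost, which is exactly why the per-request numbers diverge so sharply.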


"100bytes I/O will always be less efficient the 8K I/Os"

Not if only 200 bytes of that 8K are relevant. It depends on the RPC overhead, but the small I/Os may be more efficient, which is why you see them researching changes to the networking stack.

"300Mb/s and 0.01ms latency the throughput would be 10M/s vs 220M/s."

When you write numbers on a napkin, this is true. When you're talking about real systems that are both concurrent and scheduled in quantized time slices, you'll see much more complex behavior.


Note, this is from a database-literature perspective. The server doesn't need a complex storage-management layer like most RDBMSs have. If the read/write latency is low enough, all it needs is object framing, and clients can pick their own more complex data models atop this.


"Umm.. I've worked with in memory data structures (who hasn't?) and yeah, layout DOES matter. Especially if your data structure is larger than a cache line."

That only matters relative to the speedups gained by different memory layouts.

But compared to hard disks or solid-state disks (which is the whole point here), the difference is insignificant.
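To put rough numbers on that claim (these latencies are ballpark assumptions, not measurements): a spinning-disk seek is on the order of 10 ms while a DRAM access is on the order of 100 ns, so even a careless layout that costs 10x more memory time is still orders of magnitude ahead of disk:

```python
# Ballpark latencies (assumptions, not measurements):
disk_seek_ns = 10_000_000            # ~10 ms spinning-disk seek
dram_access_ns = 100                 # ~100 ns DRAM access
bad_layout_ns = 10 * dram_access_ns  # suppose a careless layout is 10x slower

print(disk_seek_ns / dram_access_ns)  # DRAM vs disk: 100000.0x
print(disk_seek_ns / bad_layout_ns)   # even the bad layout is 10000.0x faster than disk
```

Which supports the parent's point about disks, while the sibling thread's point stands too: that 10x gap still matters once disk is out of the picture.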


It just hits a nerve to hear "It's 1000x faster already! Data layout doesn't matter!" when you're thinking, "Well, it's also 1000x more expensive, and it could have been 50,000x faster if you had put a bit of thought into your data layout."

I'm a crusty old console game dev. The explanation I give the new kids is: "Remember the PlayStation2? It ran at 300MHz and had a memory latency of 50 cycles. But, the PS3 is faster, right? It runs at 3000MHz and has a memory latency of 500 cycles. You know what happens when you don't think about memory layout? The PS3 runs at the same speed as the PS2!"
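The PS2/PS3 comparison checks out as simple arithmetic: with 10x the clock and 10x the miss penalty, a cache miss costs the same wall-clock time on both machines, so fully miss-bound code sees no speedup at all (numbers taken from the comment above):

```python
def miss_time_ns(clock_hz, miss_cycles):
    """Wall-clock cost of one cache miss, in nanoseconds."""
    return miss_cycles / clock_hz * 1e9

ps2_ns = miss_time_ns(300e6, 50)     # PS2: 300 MHz clock, 50-cycle miss
ps3_ns = miss_time_ns(3000e6, 500)   # PS3: 3000 MHz clock, 500-cycle miss
print(ps2_ns, ps3_ns)                # both ~167 ns per miss
```

A 10x faster machine that misses the cache on every access runs exactly as fast as the old one.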



Is "DRAM" the new name for "RAM"?

Anybody know why the change in terminology? In which circles is this standard?


"RAM" is a pretty generic term, so I guess in circles where precision is desired, DRAM is preferable. You could claim to "store everything in RAM" and still be talking about a conventional disk, for example.

It's probably worth making the distinction here, considering they're claiming "maximum performance", implying the fastest hardware available, which isn't the case: they'd at least be using SRAM if it were.


I think in this case they're using it to explicitly call out that the RAM being used is volatile. If you first mentioned this theory to someone they'd probably assume you're talking about using NVRAM since it would survive a power failure.


They're probably trying to distinguish it from flash "memory" (which is usually accessed using a storage interface).


> Is "DRAM" the new name for "RAM"? Anybody know why the change in terminology? In which circles is this standard?

Mainstream RAM has always been DRAM. Spelling out the "D" was probably meant to differentiate it from other types of random-access memory, like NVRAM.


The term DRAM is used to differentiate it from SRAM, which is faster but less dense.


The "D" just stands for "Dynamic". DRAM & RAM are pretty much interchangeable.


One of the driving goals of putting everything in DRAM was "low latency." They want the latency of retrieving a small (100-byte) object to be as low as a few microseconds, even across a network.


I'm not exactly sure what's being invented here that doesn't exist already.


1.) Reading the position paper should help you with that. It cites some prior literature. In particular, the DeWitt paper should be a good jumping-off point on CiteSeer or the like. For that matter, reading DeWitt's papers should cover a great deal of academia's exploration of novel database architectures.

2.) It's a position paper. It's not attempting to assert a novel invention from day one, it's staking a claim about the design space (and further that you should fund us to do research in this space).

3.) The project is ongoing, you can see some preliminary results on their wiki. Most notable are some details on very low latency RPC and very rapid recovery. The recovery work rediscovers the same essential prescription as bigtable but does so via a second sharding scheme, which I believe is novel. Whether it's better is open to debate.


Sorry if I seem cranky, but I'm damn tired of people doing drive by criticisms of "ALREADY BEEN DONE BRO" on hackernews, particularly when they clearly haven't even read the primary sources.

Novelty or originality is not the only requirement for noteworthiness. A great deal gets done confirming prior experience or making relatively modest and obvious evolutionary extensions of previously well-known work.


No worries. It was a query of clarification not trolling, thanks.


Can you cite an example of DRAM being used for persistent storage? I am not aware of any.


There are a few products that use primarily DRAM, usually backed by capacitors/batteries and flash.

ZeusRAM SAS drive: http://www.stec-inc.com/product/zeusram.php

RamSan rack-mounted: http://www.ramsan.com/products/rackmount-ram-storage/ramsan-...

Kaminario K2 SAN: http://www.kaminario.com/products/K2-Solid-State-SAN-Storage...


Further examples of combo products:

http://en.wikipedia.org/wiki/IOPS


acid-state, http://acid-state.seize.it/, which was previously known as happstack-state, and before that HAppS-State, has been doing it since 2005 or so.

acid-state currently lacks replication/multimaster support, but happstack-state has had several experimental implementations of that as well.

acid-state is Haskell-specific, but that is part of the appeal. You can directly store fancy algebraic data structures with acid-state. You are not limited to a simple combination of records, integers, and strings (for example).



SATA2 DDR2 HyperDrive5 64GB: A Solid State DDR drive.

http://www.hyperossystems.co.uk/07042003/hardware.htm


Something like http://www.anandtech.com/show/1742 seems similar.


Did that thing ever actually ship?


Yep, you could buy it new up until a couple years ago.

http://www.amazon.com/Gigabyte-GC-RAMDISK-i-RAM-Hard-Drive/d...


Yes, I have one working OK in a ZFS based server.



Am I mistaken, or is this what Violin Memory, FusionIO, and a third company are already doing?


That's flash, accessed through a block device interface with 10-100 us latency. RAMCloud is using DRAM with <10 us latency.


It would be cool if they could put everything in SRAM.


At the large sizes they're using, SRAM would be slower than DRAM because of the longer wire delays that its lower storage density causes.


redis?


Nope


This sounds a lot like SAP's in-memory database, HANA:

"SAP HANA is an integrated database and calculation layer that allows the processing of massive quantities of real-time data in main memory to provide immediate results from analyses and transactions."

http://www.forbes.com/sites/sap/2011/10/04/why-sap-hana-is-a...



