
The index type for CSR need only index nodes. It's the offsets (one per row) that need to index edges.
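
A minimal sketch of that layout (NumPy, with illustrative names and sizes, not taken from any particular library):

    import numpy as np

    num_nodes = 1_000_000
    num_edges = 50_000_000

    # One entry per node (+1). Each value is a position in the edge array,
    # so it must be wide enough to count edges.
    offsets = np.zeros(num_nodes + 1, dtype=np.uint64)

    # One entry per edge. Each value names a node, so 32 bits are enough
    # as long as there are fewer than ~4.3 billion nodes.
    targets = np.zeros(num_edges, dtype=np.uint32)

    # Out-neighbors of node v: targets[offsets[v]:offsets[v+1]]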

The blog's portrayal of NVMe vs distributed memory is a bit off:

> With modern NVMe drives random seeks aren’t slow anymore, much faster than distributed network calls like you do when scaling the linked list-based graph.

Distributed memory machines have 1 microsecond off-node latency and 15-20 microsecond reductions (with thousands of nodes). These latencies have been pretty flat for the past 20 years in supercomputing, and have been widely available at cloud providers for the past few years. NVMe is not as fast as it's made out to be. https://twitter.com/SwiftOnSecurity/status/99079794835421184...

The other issue is that NVMe is more expensive than DRAM or HBM when you care about bandwidth (and graph/network analysis is almost always memory bandwidth limited, even on HBM). Suppose you need to do 1000 traversals of a 1 TB data structure. This takes about 10 seconds with 100 HBM devices ($1 at on-demand pricing, assuming amortized startup), but days of saturating NVMe bandwidth.
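
Back-of-envelope version of that comparison, assuming ~1 TB/s per HBM device and a single ~3.2 GB/s NVMe drive (rough assumptions, not measurements):

    total_bytes = 1000 * 1e12               # 1000 traversals of a 1 TB structure
    hbm_bw = 100 * 1e12                     # 100 HBM devices at ~1 TB/s each
    nvme_bw = 3.2e9                         # one fast NVMe drive

    print(total_bytes / hbm_bw)             # ~10 seconds
    print(total_bytes / nvme_bw / 86400)    # ~3.6 days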



Thanks for the correction. I was aware of the speed of distributed cluster reads, but I might be overestimating the speed of NVMe random seeks.

> The other issue is that NVMe is more expensive than DRAM

How are you calculating that?

When I upgraded my home server the other week, 16GB of DDR3-1600 cost nearly as much as 1TB of decently fast NVMe (HP SX920).

Do you still think an on-disk mmap'ed CSR graph library would be useful for medium-scale use cases given modern SSDs? Or do you think such latencies kill any use?
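
Concretely, I'm picturing something roughly like this (hypothetical file names, numpy.memmap, assuming the offset and target arrays were already written to disk):

    import numpy as np

    # The OS pages these in from the SSD on demand.
    offsets = np.memmap("graph.offsets", dtype=np.uint64, mode="r")
    targets = np.memmap("graph.targets", dtype=np.uint32, mode="r")

    def neighbors(v):
        # A random access here may fault a page in from disk.
        return targets[offsets[v]:offsets[v + 1]]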


Unless your data is at rest, you're paying for bandwidth, not capacity. SRAM > HBM > DRAM > NVMe when you're buying bandwidth. Seriously, given a low-latency network, the most cost-effective way to run some simulations and graph algorithms is out of L3 cache on EPYC (where mid-tier parts have over 1 TB/s at a fraction of the energy and cost of an A100/MI100).

Your NVMe quotes 3.2 GB/s, versus 25 GB/s (theoretical) for a stick of DDR4-3200 (which is in the same price range). A standard 2-socket EPYC server can give you 300 GB/s from DRAM, but you'd need several servers packed with NVMe to provide that bandwidth.
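
Rough math behind that, using the same per-device numbers (assumptions, not vendor specs):

    dram_bw = 300e9             # ~300 GB/s, 2-socket EPYC server
    nvme_bw = 3.2e9             # ~3.2 GB/s per drive
    print(dram_bw / nvme_bw)    # ~94 NVMe drives to match one server's DRAM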

The rationale for persistent storage (like NVMe) as an algorithmic device is either (a) you need it to be persistent or (b) you have physical or algorithmic constraints that prevent you from using more parallelism and you're willing to pay 5-10x for the privilege of executing from NVMe.


When you say "you're paying for XXX", do you mean cloud prices (in terms of $ to complete a similar job)?

I was referring to running it on your own metal, but your argument might hold there.

Obviously from a cloud provider I'd just take a massive pile of RAM rather than NVMe drives.


Cloud pricing is competitive and thus a good proxy for relative on-prem costs if you keep the on-prem hardware busy. If you want to buy the capability to run the job on-prem, but the hardware sits idle most of the time, then NVMe might be optimal. However, you should be aware that those constraints greatly increase the cost of the job itself.


Yes, I had something mixed up here - you are completely right about CSR matrices; they should not use more memory than an adjacency list.



