
Yes, you waste a lot of memory. Memory's cheap. If you need to, you can do the computation in blocks. It's pretty rare to actually run out of memory, but blocking is still useful for keeping the working set at a lower cache level.

Scalar languages have a different problem: the default is to process one value at a time, which wastes the potential parallelism that array languages take advantage of with SIMD algorithms. Because this is the status quo, it's harder to recognize as a big concern. The solution there is also blocking: the best algorithm for SIMD-friendly stuff is generally a blocked array method.
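To make "blocking" concrete, here is a rough sketch in Python (not BQN, and not how any particular array-language implementation does it). The function names, the toy computation (sum of squares of x + 1), and the block size of 1024 are all made up for illustration. The point is only that the blocked version runs the same whole-array steps, but on one slice at a time, so the intermediates stay small.

```python
# Hypothetical toy computation: sum of (x + 1)^2 over a list.

def whole_array(xs):
    # Array style: every step materializes a full-size intermediate.
    shifted = [x + 1 for x in xs]
    squared = [y * y for y in shifted]
    return sum(squared)

def blocked(xs, block_size=1024):
    # Same array-style steps, applied to one block at a time, so no
    # intermediate ever holds more than block_size elements.
    total = 0
    for start in range(0, len(xs), block_size):
        block = xs[start:start + block_size]
        shifted = [x + 1 for x in block]
        squared = [y * y for y in shifted]
        total += sum(squared)
    return total

data = list(range(10_000))
assert whole_array(data) == blocked(data)
```

Pure Python won't show any cache or SIMD benefit, of course; in a real implementation each per-block step would be a vectorized primitive, and the block size would be picked so the intermediates fit in cache.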

In practice, whether an array language is good or not really depends on the specific problem. Of course, for most practical uses performance doesn't matter at all; I believe k's reputation mostly comes from kdb being fast as a database rather than k implementations being fast languages. But array languages can be surprisingly fast just by focusing on elegant array algorithms rather than machine-specific considerations. I have comments and benchmarks comparing with C here:

https://mlochbaum.github.io/BQN/implementation/versusc.html




Memory is cheap, but memory bandwidth isn’t.

Languages that can stay in L1 cache for the duration of a computation will run circles around a language that explicitly computes and stores all intermediate values in full.

Also, array-based languages can easily hit the wall of system memory capacity whereas traditional code tends to be streaming and can handle unbounded input lengths.
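As a rough illustration of the streaming style meant here, a Python sketch (the function names and the toy input are hypothetical): the computation consumes one value at a time and keeps only a running total, so memory use does not grow with input length.

```python
def numbers_from(lines):
    # Lazily parse one number per non-empty line. `lines` can be any
    # iterable of strings, e.g. an open file, so input is never held in full.
    for line in lines:
        line = line.strip()
        if line:
            yield float(line)

def running_sum_of_squares(values):
    # Only the accumulator is stored; no intermediate array exists.
    total = 0.0
    for v in values:
        total += v * v
    return total

result = running_sum_of_squares(numbers_from(["1", "2", "", "3"]))
assert result == 14.0
```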


Which is exactly why I said you block the computation to stay at a low cache level. With SIMD loads and stores I don't think this matters quite as much as you suggest, even without blocking. It's pretty much only arithmetic that can saturate L1. I timed the BQN compiler on source files of various sizes (some old version of itself, repeated). For 18KB it runs at 21.4MB/s; for 1.7MB, 16.5MB/s; for 17MB, 12.0MB/s. So even when the source won't fit in L3 (mine's 8MB) the degradation is under a factor of 2. And of course the compiler makes no consideration of cache; who writes a megabyte of BQN?
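Spelling out the arithmetic behind "under a factor of 2" from the three timings quoted above (a quick Python check; the variable names are just for illustration):

```python
# Throughput figures as quoted in the comment, in MB/s, keyed by source size.
throughput_mb_s = {"18K": 21.4, "1.7M": 16.5, "17M": 12.0}

baseline = throughput_mb_s["18K"]
# Slowdown of each case relative to the smallest (cache-resident) input.
slowdown = {size: baseline / rate for size, rate in throughput_mb_s.items()}

for size, factor in slowdown.items():
    print(f"{size}: {factor:.2f}x")
# Prints:
# 18K: 1.00x
# 1.7M: 1.30x
# 17M: 1.78x
```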



