I agree with his rant about "CS departments' algorithmic analysis" failing to cover these real-world hardware issues. That was certainly true of my ~1997 undergrad CS course at the University of St Andrews, which was otherwise exemplary.
The only thing I don't understand is why his data structure for finding the least-recently-used item wasn't just a linked list - where each time you touch an object you move it to one end of the list, and when you want to discard an object you remove one from the other end. That'd be o(1) and would cause, at most, one page fault per update and none for the add and remove cases (since the two ends would surely be paged "in"). And it'd be a lot easier than his fancy thing. I guess I missed something.
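A rough sketch of what I mean (illustrative names and types, nothing to do with Varnish's actual code):

```c
/* Minimal sketch of the linked-list LRU idea above: touch moves a node to
 * the MRU end, evict pops the LRU end.  Both are a handful of pointer writes. */
#include <stdio.h>
#include <stdlib.h>

struct node {
    int key;            /* stands in for a cached object */
    struct node *prev;
    struct node *next;
};

struct lru {
    struct node *head;  /* most recently used */
    struct node *tail;  /* least recently used */
};

/* Unlink a node from wherever it is in the list. */
static void unlink_node(struct lru *l, struct node *n)
{
    if (n->prev) n->prev->next = n->next; else l->head = n->next;
    if (n->next) n->next->prev = n->prev; else l->tail = n->prev;
    n->prev = n->next = NULL;
}

/* Push a node onto the MRU end. */
static void push_front(struct lru *l, struct node *n)
{
    n->prev = NULL;
    n->next = l->head;
    if (l->head) l->head->prev = n; else l->tail = n;
    l->head = n;
}

/* "Touch": move an existing node to the MRU end. */
static void touch(struct lru *l, struct node *n)
{
    unlink_node(l, n);
    push_front(l, n);
}

/* Evict the least-recently-used node; caller frees it. */
static struct node *evict(struct lru *l)
{
    struct node *victim = l->tail;
    if (victim) unlink_node(l, victim);
    return victim;
}

int main(void)
{
    struct lru l = { NULL, NULL };
    struct node *a = calloc(1, sizeof *a); a->key = 1;
    struct node *b = calloc(1, sizeof *b); b->key = 2;
    push_front(&l, a);
    push_front(&l, b);
    touch(&l, a);                 /* a is now most recently used */
    struct node *v = evict(&l);   /* evicts b, the least recently used */
    printf("evicted key %d\n", v->key);
    free(v); free(a);
    return 0;
}
```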
It might be worth pointing out that NVMe or Intel Optane would solve his problem (1) without his fancy data structure too. Some might go as far as to say that while his fancy data structure is outdated now, the stuff that the CS departments teach has become correct again. Maybe St Andrews had the right idea all along.
1) Or huge-pages or NUMA node pinning, to some extent.
> I agree with his rant about "CS departments' algorithmic analysis" failing to cover these real-world hardware issues. That was certainly true of my ~1997 undergrad CS course at the University of St Andrews, which was otherwise exemplary.
I counter with similarly anecdotal evidence from 2012. :)
While we had a normal "Algorithms and Data Structures" course that taught the classical theory, there was also a separate course on "Algorithm Engineering" which was (among other things) about performance in the presence of the memory hierarchy. The upshot is that there are different cost models you have to use in the analysis of your algorithms (e.g., I/O complexity), but the techniques you use for the analysis are the same.
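For example, in the external-memory (I/O) model you count transfers of blocks of size B to and from a fast memory of size M instead of unit-cost operations, and the standard bounds (Aggarwal–Vitter, quoted from memory) come out as:

```latex
% Basic primitives in the external-memory model:
% N items, block size B, fast-memory size M; cost = number of block I/Os
\mathrm{scan}(N)   = \Theta\!\left(\tfrac{N}{B}\right),\qquad
\mathrm{sort}(N)   = \Theta\!\left(\tfrac{N}{B}\log_{M/B}\tfrac{N}{B}\right),\qquad
\mathrm{search}(N) = \Theta\!\left(\log_{B} N\right)
```

The analysis techniques (recurrences, lower-bound arguments, etc.) are exactly the ones from the classical course; only the cost measure changes.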
---
To be honest, I think the system is working as intended here. Undergraduate courses are not really designed to push people to the forefront of current research. They're intended to give you the basic skills you need to catch up to modern developments yourself.
I think what's happening here is that people remember "optimality proofs" from their algorithms courses and take that to mean that there is nothing more to be done (that's the sentiment I got from this blog post anyway). The problem with this is that in this context "optimality" referred to a technical definition - optimal with respect to some specific simplified model of computation. The result is probably not optimal with respect to a more complicated model!
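To make that concrete with an example close to the blog post (standard textbook bounds, not anything taken from the post itself): searching N keys in a pointer-based balanced BST is optimal in the comparison model, but once you count page/block transfers it no longer is:

```latex
% Searching N keys; B = keys that fit in one page/block
\text{balanced BST (pointer-based):}\ \Theta(\log_2 N)\ \text{comparisons},\ O(\log_2 N)\ \text{page transfers}\\
\text{B-tree with fanout }\Theta(B):\ \Theta(\log_2 N)\ \text{comparisons},\ \Theta(\log_B N)\ \text{page transfers}
```

Same comparison count, wildly different number of page faults, which is essentially the gap the article's page-aware heap layout is exploiting, if I read it right.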
> The only thing I don't understand is why his data structure for finding the least-recently-used item wasn't just a linked list
Because he is not talking about expiring the Least Recently Used item, but about expiring the items whose "cache time" has expired. That is, when you add a new item to the cache, you add it with some lifetime (e.g. an integer). Then you have some clock (another integer), and what Varnish wants to do is expire all items whose lifetime is lower than the current clock.
With a (doubly-)linked list you would have to traverse the list on insertion to place the item in the correct spot...
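...which is why the textbook answer here is a priority queue keyed on the expiry time, e.g. a binary min-heap: insertion is O(log n) instead of the sorted list's O(n), and expiring is just popping the root while its deadline is <= now. A bare-bones sketch (this is the classical structure the article starts from and then reworks for VM locality, not his actual variant):

```c
/* Rough sketch: expiry queue as an array-backed binary min-heap keyed on
 * the item's deadline.  Insert is O(log n); the expiry pass repeatedly
 * pops the root while its deadline is <= "now". */
#include <stdio.h>

#define MAXITEMS 1024

struct expq {
    long deadline[MAXITEMS + 1];  /* 1-based heap of expiry times */
    int  n;
};

static void swap(long *a, long *b) { long t = *a; *a = *b; *b = t; }

static void expq_insert(struct expq *q, long deadline)
{
    int i = ++q->n;
    q->deadline[i] = deadline;
    /* sift up: O(log n) */
    while (i > 1 && q->deadline[i] < q->deadline[i / 2]) {
        swap(&q->deadline[i], &q->deadline[i / 2]);
        i /= 2;
    }
}

/* Pop the earliest deadline (caller has checked q->n > 0). */
static long expq_pop(struct expq *q)
{
    long min = q->deadline[1];
    q->deadline[1] = q->deadline[q->n--];
    /* sift down: O(log n) */
    int i = 1;
    for (;;) {
        int child = 2 * i;
        if (child > q->n) break;
        if (child < q->n && q->deadline[child + 1] < q->deadline[child])
            child++;
        if (q->deadline[i] <= q->deadline[child]) break;
        swap(&q->deadline[i], &q->deadline[child]);
        i = child;
    }
    return min;
}

int main(void)
{
    struct expq q = { .n = 0 };
    expq_insert(&q, 300);   /* deadlines in arbitrary order */
    expq_insert(&q, 100);
    expq_insert(&q, 200);

    long now = 250;
    /* Expire everything whose deadline has passed. */
    while (q.n > 0 && q.deadline[1] <= now)
        printf("expired item with deadline %ld\n", expq_pop(&q));
    return 0;
}
```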
I wholeheartedly agree that many—if not most—algorithms classes don’t sufficiently talk about the limitations of the models used and how to handle them. I’m a big fan of Algorithm Engineering, which is best imagined as a cycle of design, analysis, implementation and experiments, with each stage influencing the next. This has led to some exceptional results, e.g. the route planning algorithms used by Google/Bing/Apple Maps, which are around 5 orders of magnitude faster than Dijkstra’s algorithm thanks to clever pre-processing that exploits the structure of road networks (their asymptotic query complexity on general graphs is still the same as Dijkstra’s, iirc).
Nitpick, but your list-based LRU cache’s operations wouldn’t be in o(1) but rather O(1) – little o means strictly slower growth, in this case sub-constant. Probably just a typo though. And it could be further optimised for locality by using a cyclic buffer instead of a linked list (-> no pointer chasing + sequential access), but see kilburn’s answer (the structure in the paper solves a different problem).
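For reference, the definitions I’m using:

```latex
% f \in O(g): f grows no faster than g, up to a constant factor
f \in O(g) \iff \exists\, c > 0,\ n_0:\ f(n) \le c \cdot g(n)\ \ \text{for all } n \ge n_0
% f \in o(g): f grows strictly slower than g
f \in o(g) \iff \lim_{n \to \infty} \frac{f(n)}{g(n)} = 0
```

So o(1) would mean a cost that tends to zero, which no real operation can have, while O(1) just means bounded by a constant.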
> I agree with his rant about "CS departments' algorithmic analysis" failing to cover these real-world hardware issues.
I disagree. CS algorithms is about the logical/mathematical understanding of algorithms. If you want to understand hardware implications, you are generally taught that in a computer architecture class. Blaming the algorithms class for the lack of consideration for real-world hardware issues is like blaming the computational theory class because the Turing machine doesn't factor in hardware issues. You abstract it out to make general logical observations.
I disagree. It should at least cover what aspects should be considered when designing an algorithm for use in the real world. You dismiss the entire field of experimental algorithmics as if it were nothing more than a bunch of algorithmicists who know some computer architecture things, but there’s a lot more to it.
I’m not arguing against teaching classical algorithm design and analysis; they’re both incredibly important imho. But we should stop ignoring the shortcomings of the models used, hand-waving them away with “the rest is just a bit of engineering”. It’s not.
There are theoretical models of random-access machines with fixed-size caches, on which data structures like B-trees and others can be (and are) analysed for asymptotic performance. It's usually not taught until late undergrad or graduate level though.
(You may have heard the terms "cache aware" and "cache oblivious" used to describe algorithms. These terms come out of theoretical CS, describing algorithms with good performance on these "RAM with cache" machines when the cache size is known or unknown ahead of time, respectively.)
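A concrete toy example of "cache aware" (a standard textbook technique, not taken from any particular paper): transposing a matrix tile by tile, with the tile size chosen to fit the cache. A cache-oblivious version would instead recurse on quadrants until the subproblem fits in whatever cache happens to be there, without ever naming a tile size.

```c
/* Cache-aware out-of-place matrix transpose done in TILE x TILE blocks,
 * so each block of the source and destination stays resident in cache
 * while it is being worked on.  TILE is a machine-dependent tuning
 * parameter (an assumption here), which is exactly what "cache aware"
 * means: the algorithm knows the cache parameters. */
#include <stdio.h>

#define N    1024
#define TILE 32      /* chosen with the cache size / line size in mind */

static double a[N][N], b[N][N];

static void transpose_tiled(void)
{
    for (int ii = 0; ii < N; ii += TILE)
        for (int jj = 0; jj < N; jj += TILE)
            /* transpose one TILE x TILE block */
            for (int i = ii; i < ii + TILE; i++)
                for (int j = jj; j < jj + TILE; j++)
                    b[j][i] = a[i][j];
}

int main(void)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            a[i][j] = i * N + j;

    transpose_tiled();
    printf("b[3][5] = %.0f (should equal a[5][3] = %.0f)\n", b[3][5], a[5][3]);
    return 0;
}
```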