
And for smart compilers, developers have to have intimate knowledge about the implementation details in order to work around infelicities, or understand exactly how much performance they're leaving on the table when the compiler does or does not do something. So what happened to all that abstraction you were clamoring for, where dumb compilers require "understanding the machine" -- when for the 4th time this week you're staring at the generated code from your "smart compiler", wondering why it's leaving performance on the table? I still write assembly, and it isn't because I'm working on a Z80, or for a lack of "very smart compilers" -- I assure you. Sometimes the compiler just can't do what I want; other times it's being too clever for its own good and getting in my way. Most of the time it does a perfectly good job.

The abstraction is fundamentally leaky. Speaking as someone who worked on a compiler for years: cost models are important, and benchmarks and perceptions are very important -- but they are often non-intuitive and not totally "free", as TINSTAAFL says.

That said, optimizing something like a self-pow call to a square is probably not overreaching as an optimization. But there is certainly something to be said about overdoing it, and before you know it you've kicked the can down to <person in the hallway who's really good at optimizing on platform X>, because they had to do it anyway, because only they know why the JVM or whatever behaves that way on a Friday with 17 inner class methods or something, and you're really sure it can be faster but you've got a deadline in like a week and holy shit you have 3 other things to do.

As an aside, I think if any compiler gets the "most sane cost model award", it would probably be Chez Scheme, since it is both unbelievably fast and based entirely around a practical metric: basically, no optimization is added if it cannot speed up the compiler itself. Performance should be predictable both in terms of runtime and compile time. Since the compiler is a rather general-purpose application -- if an optimization isn't worth it there, it's probably too overly specific, and likely isn't carrying its own weight. After 30 years of following this rule, Chez is very, very fast.



> developers have to have intimate knowledge about the implementation details in order to work around infelicities

If you presume this, then we would still need that knowledge for simple compilers too, and for the hardware as well.

Regardless, I disagree with this. Even though the abstractions are leaky, not everything is leaked. For example: algorithms with tight loops over data that fits in cache will do better than those that loop over massive sets. That leaks the cache size and little else.

You claim to develop compilers, so maybe I am out of my depth. When I read blog posts or watch videos, like those from the latest CppCon, they all boil down to the simple advice "write idiomatic code and don't try to be more clever than the problem requires". Then, when that performance isn't good enough, profile and optimize. This is working well for me in 3D games. When I see fit to look at my assembly, all of it makes good direct sense, and when it doesn't, it is extremely fast.

So looking at the previous example of leaking the cache size: a slightly smart compiler might insert extra instructions to tell some CPUs how to prefetch whatever is next. A very smart compiler will detect the data dependencies and rework the loop to operate on a data set that fits in cache (maybe even querying the size of the cache). The alternative is for the dev to know the cache size at the time the algorithm is written and take it into account there. This is a simple example, but it shows the problem: why should the dev care about the cache size when it can be automatically accounted for?

Are there some more nuanced examples where the simple pattern of write idiomatically then profile fails?

I happen to have one of those videos[1] running in the background right now, about abstractions that cost nothing. I will start it over and pay more attention with your added details in mind. I will also look into Chez Scheme; this is the first I have ever heard of it.

[1] CppCon 2016: Serge Guelton "C++ Costless Abstractions: the compiler view" - https://www.youtube.com/watch?v=q0N9Tvf7Bz0&index=108&list=P...


Idiomatic coding is ultimately premised on "following the grain of the compiler", with the language as a front-end abstraction to the compiler's various technologies (type checks, optimizations, etc.) - it can be seen as a way of minimizing bottlenecks that would occur when doing something that the compiler can't optimize well or the language can't express well. And for a lot of application code, that's sufficient. You write simple constructs, they get optimized away, everyone's happy.

However, for a compiler writer, it isn't sufficient. They're in the business of being the bottleneck! As such, they quickly enter a space where they have to set down their own principles for what idiomatic code looks like and how much they allow abstractions to leak upwards, negotiating those principles against the ecosystem of real-world code.

A consequence of this is that a compiler that only has to operate on a single codebase has it easy, because all its optimizations can be customized to the problem domain and target devices, while more generalized compilers have a dynamic optimization problem, where some features help some of the time, and not others. And their clients are all going to be programmers who get themselves into big trouble and then want the compiler to solve the problem.


The Chez Scheme rule is also used by Niklaus Wirth with respect to the actual language design, and it shows in the evolution from Pascal to Modula-2 and Oberon. All have extremely fast compilers, although some implementations expend more effort on optimization than others.



