Hacker News

So the whole world should take a 1-2% performance penalty on everything so some users can maybe run a profiler?

Wouldn't it make more sense to just have an 'apt reinstall all --with-frame-pointers' command that power users could run before they wanted to profile something?




I don't know where 1-2% comes from, but for many large-scale production workloads I studied it was so close to 0% that it was tough to measure beyond noise on the cloud. That's not to say that 1-2% is wrong, but that it's likely someone's workload and other people see less.

Helping people find ~30-3000% perf wins, and helping with debugging and automated bug reports, is huge. For some sites it may be like 300 steps forward, one step back. But it's also not the end of the road here. Might we go back to frame pointer omission by default one day, if some other emerging stack walkers work well in the future for all use cases? It's a lot of ifs and many years away, and assumes a lot of engineering work continues to be invested for recovering a gain that's usually less than 1%, but anything's possible.
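For context, frame pointer omission is controlled by a compiler flag. A quick way to see the difference in generated code (a sketch, assuming gcc on x86-64; a leaf function keeps its frame pointer under -fno-omit-frame-pointer unless -momit-leaf-frame-pointer is also given):

```shell
# Compare prologues with and without frame pointers.
cat > demo.c <<'EOF'
int add(int a, int b) { return a + b; }
EOF
gcc -O2 -S -fno-omit-frame-pointer -o with_fp.s demo.c  # keeps push %rbp / mov %rsp,%rbp
gcc -O2 -S -fomit-frame-pointer   -o no_fp.s   demo.c  # %rbp freed as a general register
grep rbp with_fp.s || true   # frame pointer setup present
grep rbp no_fp.s   || true   # no mention of %rbp at all
```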

There are a couple of problems with an apt reinstall. One is that people often don't work on performance until the system is melting down -- many times I've been handed an issue where apt is dreadfully slow due to the system's performance issue and just installing a single package can take several minutes -- imagine reinstalling everything; it could turn the outage into over an hour! The other is that I'd worry reinstalling everything introduces so many changes (updating library versions) that the problem could change and you'd have no idea which package update changed it. If there were such an apt reinstall command, I know of large sites (with experience with frame pointer overheads) that would run it and then build their BaseAMI so that it was the default. Which is what Ubuntu is doing anyway.


Even 0.1% scaled across all users is a huge amount of wasted energy. You can profile without subjecting everyone to frame pointers.


Not nearly as much as the missed optimizations from not having this easily available.


[citation needed]

That's just the same handwavey reason as given in the article. Where is the evidence that this will actually result in any significant amount of optimization that wouldn't be possible without making everything (slightly) slower for end users?


> Our analysis suggests that the penalty on 64-bit architectures is between 1-2% in most cases.

Right from the article. I find it a difficult subject: as a developer/power user I am happy to see frame pointers, but I can't speak for others.


Many systems take various kinds of performance hits in return for things all the time: reliability, observability, safety, etc. Many systems can even be run at higher peak throughput in return for various instabilities. Performance is not actually a uniform number across the system. You're looking at an aggregate, but changes like this can make it much, much more practical to diagnose specific performance issues for users in specific scenarios, which may have extremely large impacts far beyond 1-2%. That's very important in practice, especially when users can often feel those outliers, e.g. why does this application enter a spinning state and suddenly burn CPU for a minute before returning to normal?

> Wouldn't it make more sense to just have an 'apt reinstall all --with-frame-pointers' command that power users could run before they wanted to profile something?

I don't see why it makes any more sense than just changing the default that the distribution uses. For one, it's way more work: maintaining another copy of everything for a ~1% performance difference is not an obviously good tradeoff for the distro teams to make. Not to mention it often isn't possible in the cases people want it, i.e. they want to continuously profile an existing production system that they can't just run apt on willy-nilly.


I've seen how instantly-available profiles affect the engineering culture in practice and it's transformative. The difference between "yeah, strange, I'll deploy an fp build some time later and check... maybe" and "see this thing right here on the flamegraph" is huge and often repays the initial 1% slowdown 5-15x over.
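The "instantly available" workflow being described is roughly this (a sketch, assuming Linux perf and the FlameGraph scripts are on hand; `cpu.svg` is just an example output name):

```shell
# Sample all CPUs at 99 Hz for 30 seconds, walking stacks via frame pointers,
# then render the samples as a flame graph.
perf record -F 99 -a --call-graph fp -- sleep 30
perf script | ./stackcollapse-perf.pl | ./flamegraph.pl > cpu.svg
```

With frame pointers in every library on the box, that one command works immediately; without them, the stacks come back broken until you rebuild or switch unwinding methods.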


This x1000

This "1%" loss will never manifest. It will be pure gain.


Single digit percentages are noise in moore-units. Layout has a bigger effect. So many "optimizations" in our tech culture are around removing the headlights and brakes so that the car can go slightly faster in the dark, on hills.


Good analogy.

Stack traces and good performance profiles are table stakes to even starting to make good software imo


FTA

> I’ve enabled frame pointers at huge scale for Java and glibc and studied the CPU overhead for this change, which is typically less than 1% and usually so close to zero that it is hard to measure.


This is very hyperbolic and inaccurate.

In 2023 it's not a 1-2% performance penalty anymore, and certainly not for most use cases. Only if reserving one of the 16 general-purpose registers for the frame pointer is critical for performance on an x86_64 CPU.

Certain workloads might suffer more, but most will certainly suffer less than a 1-2% hit.


Using any of the higher 8 registers (r8-r15) on x86_64 requires a REX prefix byte, which can make your instruction 1 byte longer. There is still a small reward for avoiding r8-r15.


very allegorical


You are prematurely optimizing.

"can make use of this improved debugging information to diagnose and target performance issues that are orders of magnitude more impactful than the 1-2% upfront cost."

Also, can't you get reliable stack dumps when something goes wrong too?


You can get reliable stack dumps without frame pointers.
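For what it's worth, perf does support frame-pointer-free unwinding by copying a stack snapshot per sample and post-processing it with DWARF info, at the cost of higher overhead and much larger data files (a sketch, assuming Linux perf; `./myapp` is a hypothetical target binary):

```shell
# DWARF-based unwinding: no frame pointers needed, but each sample copies
# a chunk of user stack (8 KB by default), so overhead and perf.data size grow.
perf record -F 99 --call-graph dwarf -- ./myapp
perf report
```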

Removing needless instruction and register pressure that 99.99% of users don't rely on in any way, and that the rest wouldn't need if they fixed their tools, is not premature optimization but simple common sense. Which is why it's on by default in the first place.

Calling 1% or even 0.1% optimizations that apply across the board "premature optimizing" is a great example of the culture of wastefulness that has made computers less responsive even though hardware has gotten a million times faster. These things do add up.


They already take performance hits left and right with all those containers running all over the place, parsing JSON.


>So the whole world should take a 1-2% performance penalty on everything so some users can maybe run a profiler?

If so, I'm all for it. The win from easy access to profiling can dwarf this 1-2%


That's assuming you can't profile without frame pointers which is simply not true. If some tools have issues then fix them.


No, it's only assuming that you can more easily (with less overhead) profile with frame pointers. Which is simply true.


In this day and age, I would think observability and debuggability are the qualities I'd like systems to have. The productivity gains are immeasurable.



