So something I don't understand that doesn't seem to have an explanation in the ...

sbahra · on Sept 18, 2014

1) What's most interesting to us is that all these spare cycles mean we can start doing some computationally expensive analysis as part of crash reporting, and can even do it at scale. However, this type of analysis requires a very efficient tracer. We will provide updates on the tracer in an upcoming post.

2) The speed and efficiency of backtrace generation can affect recovery times. It's far more than 50ms for a lot of server-side or embedded applications.

3) Large programs today cannot feasibly be debugged (as in, good luck generating a detailed crash report; your system will likely not have the resources), especially if they're also time-sensitive (tracing is the typical approach there). There are engineers out there who have to spend hours just to extract a small memory dump from a single thread.

4) Certain classes of bugs are best observed over time, and minimizing jitter is important (more on that later as we unveil some features of the advanced tracer and user interface).

Less intrusive real-time profiling is interesting to us, but currently only in the context of state leading to a fatal bug (this also includes bugs that involve hangs, such as an infinite loop). The technology does have applications for performance management, but this isn't something we are focusing on at the moment.

peterfirefly · on Sept 18, 2014

It matters if it takes seconds to run, especially if the UI around the crash reporting/data gathering isn't too great, which it often isn't.