
OP claims it shouldn’t be “difficult to detect (...) because the hardware is working” because most commercially sold host controller chips would generate an interrupt and report errors, unless Facebook is using something nonstandard that doesn’t.


The hardware is reporting the errors to the kernel but not crashing the system. It's "difficult to detect" because unless you are specifically monitoring for those stats, the only issue you'll see is degraded performance on an occasional machine (assuming you are watching carefully enough to even discern the performance delta). Some of the error counters are even predictive of an issue rather than something that is actively impacting performance. The FB software is basically scraping those messages and bus stats into JSON that can be consumed by their monitoring infrastructure.
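To make the "scraping into JSON" step concrete, here is a minimal sketch of the general idea, not FB's actual tool. It assumes the counters arrive as plain "name value" lines; the counter names, the device label, and the `build_report` helper are all hypothetical, chosen only for illustration.

```python
import json


def parse_counters(text):
    """Parse 'NAME VALUE' lines into a dict, skipping malformed lines."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # not a 'name value' pair
        name, value = parts
        try:
            counters[name] = int(value)
        except ValueError:
            continue  # value is not an integer counter
    return counters


def build_report(device, text):
    """Bundle the counters for one device, listing which ones are nonzero."""
    counters = parse_counters(text)
    return {
        "device": device,
        "counters": counters,
        "nonzero": sorted(name for name, value in counters.items() if value > 0),
    }


# Hypothetical sample input; real counter names and formats depend on the hardware and driver.
SAMPLE = """\
rx_errors 3
crc_errors 0
replay_timeouts 1
"""

report = build_report("hypothetical-device-0", SAMPLE)
print(json.dumps(report, sort_keys=True))
```

The point of the "nonzero" list is that monitoring can alert on any counter that has moved, even if the machine is otherwise still running and nothing has crashed.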





