Out of curiosity, is BPF now capable of capturing all context-switch events, such as CPU traps?
Also, if the overhead is negligible, maybe the author could try to get this merged into mainline, using a static key to make the incurred overhead switchable. Even with the static key, though, the extent of the accompanying interference with the cache and branch predictor might be an intriguing topic.
Edit: Perhaps an alternative approach would be to attach probes to the relevant (precise) PMU events. There's also this prototype that adds breakpoint/watchpoint support to eBPF [1]. But actually doing anything within this context may get complicated very fast, so it would need to be severely limited, if feasible at all.