Once you've identified long response times, you could also check whether the threads handling requests spent a large share of their time waiting for a CPU, which would point to CPU saturation as the problem. I'm not sure how you do this on GNU/Linux, but on illumos you can assess it with ptime(1) or prstat(1M).
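On Linux, one candidate is the per-thread `schedstat` file under `/proc`, which (when the kernel was built with scheduler statistics support) reports nanoseconds spent running, nanoseconds spent waiting on a run queue, and the number of timeslices. The following is a rough sketch rather than a tested tool; the function names are made up for illustration, and it assumes that three-field layout.

```python
from pathlib import Path


def parse_schedstat(text):
    """Parse 'run_ns wait_ns timeslices' as found in a schedstat file."""
    run_ns, wait_ns, timeslices = (int(field) for field in text.split()[:3])
    return run_ns, wait_ns, timeslices


def wait_fraction(before, after):
    """Share of (running + waiting) time spent waiting for a CPU
    between two samples of the same thread."""
    run = after[0] - before[0]
    wait = after[1] - before[1]
    total = run + wait
    return wait / total if total else 0.0


def read_thread_schedstats(pid):
    """Return {tid: (run_ns, wait_ns, timeslices)} for each thread of pid.

    Assumes Linux with /proc/<pid>/task/<tid>/schedstat available;
    threads that exit mid-scan or cannot be read are skipped.
    """
    stats = {}
    for task in Path(f"/proc/{pid}/task").iterdir():
        try:
            text = (task / "schedstat").read_text()
            stats[int(task.name)] = parse_schedstat(text)
        except (OSError, ValueError):
            continue
    return stats
```

Taking two readings of `read_thread_schedstats(pid)` a second or so apart and passing each thread's pair to `wait_fraction` gives a per-thread number; a consistently high fraction on the request-handling threads would be the saturation signal described above.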

I think your concern about measuring CPU utilization is real, though. You can deal with it by sampling frequently and presenting the samples on a heat map. There are some examples here: http://www.brendangregg.com/HeatMaps/utilization.html
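To make the idea concrete, here is a minimal sketch of the binning step: each column is a time window, each row is a utilization range, and each cell counts how many samples landed there. It is not the code behind the linked examples; the function names, bucket count, and text rendering are assumptions chosen for illustration, and the samples are taken to be percentages from 0 to 100 gathered however you like.

```python
def utilization_heatmap(samples, samples_per_column, buckets=10):
    """Bin utilization samples (0-100) into a grid of counts.

    grid[row][col]: row 0 is the lowest utilization bucket,
    col is the time window the sample fell into.
    """
    n_cols = (len(samples) + samples_per_column - 1) // samples_per_column
    grid = [[0] * n_cols for _ in range(buckets)]
    for i, util in enumerate(samples):
        col = i // samples_per_column
        # Clamp so that 100% (or slightly out-of-range input) stays in range.
        row = max(0, min(int(util * buckets / 100), buckets - 1))
        grid[row][col] += 1
    return grid


def render(grid):
    """Draw the grid as text, highest utilization on the top line."""
    shades = " .:*#"
    peak = max((count for row in grid for count in row), default=0) or 1
    top = len(shades) - 1
    lines = []
    for row in reversed(grid):
        # Ceiling division: any non-zero count gets a visible shade.
        lines.append("".join(shades[(count * top + peak - 1) // peak]
                             for count in row))
    return "\n".join(lines)
```

The point of counting per cell instead of averaging per window is that a window with half its samples near 100% and half near 0% shows up as two dark cells, where an average would report a harmless-looking 50%.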