I noticed this a few months ago with Perforce's p4merge (Qt-based) application and filed a Radar (28890473).
At least in that instance, the bottleneck was rgba64_image_mark_rgb32(). It's a perfectly fine little bit of assembly byte shuffling, but not fast enough to gracefully handle the billions of pixels per second being thrown at it.