Not necessarily: video output is ultimately limited by the display. If the display runs at, say, 60Hz, and drawing into an off-screen buffer plus compositing together take less than ~16ms, the result should be exactly the same as drawing directly to the front buffer.
The main problem is that modern GPU rendering is pipelined, so it's entirely possible to have a drawing operation that takes 16ms and a compositing operation that also takes 16ms, and still have your application running at 60FPS, albeit one frame "behind" the input. Most developers are not aware of that (including me, until recently; I learned about this while trying to figure out why my VR application felt "wobbly" despite running at the recommended 90FPS). The HTC Vive ships with a really neat tool for visualizing that: http://i.imgur.com/vqp01xn.png
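To make the pipelining point concrete, here's a toy model (all timings and names are illustrative assumptions, not measurements of any real GPU stack): the app draws frame N while the compositor is still handling frame N-1, so throughput stays at 60FPS even though each frame carries two full stages of latency.

```python
# Toy model of a pipelined renderer: the app draws frame N while the
# compositor composites frame N-1. Each stage takes a full 16 ms, yet a
# new frame still reaches the display every 16 ms (60 FPS); the price is
# that each frame lands one extra stage after its input was sampled.

DRAW_MS = 16       # assumed app render time per frame
COMPOSITE_MS = 16  # assumed compositor time per frame

def frame_times(n_frames):
    """Return (input_time, display_time) pairs for a 2-stage pipeline."""
    times = []
    for i in range(n_frames):
        input_time = i * 16                   # input sampled when draw starts
        draw_done = input_time + DRAW_MS      # off-screen buffer ready
        display_time = draw_done + COMPOSITE_MS
        times.append((input_time, display_time))
    return times

times = frame_times(4)
# Frames reach the display 16 ms apart -> 60 FPS throughput...
intervals = [b[1] - a[1] for a, b in zip(times, times[1:])]
# ...but each frame is shown 32 ms (two stages) after its input.
latencies = [disp - inp for inp, disp in times]
```

Plugging in a 90Hz VR headset (11.1ms per stage) gives the same picture: full frame rate, but input-to-photon latency of two frames, which is exactly the kind of thing that feels "wobbly" without showing up as dropped frames.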
This assumes you are synchronizing updates with the monitor's refresh cycle. If you aren't (and the major reason you see lag in compositors is that they do such synchronization), then composition is indeed slower, since it involves several more moving parts and the need to orchestrate the refresh of multiple windows (as opposed to the immediate "I want to draw on the screen now" model that X11/Xorg without a compositor and Windows without DWM use).
Yeah, having to synchronize multiple windows is probably a pain. I guess that's a much smaller issue for a VR application (the OpenVR compositor supports overlays, but they're not used that often, and there's a clear priority given to the "main" VR content).
I guess a valid approach would be to double-buffer all windows on the compositor side and render the "stale" buffer of any window that fails to update within a specified frame budget (16ms minus the expected compositing time); that way, at least well-behaved apps would have no noticeable latency. There would probably need to be some coordination with apps that already do their own double buffering; I'm not sure how that's currently handled. Perhaps a hybrid approach between compositing and direct rendering is also possible, where different screen regions get composited at different frame rates (this should work as long as there's no transparency involved).
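The "stale buffer on a missed deadline" idea above could be sketched roughly like this. Everything here is hypothetical (the class names, the 2ms compositing estimate, the API shape); it just shows the buffer-selection logic, not any real compositor's interface:

```python
FRAME_BUDGET_MS = 16        # one 60 Hz frame
EXPECTED_COMPOSITE_MS = 2   # assumed cost of compositing itself

class Window:
    def __init__(self, initial_frame):
        self.displayed = initial_frame  # last frame shown (the "stale" buffer)
        self.pending = None             # (new_frame, finished_at_ms) or None

    def submit(self, frame, finished_at_ms):
        """App hands the compositor a freshly drawn frame."""
        self.pending = (frame, finished_at_ms)

def compose(windows, vsync_at_ms):
    """Pick one buffer per window for the upcoming vsync."""
    deadline = vsync_at_ms - EXPECTED_COMPOSITE_MS
    frames = []
    for w in windows:
        if w.pending and w.pending[1] <= deadline:
            w.displayed = w.pending[0]  # on time: promote the new frame
            w.pending = None
        # a late or idle window simply keeps showing its stale frame
        frames.append(w.displayed)
    return frames
```

For example, with a vsync at t=16ms (deadline 14ms), a window that finished drawing at 10ms gets its new frame shown, while one that finished at 15ms keeps its stale frame this cycle and gets picked up at the next vsync; the fast app pays no extra latency for the slow one.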
Compositors already do that: you render into a compositor-managed texture, and the compositor simply uses whatever is there, so applications can update at their leisure.
... and when you give people direct access to the front buffer, they write code that tears or generally scans out incomplete renders and users end up blaming the operating system.
Compositing is a good thing, and in the vast majority of cases its latency isn't actually intrinsically higher than writing directly to the front buffer. Certainly its intrinsic latency is never higher than writing directly to the front buffer if you build a system without visual artifacts. (Because at the end of the day, all compositing does is shift around who does the job of putting things on the front buffer; the jobs themselves stay the same for all practical purposes.)
But I want the tearing, or at least I prefer it to the latency that compositors impose! This is why compositors must not be forced on users, but should instead be a user option. I don't see why I have to suffer a subpar computing experience because of some clueless users.
I even force vsync off system-wide where possible (that is, in Windows; in Linux I haven't seen such an option, and even in Windows, DWM ignores the setting).
I believe Wayland has made some latency improvements.