One of the biggest problems with OpenGL today is the lack of a good binary intermediate representation for shaders. Direct3D has a bytecode representation that is vendor neutral. OpenGL has APIs that let you cache binary shader representations (glGetProgramBinary/glProgramBinary), but those binaries are configuration specific: configuration differences can include hardware vendor, hardware version, driver version, client OS version, and so on. So in practice these formats are only useful when the shaders are compiled on the client machine. They don't actually let developers ship only compiled shaders, unless they are comfortable with their shaders failing to run on, say, future hardware.
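To make the configuration-specific caveat concrete, here is a minimal sketch of the caching path, assuming a GL 4.1 (or ARB_get_program_binary) context with function pointers already loaded; save_blob/load_blob are hypothetical file helpers, not part of any real API:

    #include <vector>
    #include <GL/glcorearb.h>  // assumes a loader (GLAD/GLEW-style) is initialized

    // After a successful link, pull the driver-specific binary out and cache it.
    // (Ideally set GL_PROGRAM_BINARY_RETRIEVABLE_HINT on the program before linking.)
    void cache_program(GLuint prog)
    {
        GLint len = 0;
        glGetProgramiv(prog, GL_PROGRAM_BINARY_LENGTH, &len);

        std::vector<char> blob(len);
        GLenum format = 0;  // chosen by the driver; must be stored with the blob
        glGetProgramBinary(prog, len, nullptr, &format, blob.data());

        save_blob("shader.bin", format, blob);  // hypothetical helper
    }

    // On a later run, try the cached blob first. If anything in the
    // configuration changed (driver, GPU, OS), the load fails and you must
    // fall back to compiling the GLSL source you had to ship anyway.
    GLuint load_cached_program()
    {
        GLenum format = 0;
        std::vector<char> blob;
        load_blob("shader.bin", &format, &blob);  // hypothetical helper

        GLuint prog = glCreateProgram();
        glProgramBinary(prog, format, blob.data(), (GLsizei)blob.size());

        GLint ok = GL_FALSE;
        glGetProgramiv(prog, GL_LINK_STATUS, &ok);
        if (!ok) { /* stale blob: recompile from GLSL source */ }
        return prog;
    }

Note that the fallback path is mandatory: the binary format is whatever the driver chose, so the GLSL source has to ship with the application regardless.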
This leads developers to pursue various less optimal solutions, all of which mean more startup time for users and less predictable performance and robustness for developers (at least compared to the solution D3D has offered for more than ten years). So when people say OpenGL is years behind D3D, this is one of the things they mean. D3D isn't perfect here either: there is a fair amount of configuration-specific recompilation going on behind the scenes. But the bytecode formats are more compact than the optimized/minified GLSL source formats people are pursuing on OpenGL, and while startup time (shader creation time) is still too long, it is much better than on OpenGL. Shader robustness is generally more predictable and better on D3D, though it's hard to disentangle shader pipeline issues from driver quality.
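For contrast, a sketch of the D3D11 side, assuming the shader was compiled offline (e.g. with fxc) to a vendor-neutral DXBC blob; read_blob is again a hypothetical helper:

    #include <d3d11.h>

    ID3D11PixelShader* create_ps_from_bytecode(ID3D11Device* dev)
    {
        void*  bytecode = nullptr;
        size_t size     = 0;
        read_blob("shader.cso", &bytecode, &size);  // hypothetical helper

        // No source and no compiler invocation at runtime; the driver
        // translates the vendor-neutral bytecode to native ISA, which is
        // where the remaining configuration-specific recompilation lives.
        ID3D11PixelShader* ps = nullptr;
        dev->CreatePixelShader(bytecode, size, nullptr, &ps);
        return ps;
    }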
To be fair, multicore is also an issue for OpenGL, but D3D isn't great at it either. The current D3D11 spec includes a multicore rendering feature called "deferred contexts," but performance scaling with that feature has been disappointing, so it isn't a clear win for D3D. Other APIs (e.g. hardware-specific console graphics APIs) expose more of the GPU command buffer, and that reduced abstraction allows a real solution to the multicore rendering problem. There should be a vendor-neutral solution here, but so far neither API has delivered one that comes close to the hardware-specific solutions in performance scaling.
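For reference, the deferred-context pattern looks roughly like this (a sketch with error handling omitted; the recorded work still gets serialized through the immediate context at submit time, which is one commonly cited reason the scaling disappoints):

    #include <d3d11.h>

    void record_on_worker_thread(ID3D11Device* dev, ID3D11CommandList** outList)
    {
        ID3D11DeviceContext* deferred = nullptr;
        dev->CreateDeferredContext(0, &deferred);

        // Record draw calls on this thread; nothing reaches the GPU yet.
        // ... deferred->IASetVertexBuffers(...), deferred->Draw(...), etc.

        deferred->FinishCommandList(FALSE, outList);
        deferred->Release();
    }

    void submit_on_main_thread(ID3D11DeviceContext* immediate, ID3D11CommandList* list)
    {
        // Playback happens on one thread through the immediate context,
        // so recording scales across cores but submission does not.
        immediate->ExecuteCommandList(list, FALSE);
        list->Release();
    }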