The article is very impressive, and the author is an experienced graphics developer, which makes me wonder: why not do these experiments using the free software graphics drivers in Mesa?
We (the team that writes the Intel driver in Mesa) would love to be involved in things like that and would be happy to answer questions where possible -- #intel-gfx on Freenode or mesa-dev@lists.freedesktop.org.
The documentation contains little, if any, performance data such as instruction latency, so we've implemented methods similar to the author's for measuring the number of cycles a piece of code takes (environment variable INTEL_DEBUG=shader_time). We'd likely even learn something about the hardware ourselves by being involved in experiments like these.
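
For anyone who wants to try it, here is a rough sketch of launching a program with that variable set. Only the INTEL_DEBUG=shader_time setting comes from the comment above; the helper names, the comma-separated merging of existing flags, and the example program are my assumptions, and it only does anything useful on a Mesa build whose Intel driver supports this flag.

```python
import os
import subprocess


def shader_time_env(base_env=None):
    """Return a copy of the environment with shader_time added to INTEL_DEBUG.

    Assumption: INTEL_DEBUG takes a comma-separated list of flags, so any
    flags already set are kept and shader_time is appended once.
    """
    env = dict(os.environ if base_env is None else base_env)
    flags = [f for f in env.get("INTEL_DEBUG", "").split(",") if f]
    if "shader_time" not in flags:
        flags.append("shader_time")
    env["INTEL_DEBUG"] = ",".join(flags)
    return env


def run_with_shader_time(command):
    """Run a GL program (hypothetical example: ["glxgears"]) with the flag set."""
    return subprocess.run(command, env=shader_time_env(), check=False)
```

Something like `run_with_shader_time(["glxgears"])` would then start the program with the variable in place; how the driver reports the timings is not covered here.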