Invariant (Constant) TSC is detectable via `cpuid` and applies to `rdtsc`/`rdtscp` by default. In that respect, there's no tradeoff being made (at least none observable to software), AFAICK.
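For anyone who wants to check their own machine, here is a minimal sketch of the `cpuid` test. It assumes GCC or Clang (for `<cpuid.h>` and `__get_cpuid`); the function name `has_invariant_tsc` is just mine for illustration. The invariant TSC flag is bit 8 of EDX in extended leaf `0x80000007`.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header (assumed toolchain) */
#endif

/* Hypothetical helper: returns 1 if the CPU advertises an invariant TSC,
 * 0 if it does not, and -1 if we cannot tell (non-x86 build, or the
 * extended leaf is not supported). */
int has_invariant_tsc(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* __get_cpuid returns 0 when the requested leaf is out of range. */
    if (!__get_cpuid(0x80000007u, &eax, &ebx, &ecx, &edx))
        return -1;

    /* CPUID.80000007H:EDX[8] is the invariant TSC bit. */
    return (int)((edx >> 8) & 1u);
#else
    return -1;
#endif
}
```

On Linux the same information usually shows up as the `constant_tsc` and `nonstop_tsc` flags in `/proc/cpuinfo`, so the code is mostly useful if you want to gate a fast path at runtime.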
Are there cheaper ways of getting elapsed time with sub-microsecond precision? I'm interested because I've only ever heard of `rdtsc` as the lowest-level option in userspace on x86.
I ran across a random Stack Overflow thread with benchmarks claiming it was about 2x the cost of doing a naive gettime(). But frankly, it's hard to pin down once you factor in all the various caches, out-of-order execution pipelines, etc.
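If you want to try the comparison yourself, here is a rough sketch. Assumptions on my part: "gettime()" means `clock_gettime(CLOCK_MONOTONIC)`, the toolchain is GCC or Clang (for `__rdtsc` from `<x86intrin.h>`), and the helper names are made up. It measures back-to-back throughput with no serialization around `rdtsc`, so treat the numbers as a ballpark, not as the latency of a single call.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>
#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>   /* __rdtsc (GCC/Clang, assumed toolchain) */
#endif

/* Current CLOCK_MONOTONIC reading in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Average nanoseconds per clock_gettime() call over `iters` calls.
 * `iters` must be non-zero. */
double avg_ns_clock_gettime(unsigned iters)
{
    struct timespec ts;
    uint64_t start = now_ns();
    for (unsigned i = 0; i < iters; i++)
        clock_gettime(CLOCK_MONOTONIC, &ts);
    uint64_t end = now_ns();
    return (double)(end - start) / (double)iters;
}

/* Average nanoseconds per rdtsc over `iters` reads, or -1.0 on non-x86.
 * `iters` must be non-zero. */
double avg_ns_rdtsc(unsigned iters)
{
#if defined(__x86_64__) || defined(__i386__)
    volatile uint64_t sink = 0;   /* keeps the loop from being optimized away */
    uint64_t start = now_ns();
    for (unsigned i = 0; i < iters; i++)
        sink = __rdtsc();
    uint64_t end = now_ns();
    (void)sink;
    return (double)(end - start) / (double)iters;
#else
    (void)iters;
    return -1.0;
#endif
}
```

Call both with something like a few million iterations and compare. Results will move around with frequency scaling, which core you land on, and whether the loop is hot in cache, which is pretty much the point above about these numbers being hard to pin down.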