> In the author's example, the fastest implementation was actually _not_ lock-free, it merely used a better spinlock than std::mutex.
The author is also using gcc 4.6; gcc 4.8 or a recent Clang might well give different results. If you care enough to think about lock-free data structures (LFDS), measure, measure, measure =).
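For what "measure" could look like in practice, here is a rough sketch that times a contended counter under two lock types. To be clear about assumptions: `SpinLock`, `RunResult`, and `run_contended` are names made up for this example, the spinlock is a plain test-and-set loop and not the one from the article, and the timing includes thread start-up, so treat the numbers as a first look only.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical test-and-set spinlock (an assumption for illustration,
// not the spinlock benchmarked in the article).
class SpinLock {
public:
    void lock() {
        // Spin until the flag was previously clear, i.e. we acquired it.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();  // give other threads a chance to run
        }
    }
    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Final counter value plus wall-clock time for one run.
struct RunResult {
    std::uint64_t counter;
    double seconds;
};

// Increment a shared counter from `threads` threads, each taking the lock
// `iters_per_thread` times. Lock must provide lock() and unlock().
template <typename Lock>
RunResult run_contended(int threads, int iters_per_thread) {
    Lock lock;
    std::uint64_t counter = 0;
    std::vector<std::thread> workers;

    auto start = std::chrono::steady_clock::now();
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iters_per_thread; ++i) {
                std::lock_guard<Lock> guard(lock);
                ++counter;  // protected by the lock
            }
        });
    }
    for (auto& w : workers) w.join();
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;

    return {counter, elapsed.count()};
}
```

Calling `run_contended<SpinLock>(4, 1000000)` and `run_contended<std::mutex>(4, 1000000)` and comparing the `seconds` fields gives one data point. Which one wins depends on compiler, standard library, core count, and contention level, which is the whole reason to measure on your own setup.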