
Hyper-threading is a misnomer imho. It should have been called "poor man's threading".


It's a marketing term, so clarity wasn't the goal, but it did make some sense: it's not just threading in software, it lets the hardware itself run multiple threads at once. Don't forget that it was primarily introduced in CPUs that were single-core.


Letting multiple threads share execution resources is a good strategy for enabling wider execution resources without underutilization if a thread hits a latency-bound section or a branch. It's wasteful to have resources sitting unutilized in one core while another thread could be running.
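To put a rough number on that, here's a toy model (my own simplification, not how any real core is specified): assume each thread is stalled for some fixed fraction of cycles, independently of the others, and ask how often at least one thread has work ready for the shared execution resources. The function name and the 50% stall figure are made up for illustration.

```python
def slot_utilization(stall_fraction: float, threads: int) -> float:
    """Fraction of cycles in which at least one thread is ready to issue.

    Toy assumption: every thread is stalled for `stall_fraction` of cycles,
    and stalls are independent between threads. Real cores also contend for
    caches, ports and decode bandwidth, which this ignores.
    """
    if not 0.0 <= stall_fraction <= 1.0:
        raise ValueError("stall_fraction must be between 0 and 1")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    # The core idles only when all threads are stalled in the same cycle.
    return 1.0 - stall_fraction ** threads


if __name__ == "__main__":
    # Hypothetical workload where each thread is stalled half the time.
    for n in (1, 2, 4, 8):
        print(f"SMT{n}: {slot_utilization(0.5, n):.1%} of cycles have work ready")
```

The returns diminish quickly with more threads, and each thread gets a smaller slice of the core, which is the latency vs throughput tradeoff in a nutshell.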

You are totally welcome to turn off SMT on most AMD and Intel motherboards (including some workstations I've seen; they even have "one core only" modes if you wish!). But it's a performance benefit in most situations, and it's a good performance benefit relative to the cost (compared to adding twice as many cores, cache, etc). It just gets you better PPA (power/performance/area) than a non-HT design in compute-limited scenarios.
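If you want to check whether SMT is currently on, recent Linux kernels expose it in sysfs at /sys/devices/system/cpu/smt/active. A minimal sketch, assuming Linux and that file being present (the helper name is mine, and it just returns None anywhere the file doesn't exist):

```python
from pathlib import Path

# Linux sysfs file: contains "1" when SMT is active, "0" when it is not.
SMT_ACTIVE = Path("/sys/devices/system/cpu/smt/active")


def smt_active(path: Path = SMT_ACTIVE):
    """Return True/False as reported by the kernel, or None if unknown."""
    try:
        text = path.read_text().strip()
    except OSError:
        # Not Linux, an older kernel, or no permission to read the file.
        return None
    if text == "1":
        return True
    if text == "0":
        return False
    return None


if __name__ == "__main__":
    print("SMT active:", smt_active())
```

Taking the path as a parameter is only there so you can point it at a fake file when testing.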

I actually would be curious to see it on Apple silicon too! They seem to have a very wide frontend/backend, and maybe it could do Bulldozer-style switching of the decoder between threads.

But in general there is indeed a latency/QoS vs throughput tradeoff there. Server processor architectures go even higher on resource sharing: the Sun Niagara series went to 8 threads per core times 8 cores (8C/64T). That's totally fine for some database workloads, and they've specifically optimized their software to run well on it and utilize all the threads. And it's cheaper in your licensing too, wow how unpredicted!

POWER9 went to SMT8 on the performance cores as well. If you want to build a big fat core, it's hard to keep it utilized, and the inherent tradeoff is... just more threads.

Xeon Phi (Knights Landing) is an interesting precedent for this thesis too! It kind of takes the Bulldozer idea even further and does SMT4 and also AVX-512: you get up to 72 small Atom-class (Silvermont-derived) SMT4 cores with AVX-512 bolted on. I know people view it as a descendant of Larrabee, but it's interesting as a wide-SMT parallel processor as well, and it's broadly comparable to something like Niagara in some respects.



