I love AMD and I am super impressed with the performance of this new CPU, especially when comparing it to the top offerings from Intel. But how did they manage to catch up with Intel? Is there a catch? How did it go so wrong for Intel?
As far as I've understood, it's complicated but I think some or all of the following reasons are involved in Intel starting to lose its foothold on the market:
- AMD got a new very competent CEO (Lisa Su) that managed to save the company and turn its fortunes around;
- In the last decade, Intel seemingly forgot what its core businesses were, and instead invested a lot of time and resources trying to break into new markets such as smartphone CPUs and modems (failing miserably at it);
- AMD is fabless; they got rid of their expensive fabs in the '00s by spinning them off as GlobalFoundries. When TSMC perfected its 7nm node, they just jumped manufacturers;
- Intel has lost the fab lead it had years ago; I still remember them releasing 22nm chips while AMD was still on GloFo 32nm. Intel has been stuck on its 14nm node for years, and they weren't able to ship 10nm chips until last year;
- AMD saw a clear market interest for CPUs with a higher core count, and they managed to deliver a "good enough" CPU that appealed to buyers in a stale market. The Ryzen R7 1700 had slightly lower single-threaded performance than Intel, but it offered immensely better multicore performance than its Intel equivalent at a comically low price point, on cheap motherboards and with its stock cooler;
- Due to the almost complete lack of competition from AMD for almost a decade, Intel got greedy and CPU prices skyrocketed; in 2017 an 8-core HEDT CPU from Intel sold for close to $1100, which is almost as much as a whole R7 1700 setup cost. AMD had nothing to lose and no risk of market cannibalisation, so they could price their CPUs at a much lower point than Intel;
- Ryzen is arguably a good design, and AMD's idea of making high-core-count CPUs by interconnecting small quad-core CCXs instead of making huge dies like Intel's Xeons was a bet that really paid off. Operating systems already more or less supported NUMA-like architectures, and AMD got a very scalable architecture out of it. As Threadripper has shown, they can just "glue" CPUs together; the final result is a much bigger package and a huge socket, but nobody really cares about that.
I wonder if it's also fair to say that Intel was living on borrowed performance when it comes to the Spectre-style vulnerabilities. IIRC, mitigating those vulnerabilities cost AMD less performance.
> But, how did they manage to catch up on Intel? Is there a catch? How did it go so wrong for Intel?
I have no idea if this is true: this is just my theory.
AMD knocked it out of the park when they started using a chiplet architecture. It's really great for 1 socket and pretty good for 2 sockets.
The thing to understand about chiplets is that each chiplet is a separate CPU that has to maintain cache coherency. The cache coherency traffic increases as the square of the number of CPUs, so you just can't have too many of them talking to each other at once.
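A back-of-the-envelope sketch of that quadratic growth, just counting the pairs of caches that might have to exchange snoop traffic; this is illustrative counting only, not a model of any real coherency protocol:

```python
# With n caches, each one may need to exchange snoop/invalidation traffic
# with the other n - 1, so the number of ordered pairs grows as
# n * (n - 1), i.e. roughly n^2. Purely illustrative numbers.
for n in (2, 4, 8, 16, 64):
    print(f"{n:2d} caches -> {n * (n - 1):4d} potential snoop paths")
```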
Intel dominates the multi-socket server market and I think they wanted to maintain that dominance, so they just didn't want to go down the chiplet road, as it would really hurt their performance on 4+ socket systems.
The result, though, is that in 1-2 socket systems, AMD has a big price/performance advantage over Intel, a big part of which is due to the way chip yields work.
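To make the yield point concrete, here's a toy calculation using a simple Poisson defect model; the defect density and die sizes are made-up round numbers, not AMD's or Intel's actual figures:

```python
# Toy defect-yield model (simple Poisson yield, made-up numbers) showing
# why several small chiplets can be cheaper per good part than one huge die.
import math

DEFECTS_PER_MM2 = 0.001   # assumed defect density, purely illustrative

def yield_rate(area_mm2, d0=DEFECTS_PER_MM2):
    # Poisson model: probability that a die of this area has zero defects.
    return math.exp(-d0 * area_mm2)

big_die = 800   # hypothetical monolithic 64-core die, mm^2
chiplet = 75    # hypothetical 8-core chiplet, mm^2 (8 needed per CPU)

y_big = yield_rate(big_die)
y_chip = yield_rate(chiplet)

print(f"monolithic die yield : {y_big:.1%}")
print(f"single chiplet yield : {y_chip:.1%}")
# Silicon fabricated per *good* 64-core part, ignoring the I/O die,
# packaging cost, binning, partially-working dies, etc.
print(f"mm^2 per good CPU, monolithic: {big_die / y_big:,.0f}")
print(f"mm^2 per good CPU, chiplets  : {8 * chiplet / y_chip:,.0f}")
```

Under these made-up numbers you fabricate roughly a third as much silicon per good 64-core part with chiplets, and the small dies can also be binned and shared across product lines.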
Well, AMD is led by a highly accomplished semiconductor physicist (who seems to possess great leadership ability as well), which can't be bad. Intel apparently took a few chances that went the wrong way with its 7nm manufacturing process.
As an outsider it's hard to say much else without a lot of speculation.
You can't keep milking the old cow that can't calve.
AMD made a very good bet on Intel not sending Skylake to the butcher, and instead reacting with a lot of "*lake" updates.
Intel needed a quick patch to stop bleeding customers, but, ultimately, that decision not to commit to a brand new architecture doomed them long term.
IIRC, according to Lisa Su from AMD, they always plan with the assumption that Intel executes well. When Intel then doesn't, you get the current situation.
And, more specifically, you lose by not doing that, which is exactly what is happening to Intel and why they have such a hard time responding: by killing AMD for a while but not acting as their own competitor, they forgot how to compete. The same thing doomed the Atom line and their low-power CPUs (and is why Windows on ARM is a thing).
Same as the last time: https://en.wikipedia.org/wiki/Jim_Keller_(engineer). Curiously, Jim is at Intel now. So if Intel lets him do his thing, they'll be un-fucked in the near future as well, after which Jim will depart to unfuck something else.
There's another, completely different theory. I'm not saying it's strictly true, but it's worth thinking about.
Moore's law ended around 2016 or so. As progress slows, CPU technology is commoditizing. Successive Intel CPU models aren't much faster than previous ones anymore, so all the manufacturers will just end up in the same place and will only compete on price.
We will end up running everything on very efficient cores, massively parallelized. The software frameworks for that will be very different from current ones.
Software is changing very slowly: the multiprocessing shift in hardware is a 20-year-old phenomenon, and yet only niche apps are parallelized outside of gaming. Maybe in another 20 or 40 years.
I think it'll take a lot more progress in programming language technology than Rust or the current fragmented, janky set of mostly proprietary GPU languages.
The tricky thing about parallelizing workflow apps is that most workflow happens in a pipeline, which fits single cores really well.
It's tricky to find problems that fit into grids.
Pictures are grids, and for computationally intensive tasks that are isolated, GPUs work great. But you still have to find a way to make your sums fit a grid if they don't.
Last time I was writing OpenCL, threads could only really communicate at the kernel entry and exit points, which made fitting some problems onto a parallel architecture very hard.
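A minimal sketch (in Python, just to keep it readable) of the pipeline-versus-grid distinction being discussed here; the stages and the "image" are made up:

```python
# Toy contrast between pipeline-shaped work and grid-shaped work.
from multiprocessing import Pool

def stage_a(x): return x + 1
def stage_b(x): return x * 2
def stage_c(x): return x - 3

def run_pipeline(x):
    # Each stage needs the previous stage's output, so a single item moves
    # through the stages strictly in order -- a natural fit for one fast core.
    return stage_c(stage_b(stage_a(x)))

def brighten(pixel):
    # Each grid element is independent of every other one, so the map below
    # can be spread across as many cores (or GPU lanes) as you have.
    return min(pixel + 16, 255)

if __name__ == "__main__":
    print(run_pipeline(10))                   # inherently serial per item

    image = list(range(256))                  # stand-in for a 1-D "image"
    with Pool() as pool:
        print(pool.map(brighten, image)[:8])  # embarrassingly parallel
```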
I think most of the trouble today comes from the low-level, imperative, error-prone way of expressing computation; it's just prohibitively hard to ensure correctness with today's languages.
If we continue with the AAA games example, we can see that on consoles programmers do manage to extract parallelism even from CPU-side work; it just requires a lot of work, debugging grit, tools, talent and budget, along with a specific application with limited parameters.
We just need to make this kind of work easier, and PLT seems to me to be the solution.
It's actually not a categorical no. There are use cases where rr does work fine, although a patch is needed to get it recognized (source: I've used it successfully on multiple occasions). It doesn't work for big workloads like Firefox yet, though.
Interesting how compilation benchmarks do not scale at all from 24- to 32- to 64-core Threadrippers, for either Linux kernel or LLVM compilation. I wonder why it's so bad.
At some point when you build LLVM with the default options, you hit a moment where most of what's left to do is linking a number of executables, all statically linking libllvm.a. They are all relatively slow to link, and there aren't enough of them to fill 32 or 64 cores. They also require a lot of RAM, so if you don't have enough, you might end up swapping too.
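A toy Amdahl-style model of that effect, with made-up time fractions and a made-up cap on concurrent link jobs, just to show how quickly extra cores stop helping once the tail of the build is a few big links:

```python
# Assumed split: 80% of single-core build time is compilation that scales
# with every core, 20% is linking that can only use a handful of heavy,
# memory-hungry link jobs at once. The numbers are purely illustrative.

def speedup(cores, compile_frac=0.8, link_frac=0.2, max_link_jobs=4):
    t = compile_frac / cores + link_frac / min(cores, max_link_jobs)
    return 1.0 / t

for n in (8, 16, 24, 32, 64):
    print(f"{n:2d} cores -> {speedup(n):5.1f}x")
```

With these made-up numbers, going from 24 to 64 cores buys only about a 1.3x speedup, which looks a lot like the flat benchmark curves being discussed.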
Showing some curiosity might help. When you learn to profile and dig deeper, you will be able to understand what's going on, troubleshoot it and find the bottleneck(s). I remember the kernel compiling in under 10 seconds on Egenera's multi-million-dollar quad-socket blade racks a decade ago. Given today's computing power, memory speed, bus speed, SSD performance and code size, there's no material reason similar figures shouldn't be attainable.
It's surprising that the energy usage stays below the specified TDP. Some other recent processors (especially from Intel, but to a lesser extent also some AMD models) happily went way above their specifications.