In Figure 2 depicting 2.5D warping, why is 246C a relevant temperature? In my experience with IBM Power and Z systems, north of 85C is considered the high end already. What am I missing?
The idea that ARM solves power issues seems a bit overblown and driven by Apple purchasing the most advanced nodes sooner than competitors, causing people to associate the efficiency gain with the architecture even though a lot of it is the silicon process.
Edit: Also Apple can and does do away with legacy support which is less about the architecture and more about expectations of what is included vs. deprecated. The push to ARM seems like it will result in a loss of support and control in the users' hands (converting expectations of PCs into those of smartphones) but now I'm really into speculative territory...
>causing people to associate the efficiency gain with the architecture even though a lot of it is the silicon process.
They also cause people to associate higher performance with the architecture, when high-performance ARM is something more or less unique to Apple.
Qualcomm (as in, best case for performance that you can actually order a tray of 1000 CPUs from) is several years behind what Apple/Intel/AMD currently offers and as such is simply not competitive with x86 (even in performance/watt); that is one of three reasons why there are very few ARM laptops (the second one is that Qualcomm refuses to do drivers properly, and the third is that their price/performance ratio is positively abysmal). Sure, they're getting better with the acquisition of ex-Apple engineers through Nuvia, but there's still quite a ways to go before they're actually competitive.
ARM isn't a magic "make my computer faster" button; chip designers are doing the heavy lifting for any architecture.
I don't have numbers, just a hunch and a question about how much improvement Apple silicon has seen post-M1 compared to the competition. I've been led to believe x86 has improved more in the same time period, but I'm too far removed from the details to back that up.
The efficiency cores actually aren't the primary thing that makes M-series chips so power efficient, even though they help. Even when you're ripping big data apart, the chips do it faster than nearly anything else on the consumer market while managing to be far more efficient as well. It's not just the process node; it's things like the fixed-width instruction encoding making it simpler to decode many instructions in parallel.
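As a toy illustration of the decode point (the "ISA" below is entirely made up; only the boundary logic matters): with fixed 4-byte instructions every decoder can compute its start offset independently, while a variable-length encoding has to work out each instruction's length before it knows where the next one begins.

```python
# Toy sketch of instruction-boundary finding for parallel decode.
# The encodings are invented for illustration; only the dependency structure matters.

def fixed_width_boundaries(code: bytes, width: int = 4) -> list[int]:
    # Every decoder slot can compute its own start offset independently: i * width.
    return [i * width for i in range(len(code) // width)]

def variable_length_boundaries(code: bytes) -> list[int]:
    # Hypothetical variable-length encoding: the low nibble of the first byte is
    # the instruction length (0 treated as 1). Boundary N+1 is unknown until
    # instruction N has been decoded, so the scan is inherently serial.
    offsets, pos = [], 0
    while pos < len(code):
        offsets.append(pos)
        length = code[pos] & 0x0F or 1
        pos += length
    return offsets

if __name__ == "__main__":
    fixed = bytes(16)                                 # four 4-byte instructions
    variable = bytes([0x02, 0x00, 0x03, 0x00, 0x00, 0x01])
    print(fixed_width_boundaries(fixed))              # [0, 4, 8, 12] -- all known up front
    print(variable_length_boundaries(variable))       # [0, 2, 5] -- found one at a time
```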
That's overly simplistic. It's the whole ecosystem of optimising for low power on mobile that makes ARM so efficient. An advanced node does make a difference, of course, but it's all the other choices made for efficiency that stack up, e.g. big.LITTLE and the OS support for it. You can still get power-efficient ARM on older nodes.
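For what it's worth, the OS half of that is visible even from userspace. A minimal Linux sketch, assuming the efficiency cores happen to be CPUs 4-7 (the numbering is platform-specific and purely a guess here):

```python
import os

# Pin this process to the (assumed) efficiency-core CPU IDs so a background
# task doesn't wake the big cores. Core numbering varies per SoC; 4-7 is a guess.
EFFICIENCY_CORES = {4, 5, 6, 7}

os.sched_setaffinity(0, EFFICIENCY_CORES)   # 0 = the current process (Linux only)
print("now restricted to CPUs:", sorted(os.sched_getaffinity(0)))
```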
Yeah this is sort of what I was trying to allude to with legacy support but I guess I should've generalized to their control over the ecosystem and efficient interaction between components.
Well, what I want specifically is an M-series Mac Studio equivalent CPU perf/watt I can pair with a dedicated GPU, running linux without jumping through hoops.
If I were to pick a new CPU right now (and I won't, because I still don't feel limited by my 2017 CPU, anyhow), I would just pick _not Intel_. AMD is also doing a great job in the perf-per-watt department without throwing away decades of compatibility. Intel slowly catching up is something to look forward to as well.
> “The base leakage is lower than the previous technology, but the overall total power is higher,” said Melika Roshandell, product management director at Cadence. “So at the end of the day, your thermal is going to be worse because you’re packing a lot more transistors into one IC and you’re pushing the performance.”
So why not match the previous gen's TDP and just use whatever gains you get from more efficiency?
Back in the 90s, 300W was considered extreme for an entire system; today a single 4090 is rated at 450W and a complete high-end system can pull close to a full kilowatt. This constant increase manufacturers have been pushing to compensate for diminishing returns is unsustainable and completely absurd.
AMD lets you do that, and it's probably recommended in some cases. They don't force you to (except in some laptops?) because that would give the user less freedom and would leave performance on the table. It does mean that the default boost settings have the chip running at a bit less than 100 Celsius on any standard air cooler.
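Rough numbers to show why capping power costs less than you'd think: with the usual first-order assumption that power grows roughly with f^3 near the top of the V/f curve (voltage has to rise with frequency), performance scales roughly with the cube root of the power budget. The 230 W baseline and the exponent below are assumptions for illustration, not measurements.

```python
# First-order sketch: near the top of the V/f curve, power ~ f^3
# (f for switching, V^2 with V ~ f), so performance ~ power^(1/3).
# The exponent and the 230 W reference point are illustrative assumptions.

def relative_perf(power_watts: float, ref_watts: float = 230.0, exponent: float = 3.0) -> float:
    return (power_watts / ref_watts) ** (1.0 / exponent)

for watts in (230, 170, 105, 65):
    print(f"{watts:>3} W -> ~{relative_perf(watts):.0%} of full-power performance")
```

Under that toy model, dropping from 230 W to 105 W still keeps roughly three quarters of the performance, which is the "previous gen's TDP" trade-off in a nutshell.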
I think of that as a win. Heavy software stacks create demand for faster hardware, and the market provides. Those of us whose use cases represent a tiny market share but actually need the performance can get it by running light software stacks.
So the question is, how can we make the overweight software stacks even slower?
By going back to using interpreted languages. No more precompiled binaries where changing/adding patches and flags require full builds, just stream compiled/compressed source code directly to the target.
You could also offload the software to a webserver and only cache small parts of it on the client. Instead of gigabytes of install size you'd be required to have a 1 or 10 gbit network link to run it and stream features as you're accessing them.
Make it “industry standard best practice” to provide devs with absolute top end cutting edge hardware “for productivity and efficiency gains”. Repeat this at every opportunity at conferences and industry events until it becomes ubiquitous. Profit as devs slather abstraction layers, trivial libraries, emulation layers etc. on because “developer time is worth more than CPU time” and “it’s not slow on MY computer”. :D
There's a strong rationale for making software developers work on older, limited machines, rather than the latest and greatest flamethrowers. (Of course, that doesn't apply to me.)
I’m curious about what is happening in the field of reversible computing. I haven’t heard much about it at all in the last 20 years. Basic information theory tells us that it takes energy to destroy information, so building ALUs that limit destroying information seems like a bit of a no-brainer for attempting to create lower-power (and lower-heat) computing. The basic premise is that for any operation that loses information, you store enough bits to allow the operation to run backwards. If you were clever about it, you could design your chips to only destroy bits in places where that can be done efficiently. I’m sure reality gets in the way of pure theory, but I was sure that people were spending enough effort on the concept that I’d have heard more about it.
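To make the premise concrete, here's a toy version of "keep enough bits to run it backwards" (nothing to do with any real ALU design): a plain AND destroys information, since the inputs can't be recovered from the output, while a Toffoli-style gate that carries the inputs along is its own inverse.

```python
# Irreversible: two input bits collapse into one output bit, so the inputs
# can't be recovered -- by Landauer's argument, erasing them costs energy.
def and_gate(a: int, b: int) -> int:
    return a & b

# Reversible (Toffoli/CCNOT-style): keep the inputs and XOR their product into
# a target bit. Applying the same gate twice undoes it exactly.
def toffoli(a: int, b: int, t: int) -> tuple[int, int, int]:
    return a, b, t ^ (a & b)

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            forward = toffoli(a, b, 0)      # computes AND into the target bit
            backward = toffoli(*forward)    # running it again restores (a, b, 0)
            assert backward == (a, b, 0)
    print("Toffoli is its own inverse: no information destroyed")
```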
Reversible computing doesn't help wrt. static or leakage power, which is an increasing fraction of power draw in recent device nodes. It does help wrt. dynamic power, and can be implemented via charge-recovery logic - but this comes at a cost in area which increases static power.
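Back-of-the-envelope illustration of that trade-off; every number below is invented and only meant to show the shape of the problem on a leaky node:

```python
# All figures are made-up assumptions, not a real process characterisation.
LEAKAGE_PER_MM2 = 0.05      # W of static/leakage power per mm^2 (assumed)
BASE_AREA_MM2   = 100.0
BASE_DYNAMIC_W  = 10.0

def total_power(dyn_recovered_frac: float, area_overhead_frac: float) -> float:
    # Static power grows with the extra area; dynamic power shrinks with recovery.
    static = LEAKAGE_PER_MM2 * BASE_AREA_MM2 * (1 + area_overhead_frac)
    dynamic = BASE_DYNAMIC_W * (1 - dyn_recovered_frac)
    return static + dynamic

print(total_power(0.0, 0.0))   # conventional logic:             5.0 + 10.0 = 15.0 W
print(total_power(0.6, 0.5))   # 60% dynamic saved, 50% more area: 7.5 + 4.0 = 11.5 W
print(total_power(0.6, 1.5))   # same saving, heavier area cost:  12.5 + 4.0 = 16.5 W
```

With a large enough area overhead, the leakage added by the charge-recovery circuitry eats the dynamic savings, which is the point above.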
I have worked on this aspect of quantum computation [1]. The main problem is that reversibility is a feature of isolated quantum systems. In practice, they are not isolated.
Why? It's not just because of small interactions with the environment that we cannot control. It's that even the apparatuses we use to control/drive the logical instructions (lasers, electrical transmission lines) should be taken into account if the computer is to be considered isolated. But usually they aren't, and this leads to inevitable losses of reversibility in the data register.
In other words, unitary (reversible) operations do not come for free.
I think that in quantum computers it is more likely that energy-efficiency will come from some sort of algorithmic advantage.
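A small numerical illustration of that loss of reversibility (standard textbook dephasing, not anything from [1]): a unitary can always be undone by its inverse, but once the qubit couples to an uncontrolled environment and you trace the environment out, the off-diagonal terms of its density matrix shrink and no operation on the qubit alone brings them back.

```python
import numpy as np

# Qubit in the |+> state: fully coherent, off-diagonal terms = 0.5.
plus = np.array([[0.5, 0.5],
                 [0.5, 0.5]])

# A unitary (here a Z rotation) is reversible: U rho U^dagger, undone by U^dagger.
theta = 0.7
U = np.diag([1.0, np.exp(1j * theta)])
rho_u = U @ plus @ U.conj().T
restored = U.conj().T @ rho_u @ U
print(np.allclose(restored, plus))          # True: the unitary step is undone exactly

# Dephasing channel of strength p: models uncontrolled coupling to the
# environment (p is an assumed number, purely for illustration).
def dephase(rho, p):
    off = 1.0 - p
    return np.array([[rho[0, 0], off * rho[0, 1]],
                     [off * rho[1, 0], rho[1, 1]]])

rho_noisy = dephase(plus, p=0.3)
purity = np.trace(rho_noisy @ rho_noisy).real
print(purity)   # < 1: coherence lost; no qubit-only operation restores it
```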
Interesting article overall.