Intel Meteor Lake Architecture (hothardware.com)
108 points by SandraBucky on Sept 25, 2023 | 112 comments



Will be interested to see how this first(ish) gen of Intel's disaggregated chips pans out. I've been needing to replace my laptop, and these seem like they have the potential to be extremely nice for a mid-range machine with long battery life. The new scheduler hierarchy is especially interesting given how much of the physical chip they can avoid powering on at all for most simple tasks. For a lot of light use cases the entire "real" CPU and GPU parts of the silicon can stay completely dark, since the SoC tile has two tiny cores to run things and other necessary parts, like the video decode silicon, have been separated from the GPU.


Eh, I have a sneaking suspicion the compute dies won't be shut down as much as you'd think, and that there will be some extra power usage from crossing the dies like desktop Ryzen parts (though hopefully not nearly as severe).

A good Process Lasso config is probably worth the time investment. Instead of "trusting" the scheduler, you could force everything non-time-sensitive onto the efficiency island, maybe by default.
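
For example, something similar can be scripted with psutil instead of Process Lasso. A rough sketch, assuming the two SoC-tile LP E-cores enumerate as the last two logical CPUs (that numbering is a guess; check your actual topology, and the process names are just placeholders):

  # Rough sketch: pin known background processes to the low-power cores so the
  # compute tile can stay asleep. The core numbering below is an assumption.
  import psutil

  n = psutil.cpu_count(logical=True)
  LP_E_CORES = [n - 2, n - 1]  # guessed position of the SoC-tile LP E-cores

  BACKGROUND = {"OneDrive.exe", "SearchIndexer.exe"}  # placeholder names

  for proc in psutil.process_iter(["name"]):
      try:
          if proc.info["name"] in BACKGROUND:
              proc.cpu_affinity(LP_E_CORES)  # restrict to the efficiency island
      except (psutil.AccessDenied, psutil.NoSuchProcess):
          pass  # protected system processes, or ones that exited mid-iteration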


The 3D Foveros packaging technology is critical as it allows some path lengths to be much shorter than if you had to traverse that same path but only in the horizontal 2D plane.

Very excited to see how this plays out in practice.


I thought Meteor Lake was tiled and 2D? Intel has EMIB and such for very good bridges, but they are still bridges.

If it is 3D stacked with TSVs, that's a whole other can of worms. AMD's X3D on Ryzen 7000 creates heat/clockspeed issues, and they reportedly canceled a 3D variant of the 7900 GPUs due to similar issues.


Disaggregation is also good for risk management of attack surfaces.

> Intel has broken out the Silicon Security Engine from its traditional Converged Security and Manageability Engine (CSME).

Good to see Security separated from ME.

Hopefully the ME no longer has control over IOMMU isolation of devices.


I'm not sure I follow. It's almost guaranteed that all chiplets are still on some global system bus just like they were on their monolithic dies. Unless Intel has taken sudden great strides with their SoC security architecture, there are likely still all the old problems (plus a bunch of new fun ones!). Taking it off die is a response to the physical scaling issue, it's really not meant as a security enhancement.


Per the article, both IP blocks remain within the same SOC chiplet/tile.

https://en.wikipedia.org/wiki/Intel_Management_Engine

> Starting with ME 11, it is based on the Intel Quark x86-based 32-bit CPU and runs the MINIX 3 operating system.

If "Silicon Security" functions are now executing on a separate CPU/OS, it's an increase in separation from Intel ME functions.

https://www.anandtech.com/show/20046/intel-unveils-meteor-la...

> [Meteor Lake] introduces the Intel Silicon Security Engine (ISSE), a dedicated component focused solely on securing things at a silicon level ... The Converged Security and Manageability Engine (CSME) has also been partitioned to further enhance platform security.


> The “South” IO fabric is ordered, but non-coherent and PCIe-based. It is home to Wi-Fi and Bluetooth, PCI Express connections, Sensing, USB 3/2, Ethernet, the Power Management Controller (PMC), and Security controllers.

Does "Sensing" refer to human presence based on camera and radio (Wi-Fi, UWB) imaging?

https://lkml.org/lkml/2023/2/12/314

  Intel Visual Sensing Controller (IVSC), codenamed "Clover Falls", is a companion chip designed to provide secure and low power vision capability to IA platforms. The primary use case of IVSC is to bring in context awareness. IVSC interfaces directly with the platform main camera sensor via a CSI-2 link and processes the image data with the embedded AI engine. The detected events are sent over I2C to ISH (Intel Sensor Hub) for additional data fusion from multiple sensors.
https://www.techpowerup.com/276114/new-intel-visual-sensing-...

  The company didn't detail how it goes about this, but technologies already exist to combine visual input from the PC's cameras; radio from the PC's antennas, audio from its mic array; to form a picture of its surroundings.
https://community.intel.com/t5/Blogs/Tech-Innovation/Client/...

  With an initial focus on respiration detection, we hope to extend the technology to detect other physical activities as well. Intel Labs will demonstrate an early prototype of breathing detection ... The solution detects the rhythmic change in CSI due to chest movement during breathing ...  The respiration rates gathered by this technology could play an important role in stress detection and other wellness applications.


Interesting to see how efficient these are for office/coding (e.g. typing into VS Code) tasks. Will the CPU tile be off most of the time, or will it take some years before applications and the OS are tuned to avoid CPU tile wakeups?

Also how good will the p-cores be compared to previous gen?

Are the avx-10 instructions going into this generation?


> Are the avx-10 instructions going into this generation?

Nope. In fact, it probably won't be in client until after Lunar Lake (2025+).


Per the AVX10.1 Instruction Set Reference, p. 1-2 (355989-001US rev 1.0)[0], AVX10 support will begin with 6th-gen Xeon processors (based on Granite Rapids), which are due next year. Client support is not called out, so Lunar Lake (late 2024 to early 2025) is probably a good guess.

[0]: https://cdrdv2.intel.com/v1/dl/getContent/784266


No, the instruction set supported by Lunar Lake was published by Intel a few months ago, and it is almost the same as that of Arrow Lake S, i.e. without AVX10 or AVX-512 support (Arrow Lake S supports more instructions than Arrow Lake, e.g. it has the SHA-512 secure hash instructions).

Panther Lake, to be launched in 2025 (probably in the second half of the year), is likely to be the first Intel CPU supporting the 256-bit subset of the AVX10.2 ISA version.

The 2024 Granite Rapids will probably be the only Intel CPU supporting the AVX10.1 ISA version (with full 512-bit support), because all the following will start from AVX10.2.


AVX10.1 is just Sapphire Rapids' AVX-512 renamed, so arguably SPR (and early Alder Lake) already support AVX10.1, just without declaring the relevant CPUID bits.

Client support depends on the E-cores supporting it, and Intel have specified that they'll start with AVX10.2. I don't believe any core has been announced with AVX10.2 support yet, and the latest ISA support we know about is Lunar Lake's.
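
(For what it's worth, on Linux you can quickly check which of the existing AVX-512 bits a given part actually declares. A minimal sketch reading /proc/cpuinfo; it does not cover AVX10's own enumeration, which uses a new CPUID leaf:)

  # Print which AVX-512-family flags the kernel reports for this CPU (Linux).
  def cpu_flags():
      with open("/proc/cpuinfo") as f:
          for line in f:
              if line.startswith("flags"):
                  return set(line.split(":", 1)[1].split())
      return set()

  flags = cpu_flags()
  for feat in ("avx2", "avx512f", "avx512vl", "avx512bw", "avx512dq",
               "avx512_vbmi", "avx512_vnni", "avx512_bf16", "avx512_fp16"):
      print(f"{feat:12} {'yes' if feat in flags else 'no'}")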


AVX10.1 is just Granite Rapids' AVX-512 renamed.

Granite Rapids has a few extra AVX-512 instructions (including those added by Tiger Lake but omitted in Sapphire Rapids and Alder Lake), so Sapphire Rapids does not support all of AVX10.1. Therefore neither Sapphire Rapids nor Emerald Rapids can turn on the AVX10 CPUID bits.

Nevertheless, the differences between the AVX-512 instruction sets of Granite Rapids and Sapphire Rapids are small and of little importance.


Are you sure? I just checked Intel's manual, and nothing above Sapphire Rapids is listed in AVX10.1: https://files.catbox.moe/23ty0y.png

I couldn't find any new AVX* instructions added to Granite Rapids (I see PREFETCHI and some AMX additions, neither of which fall under the AVX category), and VP2INTERSECT isn't listed under AVX10 or Granite Rapids.


I still wonder how Apple was able to achieve such an incredible performance-per-watt ratio compared to Intel and AMD. Does anybody know how they let Apple do it?


A few reasons.

1. Arm is generally more efficient than x86.
2. Apple uses TSMC's latest nodes before anyone else.
3. Apple doesn't chase peak performance like AMD and Intel. CPU speed and power consumption are not linearly related. Intel has been chasing 5 GHz+ speeds the last few years, which consumes considerably more power. Apple keeps their CPUs under 3.5 GHz.
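
On point 3, a rough back-of-the-envelope (a toy model, not measured numbers): dynamic power scales roughly with frequency times voltage squared, and higher clocks also need higher voltage, so power grows much faster than performance.

  # Toy model only: P ~ C * V^2 * f, and (crudely) V rises with f, so P ~ f^3.
  def relative_power(f_ghz, f_base=3.5):
      return (f_ghz / f_base) ** 3

  for f in (3.5, 4.0, 4.5, 5.0, 5.5, 6.0):
      print(f"{f:.1f} GHz -> ~{relative_power(f):.1f}x the power of 3.5 GHz")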


> Arm is generally more efficient than x86

This is not entirely true in a general sense. Yes, a typical ARM CPU is indeed more energy efficient, but theoretically nothing prevents x86 from being nearly as efficient.

The main reason why Apple silicon is more efficient is that it is basically a mobile chip, and competition on mobile is harsh, so all the producers have had to optimize their chips heavily for energy efficiency.

On the other hand, until Apple silicon and AMD's recent ascension, Intel had a monopoly on the laptop market with no incentive to do anything. Just look at how fast Intel developed an asymmetric, Arm-like P/E-core architecture right after Apple Silicon emerged. Let's hope this new competitor will eventually force more energy-efficient x86 chips out of Intel and AMD.


> This is not entirely true in a general sense. Yes, a typical ARM CPU is indeed more energy efficient, but theoretically nothing prevents x86 from being nearly as efficient.

The very complex instruction set does. You can easily throw multiple decoders at Arm code, but x86 scales badly due to the variable instruction length. Current cores need predecoders to find instruction boundaries, which just isn't needed with fixed-width instructions, and even then the higher-numbered decoders can only handle the simpler instructions.
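
A toy illustration of the boundary problem (nothing like real x86 decoding, just the shape of it): with fixed-width instructions every decoder slot can compute its start address independently, while with variable-length encoding each boundary depends on all the previous lengths, which is what the predecoders exist to resolve.

  # Fixed width: decoder i starts at i * width, trivially parallel.
  def fixed_width_starts(n, width=4):
      return [i * width for i in range(n)]

  # Variable length: each start depends on every previous instruction's length,
  # so finding boundaries is inherently serial (or needs predecode markers).
  def variable_length_starts(lengths):
      starts, offset = [], 0
      for length in lengths:
          starts.append(offset)
          offset += length
      return starts

  print(fixed_width_starts(5))                    # [0, 4, 8, 12, 16]
  print(variable_length_starts([1, 3, 2, 6, 4]))  # [0, 1, 4, 6, 12]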


> Current cores need predecoders to find instruction boundaries, which just isn't needed with fixed-width instructions

The question is how much overhead it causes relative to the whole picture. There is empirical evidence that the answer is "very little":

https://chipsandcheese.com/2021/07/13/arm-or-x86-isa-doesnt-...

> With the op cache disabled via an undocumented MSR, we found that Zen 2’s fetch and decode path consumes around 4-10% more core power, or 0.5-6% more package power than the op cache path. In practice, the decoders will consume an even lower fraction of core or package power.


> The very complex instruction set does.

i.e., PSPACE ⊆ EXPTIME

https://en.wikipedia.org/wiki/EXPTIME

which is funny because people are always like "uh why do i need to understand asymptotics when machines are so fast". well the answer is the asymptotics catch up to you when the speed of light isn't infinite or when you're timing things down to the nanosecond.


Arm is practically as complex as x86... It supports multiple varieties (e.g. v7, Thumb, Thumb-2, Jazelle, v8, etc.), lots of historical mistakes, absurdly complex instructions even in the core set (ldm/stm), and a legacy that is almost as long as x86's. It even has variable-length instructions too...


Many of which were dropped for 64-bit ARM.


Only Jazelle and Thumb v1 are dropped from most v8 non-ULP cores, and then only half dropped: they still consume decoding resources (e.g. Jazelle mode is actually supported and the processor will parse JVM opcodes, it's just that all of them will trap). We are stuck with the rest as much as Intel is stuck with the 8087: it is about time they did some culling, but it won't happen without backlash.


I stand corrected, thanks.


I'm not sure this holds. x86-64 decodes instructions (which is awkward) and stores the resulting micro-ops in a cache, then issues ops from that cache. So the decoding cost is only paid on a cache miss, and a cache miss on a deeply pipelined CPU is roughly game over for performance anyway.
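
As a very loose software analogy (not how the hardware actually works): the op cache is a map from fetch address to already-decoded ops, so the expensive variable-length decode is only paid on a miss.

  # Loose analogy of a uop cache: decode cost is only paid on a cache miss.
  uop_cache = {}

  def decode(raw_bytes):
      # Stand-in for the slow, power-hungry variable-length x86 decode path.
      return [("uop", b) for b in raw_bytes]

  def fetch(addr, raw_bytes):
      if addr not in uop_cache:              # miss: pay for decode once
          uop_cache[addr] = decode(raw_bytes)
      return uop_cache[addr]                 # hit: the decoders stay idle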


> Apple doesn't chase peak performance like AMD and Intel.

Intel and AMD also make low-power parts.


But they don't make high-end and performant, low-power parts (yet).


One big thing is that Apple has (almost) bought out TSMC's N3 node, so they're the only one with chips made on the most advanced manufacturing process available.


It's difficult to compare because honestly most reviewers just suck at making meaningful comparisons.

You can't compare a chip running at 3 GHz with one running at 5 GHz. It just doesn't tell you anything useful about the architecture, only what the company configuring the chip thought mattered.

Being "only" 30% faster but using twice the power at 5 GHz, for example, is entirely expected. Chances are the M1 couldn't even run that fast, or it would end up using just as much power if it did.


Intel would squash an internal project like that, or drown it in politics. You could sit here all day with examples of "why did big company let little company become successful"


Apple's market cap is currently 20x Intel's market cap. Is Apple supposed to be the "little" company?


Little-ish? PA Semi was only 150 people and acquired for < $300 million back in 2008. Intel's market cap was $150 billion back then. Impossible to say how PA Semi would have fared, but as a division, it's still way smaller.


But PA Semi wasn't close to Intel in 2008 when it had 150 people.


The latest mobile AMD Zen 4 has comparable efficiency* to the Apple M2, despite not being Arm or having a hybrid architecture. See the 7840U.

* Within 15% at 25 W.


This is not true.

Maybe in light threads that utilize many cores.

Most reviewers base it on Cinebench, which is a poor indication of CPU performance for anything except Cinema 4D. Cinebench uses Intel's Embree engine, which is hand-optimized for x86. In addition, Cinebench favors CPUs with many slow cores, which is not how most software will perform. This is why AMD heavily marketed Cinebench for the Zen 1 launch and why Intel heavily markets it now for Alder Lake/Raptor Lake. In fact, Intel's little cores are basically designed to win at Cinebench.

Furthermore, AMD CPUs will be rated at 25 W but can easily boost up to 40+ W. It's up to the laptop maker.


You can easily limit the power to 25W. Most manufacturers typically have a silent mode which does exactly that.

Not sure what you mean by many slow cores, since mobile Zen 4 has better single-core performance than the M2 Pro.


Zen 4 desktop does, at the expense of much higher power consumption.

Zen 4 mobile does not have higher ST performance than the M2 series.

https://browser.geekbench.com/processors/amd-ryzen-7-pro-784...

https://browser.geekbench.com/macs/mac-mini-2023-12c-cpu

The 7840U's ST is 21% slower while consuming much more power during the test.


I don’t know where to begin… There is a lot of material on the internet that is relevant to answering that.

What do you mean, “how they let Apple do it”? Do you think Intel & AMD could stop them?


Well, in purely military terms, technically Intel and AMD are only a few miles from Apple and their engineering corps is likely far larger. They could all march over there with broadswords if they really wanted to.


The circular design of the HQ makes sense now.

https://www.reddit.com/r/castles/comments/4t5w0q/round_vs_sq...


Completely off-topic, but: I think the state of the art in castle design (pre modern explosives anyway) was a star/bastion[1], since that allowed defenders to have overlapping firezones, especially useful once an attacker reaches the walls. With a circular design like Apple's HQ, as attackers get closer to the walls fewer and fewer defensive positions can see them until you can only see them from right above.

1: https://en.wikipedia.org/wiki/Bastion_fort


In all likelihood Intel would attack from the middle of the circle...


Clearly the move is to put all AMD and Intel engineers on the inside of the circle. That way they would be visible from all locations on the ring at all times.


A 'reverse Trojan horse'? The defenders sneak the attackers in rather than the attackers trying to sneak in?


That sounds right.


I mean, how did Intel and AMD not see what Apple was creating?

PCs have been stuck at 3-4 GHz for more than 15 years, so it's not like they didn't have the time to optimize from a consumption/heat point of view.


It's kind of the opposite: Intel and AMD are burning power racing to 6 GHz while Apple targeted a more efficient 3-4 GHz.


Intel basically hit the clock speed limit and pivoted to multiple cores. However, they still make x86-based chips, not ARM. They owned an ARM license for a while and got rid of it. For whatever reason, Intel felt like putting all their money on x86 was their only option. For a while they were making Atom chips for mobile, but at some point that design was hobbled because Intel has always been about the 60%+ margins on server chips. You cannot sell the cheaper chips at the same margins. It's not that Intel couldn't technically figure stuff out, it's that they couldn't see past those 60% margins.

For a while Intel's process knowledge was supposed to be better, even if the design was less efficient, but that turned out to be a mirage around 10nm or so. Intel, now without a process advantage, is probably never going to regain its monopoly, and so far hasn't really transformed itself to do anything other than build those high-margin chips.

Once upon a time, I wanted to use one of the chips from a company they bought in networking, but Intel's model is to make the chip and let other companies make a product to take it to market. Intel doesn't want to make a market, just sell into it. You can see that with their attempt at TV where they stopped when they didn't want to spend money on content. So the chip I was interested in didn't get much R&D or a product and it more or less disappeared, another wasted investment.


It's a good name. Why didn't they call it Crater Lake tho?

https://en.wikipedia.org/wiki/Crater_Lake


In case this isn't a joke centered on expectations that this will crater, there are many Intel chipsets with "-Lake" codenames:

https://en.wikipedia.org/wiki/List_of_Intel_codenames


No, it wasn’t. I didn’t even think of that.

But your list of Intel codenames is fantastic. It’s a great resource. Lots of beautiful names in there. Still, I wonder why they haven’t gone with Crater Lake; it’s such a beautiful lake and right in their wheelhouse in terms of geography.


Mostly interested in the NPU. Most NPUs are pretty useless for cutting-edge AI.


Any word on how many TFLOPs that NPU could achieve on the high-end chips?


Intel is taking two pages from the Apple ARM book: smaller cores but bigger caches (for more performance and less power) and main memory on the chip (for more performance and less power).


>Meet Meteor Lake’s Tiles

Isn't a "Tile" basically a chiplet? Why not just call it a chiplet?


Best guess is that someone in marketing thinks calling them Tiles will make Intel look better because people won't realize they're just following behind AMD in this respect.


AFAIK this will be the first chip where multiple processes are combined into one die, at least for consumer devices. AMD's chiplets use separate dies from multiple process nodes all on one substrate, so maybe they don't want to confuse it with that.


It isn't on one die; Meteor Lake is 4 dies on a substrate.

It's probably a lawyer thing like uncertainty over who owns the noun 'chiplet'.


The interconnect between dies doesn't rely on the substrate though right?


The interposer is the only way the dies can communicate.


Not exactly. The tiles have different functions. AMD's chiplet approach glues homogeneous chips together and uses a center chip to control everything.


So a more advanced and feature rich version of Ryzen's IO die, with dedicated silicon for AI of course.

Can't wait for Microsoft and Intel to team together to make an ultra AI search bar that can finally find files properly like back in Windows 7...


I just want a search that shows what I'm looking for when I've typed the first three characters of the search term (as, e.g., the Windows start menu does now), but still shows that result when I type the 4th character before my brain processes the fact that the result is there (you know, since my responses aren't that fuckin fast) and all the results change up.
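
What's being asked for is basically monotonic narrowing: once a prefix has produced results, typing another character should only filter the list you're already looking at, never rebuild and reshuffle it. A tiny sketch of that behavior, using a made-up in-memory list of names:

  # Sketch of "stable" incremental search: each keystroke narrows the previous
  # result list, so existing hits never jump around while you're still typing.
  class IncrementalSearch:
      def __init__(self, names):
          self.query = ""
          self.results = list(names)

      def type_char(self, ch):
          self.query += ch
          self.results = [n for n in self.results
                          if self.query.lower() in n.lower()]
          return self.results

  s = IncrementalSearch(["notepad.exe", "notes.txt", "node_modules"])
  print(s.type_char("n"))  # all three match
  print(s.type_char("o"))  # still all three
  print(s.type_char("t"))  # node_modules drops out; the others stay in place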


I choose to believe this behavior somehow drives ad revenue to Microsoft and is not incompetence. Otherwise, why would they throw away the previously working behavior?


> search bar that can finally find files properly like back in Windows 7

I don’t think that quality is ever coming back. No matter what, they’re going to be connecting to Bing for the top results / ads, so you’ll always have a bunch of latency and will never get back to Win 7 levels of local-only performance.

It’s sad and the AI, which is mostly useless based on my experience, is going to suck up even more CPU cycles and add even more latency.

For me, it takes 5 seconds for the start search to respond on first use. My 12th gen i5 with NVMe storage and Win 11 literally runs worse than my 4th gen i7 with a first gen SSD and Win 7.

Microsoft has usurped a decade of computing gains and spent them on ads and tracking. Don’t expect anything that benefits the user in the near future.


There already exists a fast search tool released by Microsoft themselves, called PowerToys Run.

https://learn.microsoft.com/en-us/windows/powertoys/run


Nah, just go straight to Everything.


It's an embarrassment that sub-second feature-rich file search isn't built in to Windows.

Fortunately there's a truly excellent third-party utility that is probably the second thing I install on any new Windows install (after Chrome): https://www.voidtools.com/support/everything/


I think the Windows Shell Team (hey, we got RAR support recently) just withered on the vine when the grand idea of a queryable file system built on top of SQL Server in post-XP Windows, called "Cairo", collided with the memory/CPU limitations of the time.

My desktop now has 24 cores (8P/16E) and now is the right time to rethink the OS.


> now is the right time to rethink the OS.

Microsoft is definitely doing this, but they're putting all the effort into making it into an attention-stealing ad delivery platform.


Yep. I just bought the latest AMD hotness in laptop form. After giving the abomination known as Windows 11 a spin for a few days since it's installed by default, for the first time ever I'm running Linux as my daily driver and I couldn't be happier.


Does an OS really matter anymore?

For me it's just there to make sure my PC boots, gets regular patches for security issues, doesn't corrupt my file storage, and plays sound/video.

The bulk of my time is spent inside a JetBrains IDE/Visual Studio Code/Chrome; my interaction with the OS is just launching programs and hitting the shutdown button, and Windows 11 is just great at that.


I think it matters. W11 OOTB is a horrible ad- and telemetry-ridden mess. You have to spend ages ripping out/disabling that rubbish, and it can suddenly turn back on after an update.

For a work OS, I want sane OOTB defaults, or a "disable all of this" option in the initial setup. These settings should be respected and not overridden in an update.


Obviously I think it does. Operating systems and the windowing environment should help you be efficient and get the maximum use out of your machine. Windows no longer does this; it has now primarily become a vehicle to resell Microsoft services and nag you ad nauseam. Even ChromeOS has fewer ads nowadays.


They've improved support for E-cores on W11, though why not just have gotten a 7950X and avoided the whole mess...


AM5 boards are too expensive, and you need DDR5 memory (64 GB of DDR4 was half the price of DDR5 here).

I could carry nothing forward from my old PC.


>It's an embarrassment that sub-second feature-rich file search isn't built in to Windows.

It's not built in, but it exists:

https://learn.microsoft.com/en-us/windows/powertoys/run


"Everything" should be a standard on every Windows computer. I've found files that I thought completely lost to the ether, including actual Ethereum after I had lost my key deep in my file directories after an accidental drag and drop.


Every machine I get my hands on gets Search Everything and TeraCopy. I usually start new machines by installing some stuff through https://ninite.com because Windows still doesn't have a proper package manager.


Chocolatey and Microsoft's own winget


Often I think some of that stuff is strategically made to be just good enough to discourage competition and so it never actually becomes good enough to be mainstream.

Look at how WinGet was launched with just enough effort to kill AppGet. It was a big announcement that was the equivalent of “avoid this space or we’ll crush you” and then what? Nothing innovative has happened since they killed the innovator (AppGet).


What exactly was the innovation of AppGet, that wasn't already in Chocolatey?


Instead the AI will be made part of the unkillable core "security" services and actually be used to find ways to reroute Windows telemetry around DNS blockers, autoconnect to all smart appliances in the house and teach the dog to report on your most intimate habits.


In Win 11 I am unable to even find apps (properly installed via signed MSI) by typing their full name.

Searching for settings screens is also a pain in the ass, especially if you use a different language. MS recognizes only its own translation, not the most intuitive text, not the English text ... you just have to know.


In case folks don't know about Everything[0], it is so truly excellent.

[0] https://www.voidtools.com/


> ultra AI search bar

Or you could use literally any other search program that works wonderfully, without the indexing process using an eyebrow-raising amount of CPU? Including Microsoft's own shockingly fast file search in VSCode.

> feature rich version of Ryzen's IO die

The interconnect Intel is using is more expensive/sophisticated than AMD's (but less expensive than the TSVs for the X3D chips), so hopefully it's pretty good in laptops?

AMD's IO die setup burns tons of idle power, which is why the laptop parts are still monolithic.


FYI that VSCode search is powered by ripgrep.


Well MS should pull ripgrep into Windows, as it's indeed hilariously fast.


Windows 95 had a powerful search dialog that could search by file size/date ranges/content. Perhaps we could have that back, but with AI enhancement?


I would even settle for just 'back'.


At least a few times every week I have to pop open a cygwin window to run "find / -iname xxxxxxxxx"

Finding a local file with an already known name is nigh-impossible with the current Windows search.
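
For what it's worth, that particular hunt is also a few lines of portable Python if Cygwin isn't handy. A rough stand-in for a case-insensitive "find -iname" (the script name and usage line are just an example):

  # Rough equivalent of "find <root> -iname '*<pattern>*'", case-insensitive,
  # silently skipping directories we aren't allowed to read.
  import fnmatch
  import os
  import sys

  def ifind(root, pattern):
      pattern = pattern.lower()
      for dirpath, _dirnames, filenames in os.walk(root, onerror=lambda e: None):
          for name in filenames:
              if fnmatch.fnmatch(name.lower(), pattern):
                  yield os.path.join(dirpath, name)

  if __name__ == "__main__":
      # usage: python ifind.py C:\ "*report*"
      for hit in ifind(sys.argv[1], sys.argv[2]):
          print(hit)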


I think MS is beyond redeeming and there's little reason to stick with windows at this point.


I wish Intel would switch to a new code-name scheme. There's been enough "Lakes".


I kinda like it. Some are cool, like Ice Lake. Others quirky, like Coffee Lake.


At least with “Foo Lake” you immediately know it's an Intel CPU architecture. That's a valuable feature. No reason to burn any bridges. ;)


And the Foo Lake refresh would be Bar Lake?


Well you've got Lunar and Arrow on the horizon. Allegedly Nova and Panther after that, but that's just Tech News speculation. There doesn't appear to be any sign that they're ditching the naming convention.

I'm more partial to the FPGAs. Sundance Mesa is a cool codename.


Last week Intel confirmed that Panther Lake is the code name of their desktop CPUs that will be launched in 2025 and will be made using the Intel 18A CMOS process.

Therefore this is no longer speculation.

It can be assumed that these are the first CPUs that will implement the 256-bit subset of the AVX10.2 ISA, finally extending the coverage of AVX-512 to all Intel products, but with the restriction to 256-bit registers and operations.

Panther Lake will be preceded by Lunar Lake for low-power mobile devices, either in late 2024 or in early 2025.

Before that, there will be Arrow Lake S for desktops and Arrow Lake for laptops. Despite the single name, these 2 implement different instruction sets and they will be made with different manufacturing processes, Intel 20A for Arrow Lake and an undisclosed process for Arrow Lake S (which might be made at TSMC, according to rumors, because it needs bigger tiles than what would be possible in Intel 20A).

While Meteor Lake seems to be just a shrink of Raptor Lake, with much better performance only due to the greater energy efficiency provided by the Intel 4 process and the much better GPU made at TSMC, Arrow Lake is expected to introduce a new, improved CPU microarchitecture, which is supposed to compete successfully with Zen 5.


Or some semblance of order, like alphabetical, so you have some idea of timing / progression.


Amen. Most are incremental upgrades, so all the codenames seem superfluous.


>There's been enough "Lakes".

Yeah, they need to move to rivers now...


Just as long as they don't go chasing waterfalls, and stick to the above like they used to.


The article (and Intel) don't disclose how many cores the new architecture is designed to scale to, and I am certain Intel would say something like "With our P-, E-, and LP E-core architecture(tm), the core count doesn't matter anymore".

Also, the SOC with a built-in AI engine. Oh boy, I wonder how long it will take for AI-assisted malware or botnets to emerge. Exciting times!


It's just 6P + 8E + 2 LP E (ultra-efficient) cores, or fewer. Looks like it's primarily targeting laptops.


Sounds more like the targeting is "Apple Silicon is kicking our asses, and this is the best we could do".


Again, Intel's target market is very different.

They are using off-the-shelf cores that have to be good in everything from netbooks and industrial boxes to server workloads. Apple, meanwhile, is laser-targeting high-volume, premium, media-heavy laptop-ish TDPs and workloads. And they can afford to burn a ton of money on die area, a bleeding-edge low-power process, and target modest clock speeds like no one else can.


this is such a weak argument. just because it's not in a laptop does not mean that a CPU should be accepted as being a horrible waste of electricity. making datacenters as efficient as laptops would not be a bad thing. i'm sure people operating at the scale of AWS and other cloud providers would be beyond happy to see their power bills drop for no loss in performance. i'm guessing their stockholders would be pleased as well.


Datacenters are actually exactly as efficient as laptops.

They consume more only because they do not stay idle the way laptops do.

The CPU cores in the biggest server CPUs consume only 2.5 W to 3 W per core at maximum load, which is similar or less than what an Apple core consumes.

The big Apple cores are able to do more work per clock cycle, while having similar clock frequencies and power consumption to the server cores, but that is almost entirely due to their use of a newer manufacturing process (otherwise they would do more work while consuming proportionally more power).

The ability of the Apple CPU cores to do more work per clock cycle than anything else is very useful in laptops and smartphones, but it would be undesirable in server CPUs.

Server CPUs can do more work per clock cycle by just adding more cores. Increasing the work done per clock cycle in a single core, after a certain threshold, increases the area more than the performance, which diminishes the number of cores that could be used in a server CPU, diminishing the total performance per socket.

It is likely that the big Apple cores are too big for a server CPU, even if they may be optimal for their intended purpose, so without the advantage of a superior manufacturing process they might be less appropriate for a server CPU than cores like Neoverse N2 or Neoverse V2.

Obviously, Apple could have designed a core optimized for servers, but they do not have any reason to do such a thing, which is why the Nuvia team split from them; but they were not able to pursue their dream, and went back to designing mobile CPUs at Qualcomm.


> i'm sure people operating at the scale of AWS and other cloud providers would be beyond happy to see their power bills drop for no loss in performance

- The datacenter CPUs are not as bad as you'd think, as they operate at a fairly low clock compared to the obscenely clocked desktop/laptop CPUs. Tons of their power is burnt on IO and stuff other than the cores.

- Hence operating more Apple-like "lower power" nodes instead of fewer higher clocked nodes comes with more overhead from each node, negating much of the power saving.

- But also, beyond that point... they do not care. They are maximizing TCO and node density, not power efficiency, in spite of what they may publicly say. This goes double for the datacenter GPUs, which operate in hilariously inefficient 600W power bands.


It's all tradeoffs. Desktop users are happy to take 20% more performance for 2x the power draw, and they get the fastest processors in existence (at single thread) as a result.

Data centres want whatever gets them the most compute per dollar spent - if a GPU costs 20k you bet they want it running at max power, but if it's a 1k CPU then suddenly efficiency is more important.

It's all tradeoffs to get what you want.


Data center CPUs are already optimized for power, have huge die areas, and cost a lot, just like Apple silicon.



