Ryzen Z1's Tiny iGPU (chipsandcheese.com)
190 points by rbanffy 7 months ago | 111 comments



My pet gripe with AMD and Intel is that their APUs / gaming chips always over-index on CPU performance, and under-index on GPU performance.

I get it, it’s a CPU/SoC. But the end users care about FPS. Dedicate more of your die area to the GPU, and you’d crush the competition.

The Steam Deck and the PS5/Xbox chips are the sole exceptions; I wish those chips were available for consumers to buy.

Give me a quad core, or hexa core at most, with top of the line integrated graphics. The extra 4 cores are wasted die space for gamers.


The lead APU of each generation usually wouldn't get meaningfully faster if you just spent more silicon on the GPU instead of the CPU.

The fundamental problem here is that GPUs require more memory bandwidth than CPUs, but the platforms are designed around CPUs, which puts a pretty hard cap on APU GPU performance on a given platform. Generational GPU performance gains on APUs happen when the platforms update to faster memory.

The big break in this will be Strix Halo, coming from AMD either next year or late this year, as it will support 256-bit LPDDR5. Even using the fastest LPDDR5 available on the market, this will still just barely match the memory performance of Radeon 7600, the weakest current-gen discrete GPU AMD has on the market.
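
Back-of-the-envelope peak bandwidth math (the exact LPDDR5X speed Strix Halo ships with is an assumption here, and these are theoretical peaks):

    # Peak bandwidth (GB/s) = transfer rate (MT/s) * bus width (bits) / 8 / 1000
    def peak_gbs(mt_per_s, bus_bits):
        return mt_per_s * bus_bits / 8 / 1000

    print(peak_gbs(8533, 256))   # Strix Halo, 256-bit LPDDR5X-8533 (assumed): ~273 GB/s
    print(peak_gbs(18000, 128))  # Radeon RX 7600, 128-bit GDDR6 at 18 Gbps: 288 GB/s
    print(peak_gbs(6000, 128))   # today's dual-channel DDR5-6000 desktop: 96 GB/s

Which is why "just barely match the 7600" is about right.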


> The big break in this will be Strix Halo, coming from AMD either next year or late this year, as it will support 256-bit LPDDR5. Even using the fastest LPDDR5 available on the market, this will still just barely match the memory performance of Radeon 7600, the weakest current-gen discrete GPU AMD has on the market.

You can do better than that simply by putting some HBM or GDDR6 on the APU itself. Then you don't need to change the socket or have to add memory channels, and you don't need that much of it because you still have the ordinary memory slots to provide some DDR5 where the OS can keep all its dreck bloat. 8-16GB of the stuff would be enough to rival any of the current midrange discrete GPUs.


> this will still just barely match the memory performance of Radeon 7600

Wow, I had no idea integrated GPUs had come this far. This is fantastic news.


And even then, memory channel count is not really a marketable spec for OEMs, so not much is done to add more. Only enthusiast (and thus really expensive) laptops go with a large number of memory channels.


There aren't any enthusiast laptops with more than a 128bit memory bus for the CPU, and memory channels and bus width are the wrong metrics to compare discrete GPUs where both width and clock speed vary widely across the entire product line.


This isn't at all about discrete GPUs, where did you get that from? This is about APUs/iGPUs. In general, memory channels are expensive, so you won't see a new product with 8 channels running 3000MHz memory or something like that.

Also, Apple laptops do have large memory buses. The Apple M1 Max chip, for example, has a 512-bit memory bus.
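
For scale, a 512-bit bus at LPDDR5-6400 (my assumption on the speed Apple uses) works out to roughly the ~400 GB/s Apple quotes for the M1 Max:

    # 512 bits = 64 bytes per transfer, at 6400 MT/s
    print(64 * 6400 / 1000)  # ~409.6 GB/s, vs ~96 GB/s for a 128-bit DDR5-6000 desktop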


I assumed when you said "Only enthusiast laptops go with a large number of memory channels" you intended to refer to something beyond just the MacBook Pro, because "enthusiast laptop" is usually used to describe a category that definitely includes a lot of laptops that aren't MacBook Pros, and only arguably includes some MacBook Pros. And if you did mean to refer specifically and only to MacBook Pros, there were plenty of more straightforward ways for you to do so.


It's quite simple really. Macbook Pro M1 Max is in the set of enthusiast laptops. You said there are none. I gave that as a counterexample. I wasn't specifically thinking about it in my original comment. In fact I wasn't specifically thinking about even a bus width that I consider "large". It was a general comment that memory channels are not cheap to have in a piece of silicon and that consumers generally don't care about it as a metric of performance.


I don’t know if this goes for VRAM, but I know that with system RAM, applications care very little about data rate and very much about timings. This includes games.

I do know that overclocking VRAM (or at least GDDR6) barely increases performance, but that’s mostly because it’s usually already been pushed to the edge at stock. My old 3070Ti could easily overclock 10-15% higher, but the memory had to self-correct significantly more errors so the actual improvement was a wash.

I’m also curious if gaming handhelds would benefit from switching their entire memory architecture over to VRAM, like both consoles have done in this generation.


Timings are much less important for GPUs than bandwidth; the cores are slow and doing the same work on similar memory. A 3090 has ~270ns of memory latency - <70ns is normal for a CPU. See https://chipsandcheese.com/2021/05/13/gpu-memory-latencys-im...
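
One hedged way to see why that latency matters less on a GPU: by Little's law, the memory system just needs enough requests in flight to cover it, and GPUs keep thousands of threads' worth of loads outstanding (rough figures from this thread, not exact specs):

    # Little's law: bytes in flight = bandwidth * latency
    bandwidth_bytes_per_s = 800e9   # ~800 GB/s class GPU
    latency_s = 270e-9              # ~270 ns memory latency
    print(bandwidth_bytes_per_s * latency_s)  # ~216 KB must be outstanding to saturate the bus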


Maybe I’m misunderstanding the article, but there are multiple places where they mention memory latency being the main bottleneck. At any rate, thanks for sharing! Interesting stuff.


No, it’s really bandwidth. Big GPUs have on the order of 800 GB/s memory bandwidth and big CPUs only 80 GB/s! Except for Apple. They really need to up the bandwidth.

GPU performance is proportional to both bandwidth and compute.
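
A rough roofline sketch of that proportionality (illustrative numbers, not any particular chip):

    # Roofline: attainable TFLOPS = min(peak compute, bandwidth * arithmetic intensity)
    def attainable_tflops(peak_tflops, bw_gbs, flops_per_byte):
        return min(peak_tflops, bw_gbs * flops_per_byte / 1000)

    print(attainable_tflops(30, 800, 10))  # GPU-class memory: bandwidth-bound at 8 TFLOPS
    print(attainable_tflops(30, 80, 10))   # same compute fed by CPU-class memory: 0.8 TFLOPS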


How big is the market? Anyone even a little serious about gaming will get a dedicated GPU, and iGPUs can't beat it no matter how much space they use. It will have a weaker CPU, which will hurt in many tasks. x86 isn't big in the mobile form factor.


If you look at the home server market right now, it's getting larger and larger. With APUs, it could be a great bargain for a self-hosted PC console with decent Steam Deck-like performance. Linux is getting spoiled with Steam compatibility. I have one of those iGPUs; it idles around 15W and its performance is surprisingly good for an old APU, at least on low settings for AAA games. You can check YouTube for some comparisons.


Home servers need not be very small. If gaming is the primary objective, you can add a 3060/70 or similar, or use an old gaming laptop; otherwise, like you said, current iGPUs also serve the purpose. OP's idea is to sacrifice CPU cores for the iGPU, i.e. a home server with a weak CPU but a powerful iGPU, and that is a small niche.


> …homeserver market, it's getting larger and larger.

Is that true? How are you measuring it?


Proxy KPIs, like how many tech YouTubers are pivoting to "check out my home server, you guys", and the popularity of home server desktop hardware (on the shopping sites I follow).


There are somewhat opposing goals between home servers (e.g. movies, many drives, low power consumption, stowed away) and gaming console-like pcs.

A mini pc can be a good compromise but not the best at either.


FWIW, I have a desktop with an RTX 2070 and a handheld (GPD Win Mini) with a 7840U, which is essentially the same chip as the Z1 Extreme. I do nearly all of my gaming on the handheld these days. The desktop is way more powerful, but the handheld is powerful enough, and way more convenient.


I have a GPD 4 and an older WinMax 2 (2022), and both run FS2020 fine. Hence, unless I need to do AI (which needs Nvidia), it's all good for low-resolution games.


> Anyone even little serious about gaming will get a dedicated GPU

This is just not true—I do the vast majority of my gaming on my laptop and only switch to the dedicated GPU for super graphics-intensive games like Cities: Skylines.

As usual, this attitude conflates "FPS-obsessed entitled divas" with "gamers"


It's not clear to me that discrete GPUs are inherently faster.

Historically yes. You got 16gb or whatever on the card near the compute core. So what if you put 16gb of the same memory on the package with the APU instead?

Shout out to PCIe for unhelpful latency characteristics and barely adequate atomics support as my primary objection to discrete GPUs.

It's also plausible that games will become less demanding of the state of the art in GPUs. If a PlayStation APU can play them beautifully, maybe a higher-spec PC one will manage the same despite the OS and library overhead.

I think dedicated GPUs are probably on their way out.


I think the handhelds prove there's a market for mobile gaming. Thin and light 13" laptops would be a step up from those handhelds.

I would be interested also because AMD has better support in Linux and their mobile GPUs are hard to find in laptops.


There is a market for mobile gaming, but machines like Razer Blade 14 and ASUS G14 are successfully cornering that market with powerful discrete GPUs. There's no big reason to manufacture iGPUs when a separate chip is much more performant and easier to cool.

There's not that much demand for a mediocre-to-poor GPU that's more powerful than iGPUs of current era.


Most gamers game on mediocre to poor GPUs, which are still more powerful than the current iGPUs. My assumption is that such APU would be cheaper than the equivalent CPU + discrete GPU and there's a large demand for budget gaming (e.g. students).

Ultimately we will see if/when Strix Halo gets released, how it will be priced, and how successful it is. I would likely buy a laptop with it.


I usually buy game consoles instead of fussing around with components, but I’d be incredibly tempted with something as convenient as that.


You can put together an ITX / SFF system with:

- AM4 motherboard

- 5800X3D

- Aftermarket low profile CPU cooler

- 3600MHz 16GB RAM

- RTX 4060 Low Profile

- 2TB NVMe SSD

And it’d be smaller and much more powerful than a PS5. If you get the parts secondhand, it wouldn’t be that much more expensive either, especially once you factor in PS+ and how cheap Steam games are.

The only hairs in the soup are SteamOS and Nvidia drivers.

SteamOS doesn’t yet have a standalone install. You can work around that by installing ChimeraOS and having it launch Steam Big Picture on passwordless boot.

Nvidia currently has relatively poor open source drivers, but NVK is rapidly improving, with probably good performance in a year or so. Your other option is hoping AMD or a board partner releases a low profile card, but I don’t see that happening. Or the dark GPU horse that is Intel.

You could also go for a larger case, which would allow you to fit a bigger GPU. Something like an AMD 6800 XT would deliver a 120FPS experience at 1440P and high/ultra settings, and do so quietly with an undervolt.


You can of course use NVidia closed source drivers.


You can also, just use Windows.


Do you own a Steamdeck?


AMD got burned when they did it the other way around in the Kaveri (2014) era, when they bet on unified memory (transparently passing pointers between GPU/CPU with unified virtual memory, etc.) and hoped that software would start to utilize GPUs more, with OpenCL and other vendor-neutral APIs providing portability.


It already exists and it’s the “Ryzen Z1 Extreme”.

The Ryzen Z1 Extreme uses AMD’s high-end Zen 4 APU configuration, with eight Zen 4 cores and six RDNA 3 WGPs.

The ROG Ally uses it and you can dedicate 50% of the memory to the GPU and it performs quite well.


To be fair, I think the complaint of wanting more GPU and less CPU is reasonable, even in the context of the Z1 Extreme. Two fewer CPU cores in exchange for more GPU cores, cache, and/or memory bandwidth would make it a more gaming-oriented chip.

The RX 7600, AMD's lowest-end current-gen graphics card, has several times more CUs and memory bandwidth, so I think there's some room for improvement without undercutting their dGPU business.
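
Roughly, using commonly cited figures (so treat the exact numbers as approximate):

    # RX 7600 vs Z1 Extreme as shipped in the ROG Ally (128-bit LPDDR5-6400)
    rx7600 = {"CUs": 32, "bw_GBs": 288}               # 128-bit GDDR6 at 18 Gbps
    z1x    = {"CUs": 12, "bw_GBs": 6400 * 16 / 1000}  # ~102 GB/s
    print(rx7600["CUs"] / z1x["CUs"], rx7600["bw_GBs"] / z1x["bw_GBs"])  # ~2.7x the CUs, ~2.8x the bandwidth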

The Z1 Extreme is essentially the same chip as the 7840U, AMD's highest tier U-series (15~30W) chip. In other words, it's a general-purpose design, not specifically gaming oriented.


This is done on purpose. They've put so much effort into bullying casual gamers into splurging $300+ on a GPU. Why release a $150 SoC that's good enough for 95% of people? Free market, my ***!


As a potential counterpoint — this is exactly how the (baseline) PS4/Xbox One APUs were, and it turned out to be an extremely wrong bet!


Modern consoles don't only run games; by design they have to support background tasks. IIRC both Xbox and PlayStation now use some form of virtualization to run games, while the "hypervisor" also handles other tasks such as capturing highlights, etc. While there are GPU-accelerated workloads for some of it, as I understand it the CPU is still used for these scenarios.

So it's better to think of only part of the CPU as dedicated to running the game, which means you'd need more cores than otherwise.


I never really got the point of the Z1 non-extreme. If it enabled them to ship a device for $400 or less then it would make some sense as a Steam Deck competitor, but not when the Z1 ally was $600 and you could double the performance by going up to $700. (I know it's gone on sale for ~$450, but IMO that should have been the launch price, with sales putting it below $400.)

Honestly, the whole Z1 lineup puzzles me a bit, since it feels like just a re-badge of the regular mobile chips with some slight tuning. If AMD was going to go through the effort of making a gaming focused chip, they should lean into that with more emphasis on the GPU side, and perhaps some extra cache or more memory bandwidth. Maybe even drop a couple of CPU cores. What we got instead just feels like a cheap marketing ploy.


I'm not sure there was any special tuning or binning involved. I think Asus just bought special branding without having any influence on AMD's silicon roadmap.


Asus uses custom firmware for more performance consistency, but yeah the silicon is the same AFAIK.


I doubt their custom firmware is any more custom than any other mobile device. Every product needs custom fan curves and power and temperature limits.


Lenovo got the same chip (Z1 Extreme).


All of this is weird to me personally. I think it is weird that the 7840U (the Z1X laptop model #) exists in the first place. It has a wildly overpowered GPU for a mobile chip of its type, and while they quote a 28W TDP, the actual max power draw is stratospheric for a U series (well over 50 watts). To be honest, I wouldn't be shocked to find out the Z1X internally came, as a concept at least, before the 7840U. Meanwhile, the Z1 is so wildly weak in comparison. It is only 1/3 the TFLOPS.


Someone else already commented, but as a user of a Framework 13 with the 7840/780m, the iGPU is definitely not overpowered.

The issue is that iGPU performance has been hardly changing for probably close to a decade, meanwhile dGPU performance has significantly risen (despite using older process/nodes at some times). What we’re seeing now is a correction. The meteor lake iGPUs apparently trade blows/are very similar to the 780m.

More specific to the 780m, its performance is apparently close to a gtx1060… a midrange gpu from 2016, about 7 years ago. It’s very nice to run a decade old game like gta5 at high, but modern titles like genshin can’t even hit 30fps on medium at 1x scaling. Also, not to mention that DDR is much slower than GDDR.

So: is the gpu better than previous gens? Definitely. Is it overpowered? No, previous gens have been underpowered.


Intel GPU performance was totally stagnant from Haswell (Iris Pro 5200) until Iris Xe, from 2013-2019. AMD on the other hand was steadily building up their iGPUs, allowing the APUs to work together with external GPUs, and with the 5700G (which IMHO was a milestone) we finally had 30fps+ for all but the most brain-dead games (Alan Wake, I'm looking at you!)


Yep, Intel stagnated hard. I’d say it serves them right, they didn’t care to innovate when they were leading and now they’re getting served.


> Intel GPU performance was totally stagnant from Haswell (Iris Pro 5200) until Iris Xe, from 2013-2019. AMD on the other hand was steadily building up their iGPUs

This is what I mean by the unconscious pro-AMD bias that people regularly engage in. Not only is that not true at all (the Iris Pro 6200 and Iris Pro 580 are both significantly faster), but Crystal Well is actually a very interesting/prescient design in hindsight; it's the type of "playing with multi-chip modules and advanced packaging" thing people get super excited about when it's AMD.

https://www.anandtech.com/show/6993/intel-iris-pro-5200-grap...

https://www.anandtech.com/show/9320/intel-broadwell-review-i...

https://www.anandtech.com/show/10281/intel-adds-crystal-well...

https://www.anandtech.com/show/10343/the-intel-skull-canyon-...

https://www.anandtech.com/show/10361/intel-announces-xeon-e3...

> allowing the APUs to work together with external GPUs

nobody does that because it sucks. best-case scenario is when your dGPU is roughly similar to your iGPU performance so you get "normal" SLI scaling... ie like 50% improvement. And in any scenario where your dGPU isn't utter trash, it completely ruins framepacing/latency etc.

> and with the 5700g (which imho was a milestone) we finally had 30fps+ for all but the most brain-dead games (alan wake i'm looking at you!)

that's actually because the 5700G is still using the 2017-era vega design, which was not that advanced technologically when it was introduced. AMD dropped support a while ago but the writing was on the wall for a long time before that.

AMD should have moved 5000 series APUs to at least RDNA1 if not an early RDNA2. The feature deficit is significant, a 2017-era architecture being sold in 2023 and 2024 (already unsupported) is just not competent to handle the basic DX12 technologies involved.

Alan Wake didn't do anything wrong, AMD cheaped out on re-using a block and it doesn't have the features. Doesn't have HDMI 2.1 or 10-bit or HDMI VRR support either... and the media block is antiquated.

and if you'll remember all the way back to the heady days of 2017... AMD bet on Primitive Shaders and could never get that to work right on Vega. PS5 actually has a primitive-to-mesh translation engine that does work. RDNA1 still does better than Vega but RDNA2 is where AMD moved into feature parity with some fairly important architectural stuff that NVIDIA introduced in 2016 and 2018.

https://www-4gamer-net.translate.goog/games/660/G066019/2023...

Again, people love to jerk about Vega (gosh it scales down so well!) but honestly Vega is a perfect Brutalist symbol of the decline and decay of Radeon in that era. There is no question that RDNA1 and RDNA2 are incredibly, vastly better architectures with much better DX12 feature support, much better IO and codecs and encoders, etc. Vega kinda fucking sucks and it's absurd that people still simp for it.

The "HBM means you don't have to care about VRAM size!" and other insanely, blatantly false technical marketing just sealed the deal.

But when AMD takes a rake to the face with Vega or Fury X or RDNA3, it's "a learning experience" and maybe actually just evidence of how far ahead they are...


The 7840U is the same die as the 7940HS. AMD only did two mobile processor dies this generation; Phoenix 2 is in the Z1 and the larger Phoenix is in the Z1 Extreme, 7840U and 7940HS, among others. So they're doing a lot of product differentiation solely through adjusting power and clock limits, which is confounded by the leeway OEMs have to further adjust those limits.


I'm so so curious how much binning there is on modern chips, or whether it's 95%+ just a question of what settings are going to work for a given form factor.

Most of the various controls are sitting right there in Linux (or other OS) for power control of CPU & GPU. If cooled, could we just crank a Z1 Extreme up to a 100W core and have it be like a 7940HS? Or is there really some power binning differences?

I don't know what AMD charges for their cores. With Intel, there's been a decade of the MSRP of ~$279 for a chip, but the chips coming in a variety of different sizes across the power budgets; you'd pay the same for a tiny ultra-mobile core as you would for a desktop core. What we have now makes that look semi-ridiculous. It's the same chip. Different power budgets.


> If cooled, could we just crank a Z1 Extreme up to a 100W core and have it be like a 7940HS?

I think it shares the die with the 8700G which runs at 65W.


These numbers are broadly lies now. Many chips regularly exceed their TDP. Here's Tomshardware's review showing an 81W power consumption for the chip for a cpu only load. https://www.tomshardware.com/pc-components/cpus/amd-ryzen-7-...

And that only begins to dig in. I was asking about overclocking, which is going beyond the base TDPs. Here's the 8700G's 780M gpu hitting 156 watts on overclock, for the same die: https://www.tomshardware.com/pc-components/gpus/amds-radeon-...

I highly, highly, highly encourage blowing away any thinking that if a chip says its TDP is X watts, that's what we'd get. This chip has been seen in the wild drinking vastly, vastly more power if given the chance & settings to. I think my question still stands: is there any binning or real difference that would keep a Z1 Extreme from doing similarly? Or are we just bound by how much power we can put in and how much heat we can take away from the Z1 Extremes out there?


> I think it is weird that the 7840u (Z1X laptop model #) exists in the first place. It has a wildly overpowered GPU for a mobile chip of it's type...

I have a handheld with a 7840U (GPD Win Mini), and I love it. I do use it mostly for games, and I suppose if it were labeled a Z1 Extreme instead of a 7840U, I'd be just as happy with it. So I can somewhat see where you're coming from. But also I think it's becoming more common to want to run "real" (non-gaming) workloads that can leverage a GPU on devices without a discrete GPU, so I still think it makes sense as a general-purpose part. (Also, I think the Z1 was an Asus-exclusive part, at least initially, so if there wasn't a non-exclusive variant, then I'd be stuck with something inferior.)

> ...and while they quote a 28W TDP, the actual max-power draw is stratospheric for a U series (well over 50 watts).

The ideal TDP for that chip is around 18W, with diminishing returns after that. (I usually run mine at 7-13W depending on the game.) Beyond 25-30W, you get only marginal performance gains relative to the amount of additional power, so while it technically can use over 50 watts, it's clearly not designed for that and the extra performance isn't worth it when you're running on battery.


Not sure it's wildly overpowered. It's in the Framework 13's AMD lineup and works quite well.


Special tuning today, custom silicon tomorrow. Maybe this is the MVP of handheld PC gaming.


I wonder why they do not make more memory channels to get more performance out of the iGPU. 3 or 4 channels would be better than two. They could even allow segregation of traffic or memory QoS.


There's a rumor that AMD is going to do that next year for their high-end mobile chip: double the memory bus size to 256 bit, and bump the iGPU CUs from 12 to 40!

See https://www.youtube.com/watch?v=ekCMnmD_EzA


How about making the 3D cache from the X3D SKUs available for these iGPUs?


I really wonder why AMD is so reluctant to release desktop cpus with a powerful iGPU.

Yeah, I know the 8x00G exists, but it's kinda too little too late.


Others have pointed out things like memory bandwidth, but I think there might be another thing: would people want it?

Let's say that the CPU + powerful iGPU cost 95% of what a discrete CPU and GPU cost - but now you can't buy them separately, can't upgrade them separately, etc. You're less likely to get the mix of CPU and GPU that you're looking for since you can't select them independently. Why not just package the RAM with the CPU too? Apple's done that, but I think most people don't love that because it means they can't upgrade their RAM independently.

It also places constraints on how good something could be. Let's say that you produce new GPUs every 18 months and new CPUs every 12 months. Well, now you need to synchronize them. If the new CPU is ready to go, but the new GPU is 3 or 6 or 9 months out, what should your product releases be?

By having them separate, someone can buy the latest AMD CPU even though the next-gen GPU is 6 months out. When the next GPU comes out, they can buy that and upgrade the graphics and CPU on different cycles. Syncing up different product cycles isn't always easy.

I think the reason why is that they don't think there's likely a market. With things like a PlayStation or Xbox, it's going to (pretty much) have one set of capabilities for its 7-year lifecycle. You can integrate the CPU and GPU because there's only one buyer and because the CPU and GPU release have to be synced anyway for the console's release. With PCs, the release doesn't have to be synced and there are many buyers with different priorities.


> Let's say that the CPU + powerful iGPU cost 95% of what a discrete CPU and GPU cost

The main advantage of APUs would be the costs (theoretically). If it ends up being 95% of the cost of a more traditional architecture, what would be the point?


I think the 8xxxG parts are 65 W, which means you have a better chance of being able to listen to a game even if you play without headphones. Also you'll annoy whoever else lives in that house less with the vacuum cleaner noise.


I'm not sure I'd be willing to lose the ability of upgrading and the flexibility of tailoring my PC just to have a more silent PC.


It’s your hearing and concentration :)


Having an iGPU doesn't prevent you from installing a graphics card.


And then you'd pay much more because you're getting a powerful iGPU that you'll likely not use.


Sure, but it's totally viable that some people on a budget would buy a CPU with an iGPU and then upgrade later.


I’ve wondered this too. I would imagine that there’s a significant market for PCs with iGPUs roughly on par with those of consoles… that’s enough horsepower to play all esports titles extremely well as well as most other types of games on medium settings, which is more than good enough for a lot of people.

To work around memory issues, these CPUs would need some onboard memory which would increase costs a bit, but the tradeoff is that it’d make for simpler, cheaper low-end motherboards. One can imagine a mini-ITX board with nothing but a CPU socket and a couple of M.2 slots that’d cost significantly less than current entry-level ITX boards. A full system upgrade could be performed by simply swapping out the CPU which would be great for non-enthusiasts; without a power hungry discrete GPU, power requirements are unlikely to increase meaningfully (and in fact are likely to decrease with upgrades), so upgrading wouldn’t necessitate a PSU change. These hypothetical boxes could easily stay relevant for a decade or more.

Higher end SKUs of motherboards for this type of CPU could have the usual RAM slots (acting as a second tier of slower RAM in place of swap), PCI slots, etc.


There just isn't enough memory bandwidth. Afaik dual-channel DDR5 still can't hit 100GB/s? Meanwhile the $270 RX 7600 has 288GB/s on-board memory.


jwells89 says "onboard memory", by which I assume they mean on-package memory like the Apple Mx. The M2 Ultra has 800 GB/s of memory bandwidth...


DDR5-6000 is 60GB/s per channel.


It's 48GB/s, scaling up from -4800 being 38.4GB/s, according to https://www.servethehome.com/guide-ddr-ddr2-ddr3-ddr4-and-dd.... Pretty sweet but yeah we kind of need a big jump to get near GPU grade.

Intel's Lunar Lake mobile due out this year is supposedly using 16GB or 32GB on-package LPDDR5X RAM. Rumor has it it's 8533MHz. That'd be 68.2GB/s. https://www.tomshardware.com/tech-industry/manufacturing/int...
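
All of these per-channel figures fall out of the same formula, peak GB/s = MT/s x (bus bits / 8); the 60 vs 48 disagreement above is presumably just whether you count the 64 data bits or an 80-bit path including ECC bits (a simplification, but it matches the numbers quoted in this thread):

    # Peak bandwidth per 64-bit channel: MT/s * 8 bytes
    print(4800 * 8 / 1000)   # DDR5-4800: 38.4 GB/s
    print(6000 * 8 / 1000)   # DDR5-6000: 48 GB/s
    print(8533 * 8 / 1000)   # LPDDR5X-8533: ~68.3 GB/s per 64 bits
    print(6000 * 10 / 1000)  # 60 GB/s only if you count 80 bits (10 bytes) per channel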

Still feels like we need more RAM bandwidth. In general I think throughput is the new moat, the new market segmentation; consumer chips now have gobs of CPU and GPU (albeit we seem to have plateaued), but limited PCIe and RAM bandwidth. We kind of started seeing USB4 compelling some more bandwidth, but I think even Intel is no longer offering USB4 on chip in many upcoming mobile chips, so that's kind of been defeated too.

Anyhow, AMD's next Strix Point is due end of year or thereafter, but then their big one, Strix Halo, has quad channels. That'll be exciting as heck. Folks may finally get their console crushers, perhaps.


Oh, the 60 was me trusting Wikipedia, which I think must have been counting the 10 bits before the on-die ECC vs. the 8 after.


I am this close to replacing my desktop (AMD 5500(?) + AMD 7600) with a 7840U (probably in the form of a Minisforum model). Day to day, I do some web browsing and Python programming. It seems silly to have this huge-footprint machine sucking down probably 50-100W constantly when I could have a book-sized machine that probably idles around 10W, with a minor performance difference in 99% of cases.


M2 mac mini on my desk uses 9.7 W right now at "idle". Quotes because I vnced into it to see the power consumption.

Under the desk there's a linux box made with a Ryzen 5600G, that one does 20W at idle.


I can see the appeal.

My daily driver is an M1 Max MBP which I’m happy with, but if I were to build a Linux productivity box (no gaming, that’s handled by a different machine), something like a 7840U on an ITX board with a cooler just big enough to practically never be audible would sound pretty great.


All Ryzen 7000 parts have a small iGPU. It is not meant for gaming but is enough for productive work. You can get 12 cores at 65W with the 7900. It could be passively cooled for complete silence.

[edit] The 7900 delivers ~89% of the performance of the 7900X at 170W. The 7700 & 7600 deliver ~90% compared to their X counterparts.


I'm the exact same, but moving from a desktop (which I've always had) to a laptop such as the Framework laptop. I'd just like 10-20% better graphics performance than what's available without losing the 13" form factor.


An APU with a decent amount of fast on-package memory is going to be expensive. $1000 maybe? It'd be odd to pair a $1000 CPU with a $100 mobo. Maybe it's a good idea, but the market would find it confusing.


On-die memory increases costs significantly. Any sizable amount of on-die memory would be a premium-only option.


Even for chiplets?

I'd expect that you could probably get okay yields at okay costs if you are running a process that's a rev or two behind and making smaller chiplets that are then wired together after testing -- like the Pentium Pro's cache, but for main memory, to get 2/4/8GB of RAM all on "chip".

It'd probably cost more than a normal CPU, but the trade off is much more speed and the actual computer / motherboard at that point would just be a couple USB devices. You (the CPU maker) would grab a lot more of the per-unit profit.

Ah -- that's why they don't do it; no vendor would want their milkshake drunk...


AMD does seem to be trying new stuff with Strix Point Halo, but in-package RAM seems like it'd add new physical challenges that probably need a defined market before they swing for the fences?

Because it's not just the memory chip, but also the interposer it's stacked on top of, which now needs to be bigger, which means you need to find more room on the PCB (today) or a bigger silicon interposer (likely in the near future), which reduces yields, and so on and so forth. If you wanted to have more than the very limited SoC RAM, you'd also then be looking at having multiple DRAM controllers, which also adds to surface area, and so on and so forth.


Such a simple motherboard could even just put the power supply onboard, with a barrel plug or something, similar to how routers like this are configured:

https://www.mini-box.com/Alix-APU-Systems


So much peripheral functionality has been pushed inside the CPU that a modern desktop motherboard is from one point of view essentially a fancy DC-DC converter that happens to also handle I/O port breakout. The motherboard has to translate 12 volts at 10-ish amps from the PSU, into a highly stable 1.2-ish volts at 100-ish amps for the CPU power rail.
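
A quick sanity check on that conversion (nominal numbers, ignoring VRM losses):

    # Power in ~= power out, so current scales with the voltage ratio
    p_cpu = 120.0        # example CPU power draw in watts
    print(p_cpu / 12.0)  # from the 12 V rail: 10 A
    print(p_cpu / 1.2)   # delivered at ~1.2 V: 100 A -- hence all the VRM phases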


Many motherboards have 12V or 19V DC input barrels, I think thin-itx requires it.

Generally the keyword you want to search for is “industrial motherboard”, some server motherboards have them too. Asrock Rack and Asrock Industrial and Supermicro are great.

https://www.asrockrack.com/general/productdetail.asp?Model=A...

https://www.asrockind.com/en-gb/IMB-X1231

https://www.asrock.com/mb/AMD/X300TM-ITX/index.asp

(I fucking love asrock rack’s design team, absolute madlads and some seriously impressive density etc. they’re out there making the designs people don’t know they want, romed8-2T and genoad8x-2T are fantastic.)

Mini-box themselves sell dc-dc converters and “picoPSUs” that do this as well, although idk if they sell one with pcie power plugs. The M350 is a surprisingly high quality case for the price, and while my first picoPSU failed almost immediately (first shutdown iirc) they warrantied it no problem. A very funny trip through some oracle branded ordering/invoicing framework. Good people, good products.

My one criticism is that there is very obviously a lot of psu noise. I had a dell laptop and my pc desktop and the picopsu on an audio push-button switch and I kept getting a ton of ground loop and couldn’t figure it out etc and finally noticed it stopped and then came back as I plugged the M350 from the audio switch. Iirc it also showed through usb dac as well. I had the DIN plug brick and I think it has a lot of noise.

Intel NUCs are also extremely high quality implementations and have a great aftermarket with brands like akasa and hdplex etc.


"Some onboard memory" for a socketed CPU is impractical. They're not going to make a die with a wide DRAM controller for on-package memory and then a second narrower DRAM controller for memory slots, especially if the latter was only going to be used for high"end systems. It'll be soldered CPU and DRAM or the usual sockets/slots; I don't see how a mixed approach would make economic sense.


On-package memory is coming to x86 laptop CPU's, it's a question of when rather than if, IMO. Apple MacBooks are killing x86 laptops, and a large part of the reason is the on-package memory. I'm sure Dell et al are screaming at Intel and AMD asking for competitive chips.

You're right that they likely won't make 2 different dies. Desktop AM5 chips will just get a package with some of the memory controller pins unconnected. The big question is whether they'll also package the full width laptop chip with on-package memory in a package for desktop that's incompatible with AM5.

If they don't, somebody is going to solder that monster laptop chip into an ITX motherboard. People will grumble about a motherboard that can't upgrade either the CPU or the memory, but if the performance is there they'll still buy it.


Intel already did on-package memory last year in a custom part for Asus that ran the memory at a higher clock speed than any other Raptor Lake part. It didn't accomplish much because the memory interface was still just 128 bits, it shipped very late, and now there are systems with Intel or AMD processors running at the same memory frequencies using memory that's off package and merely soldered on the motherboard next to the CPU as is traditional for LPDDR.

I don't think putting the memory in the package helps much with practically achievable clock speed or bus widths compared to just soldering the memory nearby. (Consoles and discrete GPUs aren't doing on-package memory despite running GDDR at significantly higher frequencies than LPDDR.) And given that, there's even less reason to expect a messy hybrid configuration with half the memory controller connected to on-package LPDDR5x and half routed to DIMM slots, whether or not it uses the AM5 socket.


I wasn't suggesting hybrid, or at least didn't mean to. Memory with 2 different speeds is not well supported by the OS unless you dedicate one chunk to the GPU and the other to the CPU. So I don't think it's a good idea for AMD or Intel to go there.

What I would suggest (and my suggestions are worth about as much as I'm getting paid to write this):

AMD should create an I/O die (IOD) with a 512-bit DRAM memory width. Then create an AM5 variant with 128 of those bits connected to the socket and the other 384 tied off. Then create variants with 256 bits and 512 bits connected to on-package memory and no off-package memory pins. Sell those to laptop manufacturers.

A big motivation on laptops is that slow & wide memory uses a lot less power than fast & narrow; on-package also uses a lot less power than driving the signal between chips.

Then in early 2026 introduce AM6 with 4 channel 256 bit memory support. Sell AM5 & AM6 in parallel for a while.


Their existing chiplet approach is already horrible for laptops. Making the IO die bigger and more power hungry is not an option unless they switch to much more expensive packaging to get interconnect power down to reasonable levels.


> They're not going to make a die with a wide DRAM controller for on-package memory and then a second narrower DRAM controller for memory slots

Why not? They already do the equivalent for SRAM. The big cost of a wider memory bus is routing it through the socket and the system board, which you're not doing since only the narrower bus goes through there. The wider bus is solely within the APU package.

You could also take advantage of the additional channels -- have e.g. 8 memory channels within the package and two more routed through the socket, for a total of 10. Now if you have 8GB on the package and 16GB off of it, you have 10GB striped across all 10 channels and another 14GB striped across two.
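
A sketch of that striping arithmetic (a simplification; real interleaving policies are more involved):

    # 8 on-package channels, 8 GB total (1 GB each); 2 external channels, 16 GB total (8 GB each)
    on_pkg = [1] * 8
    ext    = [8] * 2
    per_channel_min = min(on_pkg + ext)                   # 1 GB
    wide   = per_channel_min * (len(on_pkg) + len(ext))   # 10 GB striped across all 10 channels
    narrow = sum(ext) - per_channel_min * len(ext)        # remaining 14 GB across the 2 external channels
    print(wide, narrow)  # 10 14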

Continuing to have memory slots also allows you to sell chips with and without on-package DRAM and use the same socket. High bandwidth memory only makes much sense if you have a strong iGPU, since CPUs are rarely memory bandwidth bound. But low end systems and high end systems wouldn't have that, only mid-tier ones would. The low end system has a small iGPU where HBM is both unnecessary and too expensive. The high end system has the same small iGPU, or none at all, because it uses a big discrete GPU.

HBM is also more expensive than ordinary memory, but Windows will sit there eating several GB of RAM while doing nothing, so having some HBM and some DDR should lower costs compared with having the combined total in HBM.


The HBM would have its bandwidth reduced significantly to stay in sync with the system memory. That's ignoring how the timings for everything would be ruined, which is a pretty important and hard-to-manage thing for RAM.


These are two different design alternatives.

You use HBM or GDDR if you want very high bandwidth, like top-end discrete GPUs would get. But then the memory itself is more expensive and you want the external channels to reduce cost, so the OS bloat can go in the cheap memory and preserve the limited amount of high cost memory for what needs it.

This is notably not what Apple does -- they're just using ordinary LPDDR5 with a wide bus, equivalent to having a lot of memory channels. It gets them several hundred GB/s worth of bandwidth, similar to a midrange discrete GPU. If you were going to do that, you could put most of the channels within the package and still have two of them outside of it.

That sort of configuration would allow some flexibility. The on-package memory might have lower latency (if they're both just ordinary DDR this isn't going to be much difference if any), but if you configured the system to only interleave between the on-package memory channels then the "close" memory could achieve that lower latency. Interleaving the external channels into the same pool would have a small latency hit but increase bandwidth by e.g. 25%. Which could be configured in UEFI based on your expected workload.


The G chips are also pretty seriously crippled by the cache reduction from 32MB to 16MB. It hurts their compute performance so much that they behave similarly to equivalent chips from a few generations before. I suspect it's done to hit a power target, but it's an unfortunate trade-off that makes these chips a bit of a letdown.


If I had to guess:

1) physical space. There isn't a ton of leftover room once you've added compute and IO.

2) segmentation. A sufficiently powerful iGPU would cannibalize sales of AMD's own discrete GPUs.

3) board partners. A sufficiently powerful iGPU would rob AMD's GPU board partners of sales.

4) heat. Keeping the two hot things apart from each other makes both of them easier to cool.

5) memory bandwidth. Even the dinky iGPU in my Ryzen 2200G is heavily constrained by RAM clocks.

FWIW, I have a Ryzen 2200G hooked up to my TV, and it's totally adequate for casual gaming.


I don't think they are reluctant; their APUs started at the low end and have been steadily [0] moving up the performance spectrum. We're about on schedule for some powerful ones to hit the market. The 8000G is the first that has entered the conversation, but I doubt it'll be the last.

[0] https://en.wikipedia.org/wiki/AMD_APU


Any better iGPU would be limited by memory bandwidth. More bandwidth would require a different socket and probably more expensive motherboard.


Mhm, but it kinda hurts to see that AMD is able to push out APUs powering the likes of a PlayStation 5, everything on a single chip, while on desktop you need to buy the CPU and a chunky GPU separately.


AM5 only has a 128-bit-wide memory bus (two 64-bit channels). So we're not going to get a usably beefier GPU until AM6.

However, on laptops which aren't constrained by backwards compatibility, Strix Point Halo appears to have both a beefy GPU and a 256 bit memory bus.


Exactly. Desktops also need to step up to using 4 channels of memory and thus a wider memory bus.


The consoles are not socketed or upgradeable. That's the difference.


Really, it's a form factor limitation.

As others have pointed out, you can't fit enough memory bandwidth though the AM5 socket to feed a powerful iGPU.

Fixing this problem kind of requires abandoning the current desktop form factor and switching to a unified module with both the CPU and soldered-on memory. Though at that point, the motherboard is doing little more than power regulation and breaking out all the IO to various ports.

At that point, does it still count as a desktop form factor?


> switching to a unified module with both the CPU and soldered-on memory. . . . At that point, does it still count as a desktop form factor?

Apple seems to think so


Apple and commodity don't mix and we're talking about commodities here and all the fat cutting that requires.


> Fixing this problem kind of requires abandoning the current desktop form factor and switching to a unified module with both the CPU and soldered-on memory.

Does it though? What stops you from putting some HBM onto the APU package and still installing it into the AM5 socket? It wouldn't even preclude you from continuing to use the memory slots, that memory would just be slower than the on-package memory.


Such a design would be possible, but not really commercially viable.

HBM memory is expensive, it requires a huge amount of extra IO on the die and an expensive silicon interposer. And you still need to keep around the old IO for the DDR5 memory. All that drives up costs, for what would still be a mid-range GPU.

Also, current software and games don't know how to deal with two pools of memory that have different performance characteristics, so the hardware would be underutilised.

The design which abandons the AM5 socket and switches to using much simpler and cheaper soldered-on gddr6 memory just ends up being cheaper and avoids the thermal and area limitations of the AM5 socket, so it could probably compete with high-end GPUs. It's just a better product direction.

AMD will either stick with their current strategy of APUs with their low-end GPUs because they would rather sell both a CPU and a dedicated GPU, or they will skip straight to a new form factor. The middle ground of trying to add more memory bandwidth to an AM5 style package just doesn't make any sense.


> HBM memory is expensive, it requires a huge amount of extra IO on the die and an expensive silicon interposer. And you still need to keep around the old IO for the DDR5 memory. All that drives up costs, for what would still be a mid-range GPU.

It's a product that replaces both a midrange CPU and a midrange GPU, which together have not only all of those costs but also the cost of needing two separate packages -- a CPU package for e.g. AM5 and a PCIe package for the GPU. Putting them together costs less.

> Also, current software and games doesn't know how to deal with two pools of memory that have different performance characteristics, so the hardware would be underutilised.

Except that's exactly what they know how to do, and they do it already. They expect there to be a slower pool of memory on the CPU and a separate faster one on a GPU. The system could expose them to existing applications in the same way -- the fast memory via the iGPU and the slow memory via the CPU. That's just software.

You could even do better if you have e.g. 16GB of fast memory and your game only needs 8GB of VRAM, because then the other 8GB can be used as L4 cache for the CPU and could plausibly fit the entire working set of the game in it while the only thing in DDR5 is the OS and idle background apps.

Meanwhile newer code which is aware of the configuration could make better use of it, e.g. by not having to worry about the cost of "copying" between GPU memory and CPU memory via PCIe since they're actually both directly connected.

> The design which abandons the AM5 socket and switches to using much simpler and cheaper soldered-on gddr6 memory just ends up being cheaper and avoids the thermal and area limitations of the AM5 socket, so it could probably compete with high-end GPUs.

That isn't cheaper, because then 100% of your system memory would have to be GDDR6. In a mid to high end system that's going to be dramatically more expensive than continuing to have a socket with DDR5 memory channels, because you'd have to replace e.g. 64GB of DDR5 and 16GB of GDDR6 with 80GB of GDDR6.

Meanwhile AM5 supports TDPs up to 170W and sTR5 up to 350W. 170W is reasonably sufficient for the combination of a midrange CPU and midrange GPU -- a midrange CPU is typically ~65W and a midrange GPU ~150W, which together hypothetically exceeds 170W, but a workload that simultaneously maxes out the GPU and all cores of the CPU is uncommon. In that rare case you would simply clock them slightly lower. Knocking the TDP of a 65W desktop CPU down to that of a laptop in that rare circumstance would have a relatively minor performance impact:

https://www.anandtech.com/bench/product/2685?vs=2665

350W would be sufficient at the high end for the same reason.

And if they were going to design a new socket (as happens from time to time anyway), they could give "AM6" a larger footprint and higher TDP without omitting the valuable external memory channels.

But creating a new CPU interface is expensive -- all the OEMs have to design new boards. So if your concern is cost then it makes more sense to wait until the next time you were going to do it anyway, e.g. when DDR6 becomes a thing, and do something else in the meantime.


Could be that they are selling all the capacity they can book. In that case, they'll aim to produce just the products with the highest margins (meaning the more specialized ones).


They did this, nobody bought them.

Also the RAM bandwidth just isn’t there, and special mainboards with more memory channels eat up the cost advantage. And they’re hard to cool.


I've had the same thought. They obviously know how to add decent iGPU and the required bandwidth in the PS5 and XboxX (and the previous gen).

Does seem like they finally plan to do this with the AMD Strix Halo which looks to hit somewhere late this year or early next.


I'm not sure what you mean by "too late"?


Given they've had mobile APUs out with similar performance for quite some time, it's kinda baffling that they only just released a comparable desktop APU.



