Memory bandwidth/latency helps in certain scenarios, but it's easily oversold in the performance portion of the story. E.g. the 9950X and 9950X3D are within 1/20th of a percentage point of each other in PassMark single-thread (feeding a single core is dead easy) but have a spread of ~6.4% (in favor of the 9950X3D) in multi-thread (where the extra cache on the one CCD starts to help). It could just as easily have gone the other direction, or been 10 times as large, depending on what the benchmark was trying to do. For most day-to-day user workloads, though, the performance difference from memory bandwidth/latency is somewhere between nil and some.
Meanwhile the AI Max+ 395 has at least twice the memory bandwidth and the same number of cores, yet comes out at more like a ~15% loss on single-thread and a ~30% loss on multi-thread due to other "traditional" reasons for performance differences. I still like my 395, but more for the following reason.
The more practical advantage of soldered memory on mobile devices is the power/heat reduction. It's the same idea as increasing the cache on e-cores: get something out of every cycle you power rather than trying to increase overall computation with more wattage (i.e. more transistors or higher clocks). Better bandwidth/latency is a cool bonus though.
For a hard number: the iPhone 17 Pro Max is supposed to have around 76 GB/s of memory bandwidth, yet mine has a higher PassMark single-core score than my 9800X3D, which has a larger L3 cache and RAM operating at >100 GB/s. The iPhone does have a TSMC node advantage to consider as well, but I still think it just comes out ahead due to "better overall engineering".
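
For anyone wondering where GB/s figures like these come from: they are usually just theoretical peaks, i.e. transfer rate times bus width. A rough Python sketch is below. The memory configurations in it are my assumptions for illustration (not pulled from spec sheets), and `peak_bandwidth_gbs` is a made-up helper name; real sustained bandwidth will be lower than what it prints.

```python
def peak_bandwidth_gbs(transfer_rate_mts: float, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in decimal GB/s, ignoring real-world efficiency."""
    bytes_per_transfer = bus_width_bits / 8
    # MT/s * bytes per transfer = MB/s; divide by 1000 for GB/s
    return transfer_rate_mts * bytes_per_transfer / 1000


# Assumed configurations, for illustration only
configs = {
    "phone-class: LPDDR5X-9600, 64-bit bus": (9600, 64),
    "desktop: dual-channel DDR5-6400, 128-bit bus": (6400, 128),
    "AI Max+ 395-class: LPDDR5X-8000, 256-bit bus": (8000, 256),
}

for name, (rate, width) in configs.items():
    print(f"{name}: {peak_bandwidth_gbs(rate, width):.1f} GB/s")
```

Under those assumptions the three come out in roughly a 1 : 1.3 : 3.3 ratio on paper, which is nothing like the ordering of the benchmark scores above, and that is kind of the point.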