
The SPU approach is pretty much what a GPU does, but instead of 64 they have hundreds if not thousands.

I think the main issue with the Cell design was that it was too "middle road" and wasn't specialised enough in either direction.



It really isn’t - the individual elements in a GPU can access RAM directly, while you needed to wheelbarrow data (and code!) to and from SPUs manually. This was the major pain point, not the extreme (for that time) parallelism or the weird-ish instruction set.


It depends on what you mean by "access RAM directly". Both SPUs and GPUs (at least on PS4) can read and write system memory, and both do so asynchronously. If you want random access to system memory, you will die on a GPU even sooner than on an SPU, since the latencies are much bigger.

So there is not much difference imho and the GP is correct.


SPUs read memory asynchronously by requesting a DMA; GPUs read memory asynchronously too, but this is not explicitly visible to the programmer, and serious infrastructure is devoted to hiding that latency. The problem with SPUs was never that they are slow, but that they are outright programmer-hostile.


> but this is not explicitly visible to the programmer, and serious infrastructure is devoted to hiding that latency

What do you mean? On an SPU you can hide latency by double-buffering the DMA, but in a shader there is no such infrastructure and, unlike on the SPU, no way to hide it yourself: you just block until the memory fetch completes.
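For anyone who hasn't seen the pattern, here is a rough sketch of double-buffering. It is plain Python, not SPU code: a one-worker thread pool stands in for the DMA engine, and `fetch`, `process` and `double_buffered_sum` are made-up names for illustration, not any real SDK API. The point is only the ordering: the next transfer is started before work on the current buffer begins.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(memory, start, size):
    """Hypothetical stand-in for an async DMA: copy one chunk into 'local store'."""
    return memory[start:start + size]

def process(chunk):
    """Hypothetical stand-in for the kernel that runs on a chunk already fetched."""
    return sum(chunk)

def double_buffered_sum(memory, chunk_size):
    total = 0
    # One worker thread plays the role of the DMA engine (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=1) as dma:
        # Kick off the first transfer before entering the loop.
        pending = dma.submit(fetch, memory, 0, chunk_size)
        for next_start in range(chunk_size, len(memory) + chunk_size, chunk_size):
            current = pending.result()  # wait only for the buffer needed right now
            if next_start < len(memory):
                # Start fetching the next chunk while the current one is processed.
                pending = dma.submit(fetch, memory, next_start, chunk_size)
            total += process(current)
    return total

result = double_buffered_sum(list(range(10)), chunk_size=4)
```

With two buffers in flight, the transfer of chunk N+1 overlaps the compute on chunk N, so the wait in `pending.result()` shrinks as long as compute takes at least as long as the fetch.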

> they are outright programmer-hostile

Depends on the programmer, I guess. I enjoyed programming SPUs and don't personally know anybody who had complaints. I've only read about the "insanely hard to program PS3" on the internet and wondered "who are those people?". It's especially funny because the RSX was a pitiful piece of crap with crappy tooling [+] from NVIDIA, yet nobody complaining about SPUs mentions that.

[+] Not an exaggeration. For example, the Cg compiler would produce different code if you added +0 or *1 to random scalars in your shader, and the result was not necessarily slower, either! So one of the release steps was brute-forcing this to shave a few clocks off the shaders.


> The SPU approach is pretty much what a GPU does, but instead of 64 they have hundreds if not thousands.

Eh, only if you count every SIMD lane as a separate "core" like GPU manufacturer marketing does. More realistically, you should count what NVIDIA calls SMs, where the numbers are more comparable (the GeForce RTX 3080 has 68, for example).


PS3: 6 SPUs available to games (a 7th is reserved for the system), each 4-way SIMD.

PS4: 18 GCN CUs, each with four 16-wide SIMDs, for 72 SIMDs in total. Each of those is 4 times wider than an SPU, so the PS4 has as many SIMD lanes as 288 4-way SPUs would.
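The arithmetic above, spelled out as a quick sanity check. The helper name `simd_lanes` is just for illustration, and the figures are the ones quoted in this comment (6 usable SPUs, 18 CUs with four 16-wide SIMDs each).

```python
def simd_lanes(units, simds_per_unit, lanes_per_simd):
    """Total SIMD lanes = units x SIMDs per unit x lanes per SIMD."""
    return units * simds_per_unit * lanes_per_simd

# PS3: 6 SPUs, one 4-way SIMD unit each.
ps3_lanes = simd_lanes(units=6, simds_per_unit=1, lanes_per_simd=4)

# PS4: 18 GCN CUs, four 16-wide SIMDs each.
ps4_simds = 18 * 4
ps4_lanes = simd_lanes(units=18, simds_per_unit=4, lanes_per_simd=16)

# Express the PS4 lane count in units of one 4-way SPU.
spu_equivalents = ps4_lanes // 4

print(ps3_lanes, ps4_simds, ps4_lanes, spu_equivalents)  # 24 72 1152 288
```

This only counts lanes; it says nothing about clocks, local store, or how easy it is to keep those lanes fed.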



