
I've heard that before about floating-point co-processors; they're definitely unified now.

Remember the Weitek and the 387?

In the end it's a cost-savings issue: as CPUs get more cores they become more and more like GPUs, and as GPUs acquire more general-purpose instructions they become more like CPUs. Those lines will meet, and once they're on one die the 'budget' can be used more efficiently by looking for ways to integrate them more tightly.

There is nothing inherently different about general-purpose computation versus the kind of vector processing a GPU is good at; at the end of the day it is all calculation, and more and more of it in parallel.
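
To make that concrete, here is a minimal sketch of the same computation written both ways (the names, sizes and the SAXPY choice are mine, purely for illustration): a plain C loop for the CPU next to a CUDA kernel where each thread handles one element.

    // Same a*x + y computation, scalar vs. data-parallel.

    // CPU version: one core walks the whole array.
    void saxpy_cpu(int n, float a, const float *x, float *y) {
        for (int i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }

    // GPU version (CUDA): one element per thread; the "loop"
    // is implicit in the launch grid.
    __global__ void saxpy_gpu(int n, float a, const float *x, float *y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            y[i] = a * x[i] + y[i];
    }

    // Launch, e.g.: saxpy_gpu<<<(n + 255) / 256, 256>>>(n, a, d_x, d_y);

The arithmetic is identical in both; only the way the hardware is asked to schedule it differs.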

I fully expect that at some point even DRAM will be part of the CPU.




Point, but I'm not sure they're comparable. Both CPU and GPU are something you can never have enough of, because their task list expands to meet supply. Contrast: once you have enough FPU, you're done.


It's a game of bottlenecks: solve one, you get another one for free.

Once you have 'enough' FPU you no longer have enough memory bandwidth, so you go wider/faster on the memory bus (this is already happening; we are now well over 1 GHz on the memory bus), or you place the memory closer to the CPU (also happening, in the form of ever larger caches).
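
A quick back-of-envelope check shows why the FPU tends to starve before the memory bus catches up. The peak numbers below are made-up round figures, not any particular card; only the ratio matters.

    // Roofline-style check: is a simple multiply-add over an array
    // (like the SAXPY sketch earlier in the thread) compute- or
    // memory-bound? Peak figures are illustrative placeholders.
    #include <stdio.h>

    int main(void) {
        double peak_gflops = 1000.0;  // assumed peak compute, GFLOP/s
        double peak_gbps   = 150.0;   // assumed memory bandwidth, GB/s

        // SAXPY: 2 FLOPs (mul + add) per element, 12 bytes moved
        // (read x, read y, write y, 4 bytes each).
        double intensity  = 2.0 / 12.0;             // FLOPs per byte
        double attainable = intensity * peak_gbps;  // GFLOP/s if memory-bound

        if (attainable < peak_gflops)
            printf("memory-bound: ~%.0f of %.0f GFLOP/s usable\n",
                   attainable, peak_gflops);
        else
            printf("compute-bound\n");
        return 0;
    }

With those numbers the arithmetic units sit idle most of the time, which is exactly the pressure that drives wider buses and bigger caches.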

Then as soon as that is done you no longer have 'enough' FPU, so you go parallel.

GPUs now have almost 250 cores, and yet there are plenty of people who use more than one in a single machine (I've seen up to 4 of them, with two dies each, for almost 2000 cores). Clearly some people don't have 'enough' FPU yet, and plenty of games would happily add more effects to their engines (which seems to be the biggest driver of this kind of development outside of hard-core number crunching).

Higher-resolution displays are another driver for more FPU power, because once you start shading, every pixel becomes the end point of a long pipeline of mathematical operations.
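
As a toy illustration of that pipeline (my own sketch, nothing like a real engine's shader), here is a CUDA kernel that runs one thread per pixel and a handful of floating-point steps per pixel; double the resolution and the FLOP count quadruples.

    // Toy per-pixel diffuse shading: each pixel is the end point of a
    // short chain of floating-point ops (normalize, dot product, clamp).
    // Assumes non-degenerate normals and a pre-normalized light_dir.
    __global__ void shade(float3 *out, const float3 *normals,
                          float3 light_dir, int width, int height) {
        int x = blockIdx.x * blockDim.x + threadIdx.x;
        int y = blockIdx.y * blockDim.y + threadIdx.y;
        if (x >= width || y >= height) return;

        int i = y * width + x;
        float3 n = normals[i];

        // normalize the surface normal
        float len = sqrtf(n.x * n.x + n.y * n.y + n.z * n.z);
        n.x /= len; n.y /= len; n.z /= len;

        // Lambert term: N . L, clamped at zero
        float d = n.x * light_dir.x + n.y * light_dir.y + n.z * light_dir.z;
        d = fmaxf(d, 0.0f);

        out[i] = make_float3(d, d, d);  // grey-scale diffuse result
    }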

I don't foresee anybody complaining of 'too much FPU' in the next decade or longer; in fact, I suspect that once this kind of FPU capacity becomes mainstream we'll see a whole new breed of applications that take advantage of it.


I'm not sure they're as different as you are making out. Why couldn't you have a chip that's somewhere between a CPU, a GPU and an ASIC - one that rewires itself to trade off cache+prediction vs. many ALUs+parallelism? Maybe the ASIC part is a bit extreme, but Fermi (the next-gen Nvidia architecture) will have on-chip memory that is switchable between explicit local shared memory (GPU-style) and implicit local cache (CPU-style). Why couldn't the same kind of flexible approach be applied to instructions - longer+fewer vs. more+shorter pipelines?
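
For reference, the CUDA runtime exposes that Fermi shared-memory/L1 split as a per-kernel preference. A minimal sketch, assuming a Fermi-class card and using a throwaway placeholder kernel:

    // Fermi splits 64 KB of per-multiprocessor on-chip memory between
    // explicit shared memory and implicit L1 cache; the runtime lets
    // you state a per-kernel preference for how to divide it.
    #include <cuda_runtime.h>

    // placeholder kernel, only here so the sketch is self-contained
    __global__ void my_kernel(float *data) {
        if (data) data[0] = 0.0f;
    }

    void configure(void) {
        // Lean toward CPU-style implicit caching for this kernel...
        cudaFuncSetCacheConfig(my_kernel, cudaFuncCachePreferL1);
        // ...or toward classic GPU-style explicit shared memory:
        // cudaFuncSetCacheConfig(my_kernel, cudaFuncCachePreferShared);
    }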



