> There could be exponential or quadratic scaling laws with any of these black boxes that makes one approach suddenly extremely viable or even dominant.
The reason I like the CPU approach is that the memory scaling is bonkers compared to GPU. You can buy a server with 12TB of DRAM (in stock right now) for the cost of one of those H100 GPU systems. That's enough memory to hold over 3 trillion parameters at full 32-bit FP resolution (4 bytes each). Employ some downsampling (e.g. lower-precision weights) and it gets even more ridiculous.
If 12TB isn't enough, you can always reach for things like RDMA and high-speed interconnects. You could probably get 100 trillion parameters into one rack. At some point you'll need to add hierarchy to the SNN so that multiple racks and datacenters can work together.
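To make the arithmetic concrete, here's a rough back-of-the-envelope sketch. It assumes "12TB" means 12 TiB (DIMM capacities are binary-sized) and that all of the DRAM goes to parameters, ignoring the OS, neuron state, and any per-synapse bookkeeping. The function names are just for illustration.

```python
TIB = 2**40  # assumption: "12TB" of DRAM is really 12 TiB

def params_that_fit(mem_bytes: int, bytes_per_param: int) -> int:
    """How many parameters fit if all memory is spent on parameters."""
    return mem_bytes // bytes_per_param

def boxes_needed(total_params: int, mem_bytes_per_box: int, bytes_per_param: int) -> int:
    """Servers needed to hold total_params, rounded up (integer ceiling)."""
    total_bytes = total_params * bytes_per_param
    return -(-total_bytes // mem_bytes_per_box)

if __name__ == "__main__":
    box = 12 * TIB
    for label, width in [("fp32", 4), ("fp16", 2), ("int8", 1)]:
        fit = params_that_fit(box, width)
        boxes = boxes_needed(100 * 10**12, box, width)
        print(f"{label}: {fit / 1e12:.2f}T params per box, "
              f"{boxes} boxes for 100T params")
```

Under those assumptions a single box holds about 3.3 trillion fp32 parameters, and 100 trillion fp32 parameters needs 31 boxes. Whether that fits in one rack depends on the chassis size, which is why lower precision matters.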
Imagine the power savings... It's not exactly a walk in the park, but those DIMMs are very eco-friendly compared to GPUs. You don't need a whole lot of CPU cores in my proposal either. 8-16 very fast cores per box would probably be more than enough, looking at how fintech does things. One thread is actually running the entire show in my current prototype. The other threads are for spike timers and managing other external signals.
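For anyone wondering what that thread layout looks like, here's a minimal sketch of the general shape: one consumer thread owns all the state, and helper threads only push events onto a queue. This is not my prototype's code. The names (`timer_thread`, `run_main_loop`, the `"spike"` event tuple) are made up for illustration, and a real implementation would use something much faster than Python's `queue.Queue`.

```python
import queue
import threading
import time

def timer_thread(events: queue.Queue, neuron_id: int, delay_s: float) -> None:
    """Hypothetical spike timer: wait, then hand an event to the main thread."""
    time.sleep(delay_s)
    events.put(("spike", neuron_id))

def run_main_loop(events: queue.Queue, expected: int) -> list:
    """Single thread that processes every event; no locks on network state."""
    handled = []
    while len(handled) < expected:
        kind, neuron_id = events.get()  # blocks until a helper posts an event
        if kind == "spike":
            handled.append(neuron_id)   # stand-in for the real state update
    return handled

if __name__ == "__main__":
    events = queue.Queue()
    timers = [
        threading.Thread(target=timer_thread, args=(events, i, 0.01 * (i + 1)))
        for i in range(3)
    ]
    for t in timers:
        t.start()
    print(run_main_loop(events, expected=3))
    for t in timers:
        t.join()
```

The point of the design is that only one thread ever touches the network state, so the hot path needs no locking; the helpers just feed it.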