
If your disk has enough space to store the model, then in theory I think you could run it, using the disk to hold the state. It would be slow, though I'm not sure how slow, and I don't know whether anyone has implemented this. It shouldn't actually be too difficult.



Disk makes no sense considering RAM is pretty cheap. But even then RAM is way too slow (and the communication overhead way too high). You probably get like a 100x slowdown or more.


I think you are overestimating the compute and I/O needed for this model. If you assume it is RAM-bandwidth bound, then with a single channel of top-end DDR4 you will get an inference time that is a low multiple of 8 seconds (200 GB / 25 GB/s). A workstation can have 8 channels.
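
For anyone who wants to plug in their own numbers, here is a rough sketch of that back-of-envelope estimate in Python. It assumes the run is purely memory-bandwidth bound, that every weight is streamed from RAM exactly once per pass, and that bandwidth scales linearly with channel count. None of that is guaranteed in practice, so treat the result as a lower bound. The function name and the list of channel counts are made up for illustration; the 200 GB and 25 GB/s figures are the ones from the comment above.

```python
def bandwidth_bound_seconds(model_gb: float, channel_gb_per_s: float, channels: int = 1) -> float:
    """Lower-bound time to stream all model weights from RAM once.

    Assumption: perfect linear scaling across memory channels and no
    compute or communication overhead, which real systems will not reach.
    """
    if model_gb < 0 or channel_gb_per_s <= 0 or channels < 1:
        raise ValueError("sizes must be non-negative and bandwidth/channels positive")
    return model_gb / (channel_gb_per_s * channels)


if __name__ == "__main__":
    # 200 GB of weights and 25 GB/s per DDR4 channel, as in the comment.
    for channels in (1, 8, 12, 24):
        seconds = bandwidth_bound_seconds(200, 25, channels)
        print(f"{channels:>2} channel(s): at least {seconds:.2f} s per pass")
```

The "low multiple" in the estimate covers everything this sketch ignores, such as compute time and imperfect channel utilization.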


12 channels in mine. Some configurations have 24 channels, though I think that is the upper limit at this time, with a maximum density of 512 GB per channel.


Is it multisocket?



