
There's an interesting recent video here from Microsoft discussing Azure. The format is a bit cheesy, but lots of interesting information nonetheless.

https://www.youtube.com/watch?v=Rk3nTUfRZmo&t=5s "What runs ChatGPT? Inside Microsoft's AI supercomputer"

The relevance here is that Azure appears to be very well designed to handle the hardware failures that will inevitably happen during a training run lasting weeks or months and using many thousands of GPUs. There's a lot more involved than just renting a bunch of Amazon GPUs, and in any case the partnership between OpenAI and Microsoft appears quite strategic and can tolerate some build-out delays, especially if they are not Microsoft's fault.
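The fault-tolerance point mostly comes down to periodic checkpointing plus resume: if a node dies mid-run, you restart from the last saved state instead of losing weeks of work. A minimal sketch of that pattern (the filenames, interval, and toy training loop are illustrative assumptions, not details from the video):

```python
import os
import pickle
import tempfile

CKPT = "ckpt.pkl"  # hypothetical checkpoint path, for illustration only

def save_checkpoint(step, state, path=CKPT):
    # Write atomically: dump to a temp file, then rename over the old
    # checkpoint, so a crash mid-write never corrupts the last good copy.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "wb") as f:
        pickle.dump({"step": step, "state": state}, f)
    os.replace(tmp, path)

def load_checkpoint(path=CKPT):
    # Resume from the last checkpoint if one exists, else start fresh.
    if not os.path.exists(path):
        return 0, {}
    with open(path, "rb") as f:
        ckpt = pickle.load(f)
    return ckpt["step"], ckpt["state"]

# Training loop that survives restarts: after any failure, rerunning
# this script picks up from the most recent checkpoint.
step, state = load_checkpoint()
for step in range(step, 100):
    state["loss"] = 1.0 / (step + 1)  # stand-in for a real training step
    if step % 10 == 0:
        save_checkpoint(step, state)
```

Real large-scale systems layer much more on top (distributed checkpoint sharding, spare-node failover), but the restart-from-last-good-state structure is the same.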




That is only relevant for training, not for inference, unless the model is too big to fit on a single host (typically 8 GPUs).
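The "fits on a single host" question is a quick back-of-envelope calculation on the weights alone. A sketch, assuming fp16 weights and 80 GB per GPU (my assumptions, not figures from the thread):

```python
def fits_on_host(params_billions, bytes_per_param=2, gpus=8, gpu_mem_gb=80):
    """Rough check: do the model weights alone fit across one host's GPUs?

    Ignores activations, KV cache, and framework overhead, so this is an
    optimistic lower bound on the memory actually needed at inference time.
    """
    weights_gb = params_billions * bytes_per_param  # 1B params * N bytes = N GB
    return weights_gb <= gpus * gpu_mem_gb

# A 175B-parameter model in fp16 needs ~350 GB of weights, which fits in
# 8 x 80 GB = 640 GB; a 1T-parameter model (~2000 GB) would not.
print(fits_on_host(175))   # True
print(fits_on_host(1000))  # False
```

In practice the KV cache and activations eat a large share of that headroom, which is why models well under the raw-weights limit can still end up sharded across hosts.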