
There's an interesting recent video here from Microsoft discussing Azure. The format is a bit cheesy, but lots of interesting information nonetheless.

https://www.youtube.com/watch?v=Rk3nTUfRZmo&t=5s "What runs ChatGPT? Inside Microsoft's AI supercomputer"

The relevance here is that Azure appears to be very well designed to handle the hardware failures that will inevitably happen during a training run lasting weeks or months and using many thousands of GPUs. There's a lot more involved than just renting a bunch of Amazon GPUs, and in any case the partnership between OpenAI and Microsoft appears quite strategic and can tolerate some build-out delays, especially if they are not Microsoft's fault.
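The fault-tolerance point mostly comes down to periodic checkpointing plus resume: if a node dies mid-run, you restart from the last saved state instead of losing weeks of work. A minimal sketch of that pattern (the filenames, interval, and toy training loop are illustrative assumptions, not details from the video):

```python
import os
import pickle
import tempfile

CKPT = "ckpt.pkl"  # hypothetical checkpoint path, for illustration only

def save_checkpoint(step, state, path=CKPT):
    # Write atomically: dump to a temp file, then rename over the old
    # checkpoint, so a crash mid-write never corrupts the last good copy.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "wb") as f:
        pickle.dump({"step": step, "state": state}, f)
    os.replace(tmp, path)

def load_checkpoint(path=CKPT):
    # Resume from the last checkpoint if one exists, else start fresh.
    if not os.path.exists(path):
        return 0, {}
    with open(path, "rb") as f:
        ckpt = pickle.load(f)
    return ckpt["step"], ckpt["state"]

# Training loop that survives restarts: after any failure, rerunning
# this script picks up from the most recent checkpoint.
step, state = load_checkpoint()
for step in range(step, 100):
    state["loss"] = 1.0 / (step + 1)  # stand-in for a real training step
    if step % 10 == 0:
        save_checkpoint(step, state)
```

Real large-scale systems layer much more on top (distributed checkpoint sharding, spare-node failover), but the restart-from-last-good-state structure is the same.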




That is only relevant for training, not for inference, unless the model is too big to fit on a single host (typically 8 GPUs).
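The "fits on a single host" question is a quick back-of-envelope calculation on the weights alone. A sketch, assuming fp16 weights and 80 GB per GPU (my assumptions, not figures from the thread):

```python
def fits_on_host(params_billions, bytes_per_param=2, gpus=8, gpu_mem_gb=80):
    """Rough check: do the model weights alone fit across one host's GPUs?

    Ignores activations, KV cache, and framework overhead, so this is an
    optimistic lower bound on the memory actually needed at inference time.
    """
    weights_gb = params_billions * bytes_per_param  # 1B params * N bytes = N GB
    return weights_gb <= gpus * gpu_mem_gb

# A 175B-parameter model in fp16 needs ~350 GB of weights, which fits in
# 8 x 80 GB = 640 GB; a 1T-parameter model (~2000 GB) would not.
print(fits_on_host(175))   # True
print(fits_on_host(1000))  # False
```

In practice the KV cache and activations eat a large share of that headroom, which is why models well under the raw-weights limit can still end up sharded across hosts.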