One of the most popular HSMs is the Thales Luna Network HSM, which can perform 20,000 ECC operations per second [1]. Even at the scale of Azure AD, Microsoft may not need many HSMs for signing. HSMs are not particularly easy to manage, though; maybe that is one of the reasons they are not used as much as they should be.
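To make the back-of-the-envelope reasoning concrete, here is a rough sizing sketch. Only the 20,000 ops/sec figure comes from [1]; the peak signing rate, the headroom factor, and the redundancy multiplier are hypothetical placeholders, not real Azure AD numbers.

```python
import math

# Throughput quoted above from [1]; everything else is an assumption.
HSM_ECC_OPS_PER_SEC = 20_000


def hsms_needed(peak_signatures_per_sec: float,
                headroom: float = 0.5,
                redundancy: int = 2) -> int:
    """Estimate how many HSMs a signing workload would need.

    headroom   -- fraction of rated throughput we are willing to use (assumed)
    redundancy -- multiplier for failover / multi-site copies (assumed)
    """
    usable_ops = HSM_ECC_OPS_PER_SEC * headroom
    return math.ceil(peak_signatures_per_sec / usable_ops) * redundancy


# Hypothetical peak of 100,000 signatures per second.
print(hsms_needed(100_000))
```

Even with half the rated throughput held in reserve and everything doubled for failover, a made-up peak of 100,000 signatures per second comes out to a couple dozen devices, which fits the point that raw throughput is probably not the limiting factor.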
[1] https://cpl.thalesgroup.com/encryption/hardware-security-mod...