
> No one is using FP64 for AI inference.


I have not said a word about FP64.

I just compared the FP32 compute capabilities, i.e. what is used for graphics, between the Apple M3 Ultra GPU and AMD server CPUs, because those numbers are readily available and they illustrate the relative scale of the two.
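For a sense of where such peak-FP32 numbers come from, here is a back-of-envelope sketch (cores × FP32 lanes × FMA units × 2 FLOPs per FMA × clock). The core counts, lane widths, and clocks below are illustrative placeholders, not measured specs of the M3 Ultra or of any particular EPYC part:

```python
# Back-of-envelope peak FP32 throughput.
# All parameters below are illustrative placeholders, not vendor specs.

def peak_fp32_tflops(cores: int, fp32_lanes: int, fma_units: int,
                     clock_ghz: float) -> float:
    """Theoretical peak: each FMA counts as 2 FLOPs (multiply + add)."""
    return cores * fp32_lanes * fma_units * 2 * clock_ghz / 1e3

# Hypothetical GPU: many simple cores at a modest clock.
gpu = peak_fp32_tflops(cores=80, fp32_lanes=128, fma_units=1, clock_ghz=1.4)

# Hypothetical server CPU: fewer, wider cores at a higher clock
# (e.g. 96 cores, two 256-bit FMA pipes = 2 units of 8 FP32 lanes each).
cpu = peak_fp32_tflops(cores=96, fp32_lanes=8, fma_units=2, clock_ghz=3.5)

print(f"GPU peak FP32: {gpu:.1f} TFLOPS")  # ~28.7 with these placeholders
print(f"CPU peak FP32: {cpu:.1f} TFLOPS")  # ~10.8 with these placeholders
```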

Both GPUs and server CPUs achieve higher throughput on lower-precision data (recent server CPUs have instructions for BF16 and INT8 inference), but the exact speedup factors are hard to find, and without access to such systems for running benchmarks it is difficult to estimate the real speeds.
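To illustrate why the speedup factors matter, a minimal sketch that applies assumed per-precision multipliers to an FP32 baseline. The factors are assumptions for illustration only (BF16 packs twice as many lanes per SIMD register as FP32, and INT8 dot-product instructions often roughly double that again); real multipliers vary by chip and would need benchmarking to confirm:

```python
# Estimating lower-precision throughput from an FP32 baseline.
# The multipliers below are assumptions, not measured values.

fp32_baseline_tflops = 10.8  # hypothetical CPU figure from the sketch above

assumed_speedup = {
    "FP32": 1.0,
    "BF16": 2.0,   # assumption: twice the lanes per SIMD register
    "INT8": 4.0,   # assumption: dot-product instructions double BF16 again
}

for dtype, factor in assumed_speedup.items():
    print(f"{dtype}: ~{fp32_baseline_tflops * factor:.0f} "
          f"T(FL)OPS (assumed multiplier)")
```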



