Probably not with the same amount of training time, but I'd imagine a recent MBP GPU could handle GPT-2 training. The biggest challenge is that the training code would need to be reimplemented for Metal instead of CUDA.
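That said, if the training loop is written in PyTorch you often don't have to touch Metal directly: PyTorch's "mps" backend dispatches ops to Metal kernels on Apple Silicon. A rough sketch of the usual device-selection pattern (the tiny model and training step here are just placeholders to show placement, not actual GPT-2 code):

    import torch
    import torch.nn as nn

    # Prefer Apple's Metal (MPS) backend when available, else CUDA, else CPU.
    if torch.backends.mps.is_available():
        device = torch.device("mps")
    elif torch.cuda.is_available():
        device = torch.device("cuda")
    else:
        device = torch.device("cpu")

    # Toy stand-in for a real model; the point is only that .to(device)
    # is the same whether the backend is Metal or CUDA.
    model = nn.Linear(768, 768).to(device)
    optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

    x = torch.randn(8, 768, device=device)   # dummy batch
    loss = model(x).pow(2).mean()             # dummy loss
    loss.backward()
    optimizer.step()

Coverage of ops on MPS isn't complete, so a hand-rolled CUDA implementation (like llm.c) really would need a Metal port, but anything built on PyTorch mostly just works.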
Slightly off topic -- I just saw people saying that Macs' unified memory makes them well suited to training models: https://www.macrumors.com/2024/07/10/apple-leads-global-pc-g..., and how energy efficient they are, etc. But what I actually see is that people rarely touch Macs for this at all -- they write code with CUDA and that's it. I find this kind of conversation fascinating.