More optimization work has gone, and is still going, into Nvidia support, so Nvidia GPUs are currently faster. PyTorch support for MPS devices is relatively new, and a lot of optimization hasn't been done yet. That makes it unclear which underlying hardware is actually faster for this specific task, but it looks like top-end Apple Silicon is in the same bracket as a consumer-grade Nvidia GPU.
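If you want to check this on your own machine, here is a rough sketch rather than a proper benchmark. The helper names `pick_device` and `time_matmul` are made up for illustration, and the square matrix multiply is only a stand-in workload, so swap in your real task before drawing conclusions. It assumes a PyTorch build recent enough to expose `torch.backends.mps`, and falls back to the CPU otherwise.

```python
import importlib.util
import time


def pick_device():
    """Return "cuda", "mps", or "cpu", depending on what this machine offers."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed, so there is nothing to compare
    import torch

    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)  # absent in older PyTorch builds
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def time_matmul(device, size=1024, repeats=10):
    """Average seconds per square matmul on `device` (a stand-in workload)."""
    import torch

    x = torch.randn(size, size, device=device)

    def sync():
        # GPU kernels run asynchronously, so wait for them before reading the clock.
        if device == "cuda":
            torch.cuda.synchronize()
        elif device == "mps" and hasattr(torch, "mps"):
            torch.mps.synchronize()

    x @ x  # warm-up, so one-time setup cost is not measured
    sync()
    start = time.perf_counter()
    for _ in range(repeats):
        x @ x
    sync()
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    device = pick_device()
    print("device:", device)
    if importlib.util.find_spec("torch") is not None:
        print(f"{time_matmul(device):.6f} s per matmul")
```

The warm-up call and the explicit synchronize matter here: without them you mostly measure kernel launch and one-time setup, which would make either backend look better or worse than it really is.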