A back-of-the-napkin calculation: 819GB/s / 37GB/tok = 22 tokens/sec.
Realistically, you’ll have to run quantized to fit inside of the 512GB limit, so it could be more like 22GB of data transfer per token, which would yield 37 tokens per second as the theoretical limit.
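For anyone who wants to plug in their own numbers, here is a rough Python sketch of the same napkin math. It assumes decoding is purely memory-bandwidth bound, meaning every generated token streams a fixed amount of weight data once and nothing else (compute, KV cache reads, framework overhead) costs anything, so treat the result as a ceiling rather than a prediction. The function and parameter names are made up for illustration; the 37GB and 22GB per-token figures are the ones used above.

```python
def max_tokens_per_sec(bandwidth_gb_s: float, gb_read_per_token: float) -> float:
    """Upper bound on decode speed for a purely bandwidth-bound model.

    bandwidth_gb_s: memory bandwidth in GB/s (819 for the figure quoted above).
    gb_read_per_token: weight data streamed per generated token, in GB.
    Ignores compute time, KV cache traffic, and any software overhead.
    """
    if gb_read_per_token <= 0:
        raise ValueError("gb_read_per_token must be positive")
    return bandwidth_gb_s / gb_read_per_token


# Unquantized estimate from the comment: 37GB read per token.
print(round(max_tokens_per_sec(819, 37)))  # 22

# Quantized estimate from the comment: 22GB read per token.
print(round(max_tokens_per_sec(819, 22)))  # 37
```

Real-world throughput will land below these numbers, since the sketch leaves out everything except weight reads.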
It is likely going to be very usable. As other people have pointed out, the Mac Studio is also not the only option at this price point… but it is neat that it is an option.
How many t/s would you expect? I think I feel perfectly fine when it's over 50.
Also, people have figured out a way to run these things in parallel easily. The device is pretty small, so I think for someone who doesn't mind the price tag, stacking 2-3 of those wouldn't be that bad.
Not sure why you are being downvoted. We already know the performance numbers from the memory bandwidth constraints on the M4 Max chips, and the same would apply here.
525GB/s to 1000GB/s will double the TPS at best, which is still quite low for large LLMs.
At 819GB/s of bandwidth, the experience would be terrible.