
It's also interesting that this makes it possible to fully saturate Apple Silicon (minus the ANE): GGML can run on the CPU, using NEON and AMX, while another instance runs via Metal on the GPU using MLC/dawn. The two can't share the same memory at the moment, though.


The GPU uses so much less energy on ML tasks that you'd probably get better performance running everything on the GPU.

I think some repos have tried splitting things up between the NPU and GPU as well, but they didn't get good performance out of that combination? Not sure why, as the NPU is very low power.


This was a really insightful explanation, thanks.

I have been wanting to get a beefier Mac Studio/mini M2 the more I'm seeing Apple Silicon-specific tweaked packages.



