It seems to depend on FlashAttention, so the short answer is no. Hopefully someone does the work of porting the inference code over!

