
The FP16 7B version runs on my Ubuntu XPS with 32GB of memory at ~300ms/token. The 13B version also works, but the results aren't great (the model starts looping after a few sentences), so the generation parameters probably need tuning.
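
For the looping, raising the repetition penalty and using a moderate sampling temperature usually helps. A minimal sketch, assuming the Hugging Face transformers port of the weights (the local path and the parameter values below are just guesses, not what I actually ran):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Hypothetical local path to converted 13B weights
    model_path = "/models/llama-13b-hf"

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        torch_dtype=torch.float16,  # FP16, same as the 7B run above
        device_map="auto",          # needs `accelerate`; falls back to CPU without a GPU
    )

    prompt = "The three most common causes of slow inference are"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Sampling settings that tend to reduce looping:
    # a repetition penalty > 1.0 and a temperature below 1.0.
    output = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        repetition_penalty=1.2,
    )
    print(tokenizer.decode(output[0], skip_special_tokens=True))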

So far I'm unable to reliably generate output in a language other than English; the model very quickly starts translating (even when it isn't asked to) or simply switches to English.
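
For anyone who wants to reproduce this, the kind of prompt I've been testing looks like the sketch below: everything in the target language, with a short few-shot prefix, in the hope that the model keeps continuing in that language (French here is just a placeholder; it still drifts back to English for me). This reuses the tokenizer/model from the snippet above.

    # Hypothetical few-shot prefix, entirely in the target language,
    # to nudge the model to continue in that language rather than switch to English.
    prompt = (
        "Question : Quelle est la capitale de la France ?\n"
        "Réponse : La capitale de la France est Paris.\n"
        "Question : Pourquoi le ciel est-il bleu ?\n"
        "Réponse :"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)
    print(tokenizer.decode(output[0], skip_special_tokens=True))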



