
I'm getting >30 tokens/sec using it with ollama and an M2 Pro. That might be a little slow though because I have a background finetuning job running.
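For reference, ollama's /api/generate responses report eval_count (tokens generated) and eval_duration (nanoseconds), so you can compute tokens/sec yourself rather than eyeballing it. A quick sketch — the numbers below are made up, not real measurements:

```python
def tokens_per_sec(eval_count: int, eval_duration_ns: int) -> float:
    """Throughput from ollama's /api/generate stats:
    eval_count tokens generated over eval_duration_ns nanoseconds."""
    return eval_count / (eval_duration_ns / 1e9)

# Hypothetical stats pulled from a response (illustrative only):
print(tokens_per_sec(120, 4_000_000_000))  # 120 tokens over 4 s -> 30.0
```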


Bit of a tangential question here, but any recommendations on how to get started fine-tuning this model (or ones like it)? I feel like there are a million different tutorials and ways of doing it when I google.


https://github.com/OpenAccess-AI-Collective/axolotl

This is a wrapper around many training methods, and it has yielded many excellent community finetunes already.
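For a flavor of what axolotl expects, here's a minimal QLoRA-style config sketch. Field names follow axolotl's published examples, but check the repo's docs for your version; the model name and dataset path are placeholders:

```yaml
# Minimal QLoRA fine-tune sketch for axolotl (values are illustrative)
base_model: meta-llama/Llama-2-7b-hf   # placeholder base model
load_in_4bit: true
adapter: qlora

datasets:
  - path: ./data/my_dataset.jsonl      # placeholder dataset
    type: alpaca

sequence_len: 2048
micro_batch_size: 2
gradient_accumulation_steps: 4
num_epochs: 3
learning_rate: 0.0002

lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_modules:
  - q_proj
  - v_proj

output_dir: ./qlora-out
```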


Take a look at the QLoRA repo https://github.com/artidoro/qlora/ which has an example finetuning Llama. Made by the authors of the QLoRA paper.




