
Interested in the new base model for fine-tuning. Although Llama 3 is the better instruct model overall, it’s been highly resistant to fine-tuning, either because of bugs or because it was trained on so much data (there’s an ongoing debate about this in the community). Mistral’s base models are still best in class among small models you can specialize.

