
The author is discussing fine-tuning a base model. How long that takes depends on the dataset, the method, and the hyperparameters. DPO, for example, can achieve great results in a fraction of the steps that other methods need.
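To make the "it depends" part concrete, here is a rough back-of-the-envelope sketch of how dataset size and hyperparameters turn into optimizer steps and wall-clock time. The function names and every number in it are hypothetical, picked only for illustration; the seconds-per-step figure in particular is something you would have to measure on your own hardware.

```python
import math

def optimizer_steps(num_examples, epochs, per_device_batch, grad_accum=1, num_devices=1):
    """Number of optimizer updates for a run (hypothetical helper)."""
    # Effective batch = examples consumed per optimizer update.
    effective_batch = per_device_batch * grad_accum * num_devices
    # The last partial batch of each epoch still costs one step.
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return steps_per_epoch * epochs

def estimated_hours(steps, seconds_per_step):
    """Wall-clock estimate; seconds_per_step is an assumed, measured value."""
    return steps * seconds_per_step / 3600

# Illustrative numbers only: 10k examples, 3 epochs, batch 4, 8 accumulation steps.
steps = optimizer_steps(10_000, epochs=3, per_device_batch=4, grad_accum=8)
print(steps, round(estimated_hours(steps, seconds_per_step=2.0), 2))
```

Halving the dataset or the epoch count halves the step count, which is why a method that converges in fewer steps changes the picture so much.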

Just as with unsloth or axolotl, people who use this will have to make compromises to get results in a reasonable amount of time.


