
Not only does it seem to be AI generated, it seems these guys don't even know about best practices or even what works. E.g., it contains an archaic comparison of optimizers and their pros and cons, but for LLMs no optimizer other than Adam and newer ones like Lion works.



Is there a paper on this? Why do no other optimizers give good results? Adam requires insane amounts of memory, so alternatives would be welcome.
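For a rough sense of the memory point: Adam keeps two extra values per parameter (the first and second moment estimates), while momentum SGD and Lion keep one. A back-of-the-envelope sketch, where the 7B parameter count, the fp32 state precision, and the helper name `optimizer_state_bytes` are all illustrative assumptions rather than anything from the article:

```python
# Extra state tensors each optimizer keeps per parameter,
# on top of the weights and gradients themselves.
STATE_TENSORS = {
    "sgd": 0,           # plain SGD keeps no optimizer state
    "sgd_momentum": 1,  # one momentum buffer
    "lion": 1,          # Lion tracks only momentum
    "adam": 2,          # first and second moment estimates
}


def optimizer_state_bytes(num_params, optimizer, bytes_per_value=4):
    """Approximate optimizer-state size in bytes (hypothetical helper).

    Assumes states are stored at bytes_per_value each (4 = fp32) and
    ignores weights, gradients, activations, and any sharding.
    """
    return num_params * STATE_TENSORS[optimizer] * bytes_per_value


if __name__ == "__main__":
    n = 7_000_000_000  # assumed model size, for illustration only
    for name in STATE_TENSORS:
        print(f"{name}: {optimizer_state_bytes(n, name) / 1e9:.0f} GB")
```

Under these assumptions Adam's state alone is twice the size of the fp32 weights, which is why single-state optimizers are attractive if they can match its results.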



