
https://arxiv.org/abs/1803.05407

"Averaging Weights Leads to Wider Optima and Better Generalization"

Weirdly, the averaging doesn't have to be synchronous.
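
For anyone wondering how that could work, here is a rough sketch of one possible reading, and it is my assumption rather than anything spelled out in the paper or the comment above: if the average is kept as a running mean, each weight snapshot can be folded in whenever it happens to arrive, and the final result does not depend on arrival order. The class name `RunningWeightAverage` and the flat-list weight format are hypothetical, chosen only for illustration.

```python
class RunningWeightAverage:
    """Incremental mean of weight snapshots (hypothetical helper, not from the paper)."""

    def __init__(self):
        self.mean = None   # current averaged weights, as a flat list of floats
        self.count = 0     # number of snapshots folded in so far

    def update(self, weights):
        """Fold one snapshot into the running mean and return the new mean."""
        self.count += 1
        if self.mean is None:
            self.mean = [float(w) for w in weights]
        else:
            # Standard incremental-mean update: mean += (w - mean) / n
            self.mean = [m + (w - m) / self.count
                         for m, w in zip(self.mean, weights)]
        return self.mean


def average_snapshots(snapshots):
    """Average snapshots in whatever order they are given."""
    avg = RunningWeightAverage()
    for w in snapshots:
        avg.update(w)
    return avg.mean


if __name__ == "__main__":
    snaps = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
    print(average_snapshots(snaps))
    print(average_snapshots(list(reversed(snaps))))
```

Both calls should agree up to floating-point rounding, which is the sense in which the snapshots need no coordination about when they are averaged. Whether that matches what the commenter had in mind is a guess.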



