tootyskooty | 30 days ago | on: What's the strongest AI model you can train on a l...
I suspect one can go a lot further by adopting some tweaks from the GPT-2 speedrun effort [0]: at minimum the Muon optimizer, better initialization, and a carefully tuned learning rate.
[0]: https://github.com/KellerJordan/modded-nanogpt
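
For readers unfamiliar with Muon, here is a rough sketch of the core idea as I understand it: keep a momentum buffer for each 2D weight matrix, then approximately orthogonalize that buffer before applying it as the update. This is not the code from [0]. It is dependency-free Python, it uses the classic cubic Newton-Schulz iteration as a simplified stand-in for whatever tuned iteration the repo uses, and the names `orthogonalize` and `muon_like_step` as well as the `lr`, `beta`, and `steps` defaults are hypothetical values chosen only for illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def orthogonalize(g, steps=15):
    """Approximate the orthogonal polar factor of g.

    Uses the cubic Newton-Schulz iteration X <- 1.5*X - 0.5*X*X^T*X,
    a simplified stand-in (assumption), not the repo's tuned variant.
    """
    # Scale by the Frobenius norm so every singular value is at most 1,
    # which keeps the iteration inside its convergence region.
    norm = math.sqrt(sum(v * v for row in g for v in row)) or 1.0
    x = [[v / norm for v in row] for row in g]
    for _ in range(steps):
        xxt_x = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * a - 0.5 * b for a, b in zip(rx, rb)] for rx, rb in zip(x, xxt_x)]
    return x


def muon_like_step(weight, grad, momentum_buf, lr=0.02, beta=0.95):
    """One illustrative update: momentum, then orthogonalize, then step.

    lr and beta are arbitrary illustrative defaults, not tuned values.
    """
    buf = [[beta * m + g for m, g in zip(rm, rg)] for rm, rg in zip(momentum_buf, grad)]
    update = orthogonalize(buf)
    new_weight = [[w - lr * u for w, u in zip(rw, ru)] for rw, ru in zip(weight, update)]
    return new_weight, buf
```

The point of the orthogonalization is that the update's singular values are all pushed toward 1, so its overall scale no longer depends on the raw gradient magnitude. The learning rate still needs tuning on top of that.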