Hacker News

So glad you like it! If I understand your question correctly, yes, we are also putting together a small library for training small language models. It's not mature at all yet, but you can keep up with our progress here: https://github.com/danbraunai/simple_stories_train



Yeah. I looked at the dataset, and there are a lot of possible tasks you could train against here, since it has some great annotations. Having a simple reference baseline, like a GPT-2 pretraining run (which I think your repo is set up to do), gives a starting point for other work. The dataset looks small enough, and the GPT-2 reference code in your repo lightweight enough, to do a quick run and plot some curves. Thanks!
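To make the "quick run, plot some curves" idea concrete, here is a minimal sketch of a baseline-plus-loss-curve loop. It deliberately does not use the repo's actual GPT-2 code or the real dataset; it trains a toy character-level bigram model in NumPy on a made-up corpus, just to show the shape of the workflow (tokenize, train, record a loss curve). The corpus, learning rate, and step count are all illustrative assumptions.

```python
import numpy as np

# Hypothetical stand-in corpus; a real baseline would use the actual dataset.
corpus = "the cat sat on the mat. the dog sat on the log. "

# Character-level vocabulary and next-character targets.
chars = sorted(set(corpus))
stoi = {c: i for i, c in enumerate(chars)}
V = len(chars)
ids = np.array([stoi[c] for c in corpus])
xs, ys = ids[:-1], ids[1:]

rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.01, size=(V, V))  # bigram logits: row = current char

losses = []
lr = 1.0  # illustrative hyperparameter, not tuned
for step in range(200):
    # Forward pass: softmax over next-character logits.
    logits = W[xs]                                  # (N, V)
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    loss = -np.log(probs[np.arange(len(ys)), ys]).mean()
    losses.append(loss)

    # Backward pass: gradient of mean cross-entropy w.r.t. the logits.
    grad = probs.copy()
    grad[np.arange(len(ys)), ys] -= 1.0
    grad /= len(ys)
    np.add.at(W, xs, -lr * grad)  # accumulate updates per bigram row

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

The `losses` list is exactly what you would hand to a plotting library to get the curve; swapping the bigram model for a small GPT-2 changes the forward/backward passes but not this outer structure.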





