Hacker News

So glad you like it! If I understand your question correctly, yes, we are also putting together a small library for training small language models. It's not mature at all yet, but you can keep up with our progress here: https://github.com/danbraunai/simple_stories_train



Yeah. I looked at the dataset, and there are a lot of possible tasks you could train against here, since it has some great annotations. Having a simple reference baseline, like a GPT-2 pretraining run (which I think your repo is set up to do), gives a starting point for other work. The dataset looks small enough, and the GPT-2 reference code in your repo lightweight enough, to do a quick run and plot some curves. Thanks!
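To make the "quick run, plot some curves" idea concrete, here is a minimal sketch of a baseline-plus-loss-curve loop. It deliberately does not use the repo's actual GPT-2 code or the real dataset; it trains a toy character-level bigram model in NumPy on a made-up corpus, just to show the shape of the workflow (tokenize, train, record a loss curve). The corpus, learning rate, and step count are all illustrative assumptions.

```python
import numpy as np

# Hypothetical stand-in corpus; a real baseline would use the actual dataset.
corpus = "the cat sat on the mat. the dog sat on the log. "

# Character-level vocabulary and next-character targets.
chars = sorted(set(corpus))
stoi = {c: i for i, c in enumerate(chars)}
V = len(chars)
ids = np.array([stoi[c] for c in corpus])
xs, ys = ids[:-1], ids[1:]

rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.01, size=(V, V))  # bigram logits: row = current char

losses = []
lr = 1.0  # illustrative hyperparameter, not tuned
for step in range(200):
    # Forward pass: softmax over next-character logits.
    logits = W[xs]                                  # (N, V)
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    loss = -np.log(probs[np.arange(len(ys)), ys]).mean()
    losses.append(loss)

    # Backward pass: gradient of mean cross-entropy w.r.t. the logits.
    grad = probs.copy()
    grad[np.arange(len(ys)), ys] -= 1.0
    grad /= len(ys)
    np.add.at(W, xs, -lr * grad)  # accumulate updates per bigram row

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

The `losses` list is exactly what you would hand to a plotting library to get the curve; swapping the bigram model for a small GPT-2 changes the forward/backward passes but not this outer structure.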





