
The source I linked is the PyTorch model, which should be all you need to run some epochs. IDK what the pretraining scripts are.


Doesn't a training script need at least a training loop? Loss calculation? An optimizer? The script you linked contains none of those; pretty sure it's for inference only.


Oof, you're right: no loss function or optimizer in place, so you'd need to add those, plus pull in data and a tokenizer, to get a training loop going.

Apologies, you are right and I was wrong. I would edit my comments, but they're past the edit window, so I'll leave a comment accordingly.
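
For anyone wondering what the missing pieces look like, here is a rough sketch of a training loop wrapped around a model's forward pass. Everything in it is an assumption for illustration: `TinyLM` is a hypothetical stand-in and not the linked model, the random token IDs stand in for real tokenized data, and the next-token cross-entropy loss and AdamW settings are just common defaults, not whatever the actual pretraining used.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyLM(nn.Module):
    """Hypothetical stand-in model; swap in the real model class here."""

    def __init__(self, vocab_size=32, dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, tokens):
        # (batch, seq) token IDs -> (batch, seq, vocab) logits
        return self.head(self.embed(tokens))


def train(steps=20, vocab_size=32, seed=0):
    torch.manual_seed(seed)
    model = TinyLM(vocab_size)

    # Missing piece 1: an optimizer (AdamW is an assumed choice).
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-2)

    # Missing piece 2: data. Random IDs stand in for tokenizer output;
    # a real run would iterate over batches from a DataLoader.
    batch = torch.randint(0, vocab_size, (4, 9))
    inputs, targets = batch[:, :-1], batch[:, 1:]  # predict the next token

    losses = []
    model.train()
    for _ in range(steps):
        logits = model(inputs)
        # Missing piece 3: a loss. Flatten to (batch*seq, vocab) vs (batch*seq,).
        loss = F.cross_entropy(
            logits.reshape(-1, vocab_size), targets.reshape(-1)
        )
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses


if __name__ == "__main__":
    history = train()
    print(f"first loss {history[0]:.3f}, last loss {history[-1]:.3f}")
```

This only overfits one fixed batch, so it shows the moving parts and nothing more. A real setup would still need the dataset, the matching tokenizer, checkpointing, and a learning-rate schedule.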



