Hacker News

It's trained on Twitter data so I assume Reddit data as well.

Honestly, both feel like pretty important datasets to ingest if you're trying to build a model of human speech. I reckon social media, comment sections, and the like have the most natural human conversational text online.


