Hacker News

30B models are in no way comparable to GPT-4, or even to GPT-3. There is no spatial comprehension in models with fewer than ~125B parameters (or at least I've had no access to such a model). The 130B GLM seems really interesting as a crowd-sourced starting point, though, as does the 176B BLOOMZ, which requires additional training (it is underfitted as hell). BLOOMZ was better than GPT-3.5 for sure, but yeah, underfitted...

