Hacker News

Most of the shitty behavior of LLMs on syntactic and lexical tasks is due to the tokenizer, not the LLM itself. Even tiny changes in tokenization have massive downstream effects on LLM behavior.
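A toy sketch of why this happens (hypothetical vocabulary and a greedy longest-match rule, not any real model's tokenizer): the model never sees characters, only subword tokens, and a one-character change in the input can reshuffle the token boundaries entirely.

```python
# Hypothetical subword vocabulary for illustration only.
VOCAB = {"straw", "berry", "berr", "raw", "s", "t", "r", "a", "w", "b", "e", "y"}

def tokenize(text):
    """Greedy longest-match segmentation over VOCAB."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest remaining prefix first, shrinking until a match.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in VOCAB:
                tokens.append(piece)
                i = j
                break
        else:
            tokens.append(text[i])  # unknown character falls back to itself
            i += 1
    return tokens

print(tokenize("strawberry"))  # ['straw', 'berry']
print(tokenize("strawberr"))   # ['straw', 'berr'] -- one dropped char, new segmentation
```

The model receives opaque IDs for `straw` and `berry`, so a question like "how many r's are in strawberry?" asks it to reason about characters it never directly observed, which is why letter-counting and spelling tasks degrade in ways that have little to do with the LLM's reasoning ability.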

