Hacker News

Most of the shitty behavior of LLMs on syntactic and lexical tasks is due to the tokenizer, not the LLM itself. Even tiny changes in tokenization have massive downstream effects on LLM behavior.
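A toy sketch of why this happens (hypothetical vocabulary and a greedy longest-match rule, not any real model's tokenizer): the model never sees characters, only subword tokens, and a one-character change in the input can reshuffle the token boundaries entirely.

```python
# Hypothetical subword vocabulary for illustration only.
VOCAB = {"straw", "berry", "berr", "raw", "s", "t", "r", "a", "w", "b", "e", "y"}

def tokenize(text):
    """Greedy longest-match segmentation over VOCAB."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest remaining prefix first, shrinking until a match.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in VOCAB:
                tokens.append(piece)
                i = j
                break
        else:
            tokens.append(text[i])  # unknown character falls back to itself
            i += 1
    return tokens

print(tokenize("strawberry"))  # ['straw', 'berry']
print(tokenize("strawberr"))   # ['straw', 'berr'] -- one dropped char, new segmentation
```

The model receives opaque IDs for `straw` and `berry`, so a question like "how many r's are in strawberry?" asks it to reason about characters it never directly observed, which is why letter-counting and spelling tasks degrade in ways that have little to do with the LLM's reasoning ability.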

