Predicting the next character alone cannot achieve this kind of compression, bec...

hxypqr 9 months ago | parent | context | favorite | on: GPT-4.5 or GPT-5 being tested on LMSYS?

Predicting the next character alone cannot achieve this kind of compression, because the probability distribution obtained from the training results is related to the corpus, and multi-scale compression and alignment cannot be fully learned by the backpropagation of this model