If you take "cost-effective" to mean "feasible with today's tech", then maybe. As in: if we fed a model all the raw data without tokenizing, we'd need more powerful, more expensive hardware, and training on that raw data set would take years or decades.
But without it being done, it's an unproven hypothesis at best.
It wouldn't take years or decades of compute to train a language model that doesn't tokenize text first. And it's not an "unproven hypothesis", because it's already been done. Tokenizing is just a good deal more cost-effective, which is why those exercises haven't gone beyond research novelty.