The author had an LLM help them build a tree of words; the algo chooses which node we're at and offers that node's children as completions. It's clever and cute, but it's not even close to an LLM.
It's very far off, like "not even wrong" in the Pauli sense of the phrase.
There are a lot of abstractions one can use for this stuff, and I think the one you're reaching for is "text predictor"?
If you roll with that, then you're in a position where you're saying GPT-2-class LLMs were very close in 1960, because at the end of the day, this project is just a dictionary lookup with a string key and a value of list<string> completions. That confuses rather than illuminates.
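To make the "dictionary lookup" framing concrete, here's a rough sketch of what I mean. This is not the author's code; `build_table`, `complete`, the two-word context, and the toy corpus are all made up for illustration.

```python
from collections import defaultdict

def build_table(text, order=2):
    """Map each `order`-word context to the list of words seen right after it."""
    words = text.split()
    table = defaultdict(list)
    for i in range(len(words) - order):
        key = " ".join(words[i:i + order])
        nxt = words[i + order]
        if nxt not in table[key]:  # keep each completion once, in first-seen order
            table[key].append(nxt)
    return dict(table)

def complete(table, prompt, order=2):
    """Look up the last `order` words of the prompt; no match means no suggestions."""
    key = " ".join(prompt.split()[-order:])
    return table.get(key, [])

# Hypothetical toy corpus, purely for illustration.
corpus = "the cat sat on the mat and the cat sat on the sofa"
table = build_table(corpus)

print(complete(table, "the cat"))  # ['sat']
print(complete(table, "on the"))   # ['mat', 'sofa']
```

The whole "model" is a plain dict from string to list of strings, which is the point: nothing here needed hardware or ideas that weren't available decades ago.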
The trouble with decision trees for language modeling is that they overfit really hard. They don't do the magical generalization that makes LLMs interesting.
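A quick sketch of the failure mode, assuming a plain word tree (a trie built from nested dicts) as a stand-in for whatever the author actually built. `build_trie`, `suggest`, and the two training sentences are hypothetical.

```python
def build_trie(sentences):
    """Store each sentence as a path of words in a nested-dict tree."""
    root = {}
    for sentence in sentences:
        node = root
        for word in sentence.split():
            node = node.setdefault(word, {})
    return root

def suggest(root, prefix):
    """Walk the tree along the prefix and return the children of the final node."""
    node = root
    for word in prefix.split():
        if word not in node:
            return []  # this exact path was never seen, so there is nothing to offer
        node = node[word]
    return sorted(node)

# Hypothetical training data, purely for illustration.
trie = build_trie([
    "the cat sat on the mat",
    "the dog sat on the rug",
])

print(suggest(trie, "the cat sat on the"))  # ['mat']
print(suggest(trie, "a cat sat on the"))    # []
```

Swapping "the" for "a" at the start is enough to fall off the tree entirely, even though the rest of the prefix matches a training sentence word for word. The tree memorizes paths; it has no notion that the two prefixes are nearly the same thing.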