There was a post on here a few months ago about training with single characters as tokens instead of words, and it worked really well: the model generated new Shakespeare-like text despite never seeing human words as tokens. What a (human) word is can be learned by the model rather than baked into the tokenizer.
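
For anyone curious, character-level tokenization is tiny to implement. Here's a rough sketch of the idea (just my illustration, assuming a setup like Karpathy's char-level Shakespeare demos; the filename and variable names are placeholders, not from the original post):

    # Sketch: character-level tokenization (illustrative, not the original post's code)
    text = open("shakespeare.txt").read()  # placeholder training corpus
    chars = sorted(set(text))              # vocabulary = the unique characters
    stoi = {ch: i for i, ch in enumerate(chars)}  # char -> token id
    itos = {i: ch for ch, i in stoi.items()}      # token id -> char

    encode = lambda s: [stoi[c] for c in s]
    decode = lambda ids: "".join(itos[i] for i in ids)

    ids = encode("To be, or not to be")
    assert decode(ids) == "To be, or not to be"

The vocabulary ends up being maybe ~65 tokens for a Shakespeare corpus instead of tens of thousands of words, and anything word-like has to be learned by the model itself.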