After playing with AI Dungeon, I think you are right about the data not being up to par. It fails more frequently than the news coverage suggests, though it has some brilliant moments too.
For example, when prompted to talk about deep learning it produced a nonsense paragraph. That's not surprising in itself, but it can stay coherent over much longer stretches when generating news or dialogue, which suggests it simply hasn't read much on the topic.
I can hardly make it do any math. Even simple things like 11+22= don't work. I expect the next 10x scale-up will fill most of these holes, especially if the training corpus improves in quality and breadth.