I cannot understand where the boundary lies between some of the "common-yet-boring arguments" and the "real limitations". E.g., the ideas that "You cannot learn anything meaningful based only on form" and "It only connects pieces it's seen before according to some statistics" are "boring", but the fact that models have no knowledge of knowledge, no knowledge of time, and no understanding of how texts relate to each other is "real". These are essentially the same things! This is what people may mean when they proffer their "boring critiques", if you press them hard enough. Of course Yoav, being abreast of the field, knows all the details and can talk about the problem in more concrete terms, but "vague" and "boring" are still different things.
I also cannot fathom how models can develop a sense of time, or structured knowledge of a world consisting of discrete objects, even with a large dose of RLHF, if the internal representations are continuous, layer-normalised, and otherwise incapable of arriving at any hard-ish, logic-like rules. All these models seem to have deep-seated architectural limitations, and they are almost at the limit of the available training data. Being non-vague and positive-minded about this doesn't solve the issue. The models can write polite emails and funny reviews of Persian rugs in haiku, but they are deeply unreasonable and 100% unreliable. There is hardly a solid business or social case for this stuff.
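To make the "continuous representations" point concrete, here is a minimal sketch in plain NumPy (my own toy example, not anything from an actual model): layer normalisation is a smooth map, so nudging the input always nudges the output by a proportionally small amount. There is no threshold anywhere at which the representation "snaps" to a discrete symbol or rule.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Standard layer normalisation: centre the vector and scale to unit variance.
    return (x - x.mean()) / np.sqrt(x.var() + eps)

x = np.array([1.0, 2.0, 3.0, 4.0])
for delta in (0.0, 0.001, 0.01, 0.1):
    y = layer_norm(x + np.array([delta, 0.0, 0.0, 0.0]))
    print(f"delta={delta}: {y.round(4)}")
# The output drifts continuously with the input perturbation; every
# downstream computation built on such representations inherits that
# smoothness, which is the gap between this and hard, logic-like rules.
```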
Same with limericks. When asked to produce villanelles, ChatGPT comes up with perfectly good ones about half the time, and the other half it completely whiffs on the concept. It seems to "know" that sestinas consist of six-line stanzas, but otherwise completely fails to follow the form.
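For what it's worth, the villanelle constraint is mechanical enough that a rough checker fits in a few lines; this is my own hypothetical sketch of the structure being whiffed (19 lines; line 1 recurs as lines 6, 12, 18; line 3 as lines 9, 15, 19), not a claim about how anyone evaluates these models:

```python
def looks_like_villanelle(poem: str) -> bool:
    """Rough structural check: 19 non-empty lines, with the two refrains
    (lines 1 and 3) recurring where the form demands them."""
    lines = [l.strip().lower().rstrip(".,;:!?")
             for l in poem.splitlines() if l.strip()]
    if len(lines) != 19:
        return False
    refrain_a, refrain_b = lines[0], lines[2]
    # Refrain A must reappear as lines 6, 12, 18 (0-indexed 5, 11, 17);
    # refrain B as lines 9, 15, 19 (0-indexed 8, 14, 18).
    return (all(lines[i] == refrain_a for i in (5, 11, 17))
            and all(lines[i] == refrain_b for i in (8, 14, 18)))
```

The point being: "half the time" failures on a check this simple are exactly the kind of hard, discrete constraint the models don't reliably internalise.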