Being able to extrapolate beyond mere variations of the training data.
EDIT: A simpler example might be helpful. We could, for example, train a network to recognize and predict orbital trajectories. Feed it either raw images or processed position-and-magnitude readings, and it outputs predicted future observations. One could ask, "does it really understand orbital mechanics, or is it merely finding an efficient compression of the solution space?"
But this question can be reduced in such a way as to be made empirical, by presenting the network with a challenge that requires real understanding to solve. For example, show it observations of an interstellar visitor on a hyperbolic trajectory, when ALL of its training data consisted of observations of objects in elliptical orbits exhibiting periodic motion. If it is simply matching observations to its training data, it will be unable to conceive that the interstellar visitor is not also on a periodic trajectory. But if it really understood what it was seeing, then it would recognize (as Kepler and Newton did) that elliptical motion requires velocities bounded by an upper limit, and that if that speed is exceeded the object will follow a hyperbolic path away from the system, never to return. It might not conceive these notions analytically the way a human would, but an equivalent generalized model of planetary motion must be encoded in the network if it is to give accurate answers to questions posed so far outside its training data.
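The velocity threshold in question is just the sign of the specific orbital energy: below escape speed the orbit is a bound ellipse, above it the trajectory is hyperbolic. A minimal sketch of that criterion, assuming normalized units with gravitational parameter mu = 1 (both names are illustrative, not from the original):

```python
import math

MU = 1.0  # gravitational parameter GM in normalized units (assumption for illustration)

def specific_orbital_energy(r, v):
    """epsilon = v^2/2 - mu/r.
    epsilon < 0 -> bound (elliptical, periodic) orbit;
    epsilon > 0 -> unbound (hyperbolic) trajectory, never to return."""
    return 0.5 * v**2 - MU / r

def escape_velocity(r):
    """The upper speed limit for elliptical motion at radius r."""
    return math.sqrt(2.0 * MU / r)

# An object at r = 1 moving slower than escape speed (sqrt(2) here) is bound...
assert specific_orbital_energy(1.0, 1.0) < 0
# ...while an 'interstellar visitor' exceeding it is on a hyperbolic path.
assert specific_orbital_energy(1.0, 1.5) > 0
```

A network trained only on bound orbits would have to encode something equivalent to this sign test to extrapolate correctly to the hyperbolic case.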
How to translate this into AlphaFold's domain I'm not so certain, as I lack the domain knowledge. But a practical ramification would be applying AlphaFold to novel protein engineering. If AlphaFold lacks "real understanding", then its prediction quality will deteriorate as it is presented with protein sequences further and further removed from its training data, which presumably consists only of naturally evolved biological proteins. Artificial design is not as constrained as Darwinian evolution, so de novo engineered proteins are more likely to diverge from AlphaFold's training data. But if AlphaFold has an actual, generalized understanding of the problem domain, then it should remain accurate for these use cases.