... and perhaps an LLM that shares a latent space with the visual model, since those are apparently (and surprisingly) easy to map onto each other (at least per that one paper that popped up the other day).
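To make the "easily mapped" claim concrete: if two models embed the same concepts in a roughly linear relationship, a plain least-squares fit on a handful of paired anchor embeddings is enough to recover the map. This is a toy sketch with synthetic embeddings, not the method from any particular paper; the dimensions and noise level are made up.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 16

# Pretend "model B" embeds concepts as a (noisy) linear image of "model A".
true_map = rng.standard_normal((d, d))
A = rng.standard_normal((64, d))                           # anchor embeddings, model A
B = A @ true_map.T + 0.01 * rng.standard_normal((64, d))   # same concepts, model B

# Fit B ~= A @ M from the anchors alone.
M, *_ = np.linalg.lstsq(A, B, rcond=None)

# The fitted map generalizes to an embedding it never saw.
held_out = rng.standard_normal(d)
err = np.linalg.norm(held_out @ M - true_map @ held_out)
print(err < 0.5)
```

The interesting part is how few anchor pairs you need when the relationship really is close to linear.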
The bit that's missing is online learning. There's only so much you can keep bouncing around in working memory (the context windows of all the component models) - eventually you want to "fix" some of that context by altering the models' weights (a kind of gradual fine-tuning?).
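A minimal sketch of that consolidation idea, with a toy linear model standing in for the real thing: "working memory" is a buffer of recent (input, target) pairs, and a few gradient steps bake them into the weights so the buffer can be cleared. Everything here (the buffer, the model, the learning rate) is an invented placeholder, not a real fine-tuning recipe.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Working memory": recent context held as explicit (input, target) pairs.
buffer = [(rng.standard_normal(8), rng.standard_normal(8)) for _ in range(16)]

# The "model" is just a weight matrix for this sketch.
W = rng.standard_normal((8, 8)) * 0.1

def loss(W):
    return sum(float(np.mean((W @ x - y) ** 2)) for x, y in buffer) / len(buffer)

def consolidate(W, lr=0.01, steps=200):
    """Gradual fine-tune: shift the buffered context into the weights."""
    for _ in range(steps):
        for x, y in buffer:
            grad = 2 * np.outer(W @ x - y, x) / len(x)  # dMSE/dW for one pair
            W = W - lr * grad
    return W

before = loss(W)
W = consolidate(W)
after = loss(W)
print(after < before)  # True: the "memories" now live in the weights
```

The real version would presumably use something gentler than raw SGD on everything (adapters, replay buffers, etc.) to avoid clobbering what the model already knows.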
Our brains have different areas with different functions… so why wouldn't a good AI have them too?
Maybe an LLM for an internal monologue, maybe two or three that debate each other realistically, then a computer vision model to process visual input…
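The debate part is really just an orchestration loop over a shared transcript. Here's a skeleton of that pattern, where `agent_a` and `agent_b` are hypothetical stand-ins for actual LLM calls (in a real system each would send the transcript to a model and return its reply):

```python
def agent_a(transcript):
    # placeholder: a real agent would call an LLM with the transcript as context
    return f"A (turn {len(transcript)}): counterpoint to the last message"

def agent_b(transcript):
    return f"B (turn {len(transcript)}): rebuttal to the last message"

def debate(opening, rounds=3):
    """Alternate two agents over a shared transcript for a fixed number of rounds."""
    transcript = [opening]
    agents = [agent_a, agent_b]
    for turn in range(rounds * 2):
        transcript.append(agents[turn % 2](transcript))
    return transcript

for line in debate("Moderator: should the model consolidate context into weights?"):
    print(line)
```

The vision model and the internal monologue would plug into the same loop by appending their outputs to the shared transcript.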