
From a brief look at the paper, they are doing gradient descent on the architecture based on validation loss, which is good for efficiency, but it's not groundbreaking. The problem is that you are still training towards a target of a correct answer. I don't think this is going to be applicable in the future, in the sense that we will have to train on other things (like logical consistency somehow encoded into the network) as well as correct answers.
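
To make "gradient descent on the architecture based on validation loss" concrete, here is a toy sketch of the general idea, not the paper's actual method. It assumes a single edge whose output is a softmax-weighted mix of three made-up candidate ops, one scalar weight `w` trained on a training split, and architecture logits `alpha` updated on a validation split, alternating one step each. All names and data here are hypothetical.

```python
import math

# Hypothetical candidate operations for one "edge" of the network.
OPS = {
    "identity": lambda x: x,
    "square": lambda x: x * x,
    "zero": lambda x: 0.0,
}
NAMES = list(OPS)

def softmax(alpha):
    m = max(alpha)
    exps = [math.exp(a - m) for a in alpha]
    total = sum(exps)
    return [e / total for e in exps]

def mixed(x, probs):
    # Continuous relaxation: weighted sum of all candidate ops.
    return sum(p * OPS[n](x) for p, n in zip(probs, NAMES))

def loss(data, w, alpha):
    probs = softmax(alpha)
    return sum((w * mixed(x, probs) - t) ** 2 for x, t in data) / len(data)

def grad_w(data, w, alpha):
    probs = softmax(alpha)
    return sum(2 * (w * mixed(x, probs) - t) * mixed(x, probs)
               for x, t in data) / len(data)

def grad_alpha(data, w, alpha):
    probs = softmax(alpha)
    grads = [0.0] * len(alpha)
    for x, t in data:
        m = mixed(x, probs)
        err = w * m - t
        for j, name in enumerate(NAMES):
            # For a softmax mixture: d mixed / d alpha_j = p_j * (op_j(x) - mixed(x))
            grads[j] += 2 * err * w * probs[j] * (OPS[name](x) - m)
    return [g / len(data) for g in grads]

def search(steps=2000, lr=0.05):
    target = lambda x: 2.0 * x * x  # assumed ground truth for the toy task
    train = [(x, target(x)) for x in (-1.0, -0.5, 0.5, 1.0)]
    val = [(x, target(x)) for x in (-0.75, -0.25, 0.25, 0.75)]
    w, alpha = 1.0, [0.0, 0.0, 0.0]
    initial = loss(val, w, alpha)
    for _ in range(steps):
        # Weights follow the training loss ...
        w -= lr * grad_w(train, w, alpha)
        # ... architecture logits follow the validation loss.
        g = grad_alpha(val, w, alpha)
        alpha = [a - lr * gj for a, gj in zip(alpha, g)]
    return alpha, w, initial, loss(val, w, alpha)

alpha, w, before, after = search()
best = NAMES[max(range(len(alpha)), key=lambda j: alpha[j])]
print("selected op:", best, "| val loss before/after:", before, after)
```

Note that both updates are still driven by a squared error against a known correct answer, which is exactly the limitation raised above: the search changes what gets optimized over, not the kind of target.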


Your expectations are pretty high. Differentiable architecture search as you mentioned in the original comment is one thing; going beyond empirical risk minimization-based learning is another thing entirely. In fact, they seem mostly orthogonal.

That aside, it seems like AI has had the most empirical success by not imposing hard constraints/structure, but letting models learn completely "organically". The computationalists (the folks who have historically been more into this "AI has to have things like logical consistency embedded into its structure" kind of thinking) seem to have basically lost, empirically. Who even knows what Soar[1] is nowadays? Maybe some marriage of the two paradigms will lead to better results, but I doubt that things will head in that direction anytime soon given how massively far just having parallelizable architectures and adding more parameters has gotten us.

[1] https://en.wikipedia.org/wiki/Soar_(cognitive_architecture)


The expectations are high, but it's not so much orthogonal as more basic. Our brains work on add/multiply/activation; this is well known. But the composition of the neural connection strengths in our brain that makes us us is definitely not trained on any sort of final loss. Or at least not completely.


I'm not sure that AI has been successful recently because of its similarities to the human brain. It seems like the project of making human-like AI (in the sense of models that function similarly to the brain) has had a lot less empirical success than the project of trying to minimize loss on a dataset, whatever that takes. Like, look what happened to Hebbian learning, as you mentioned in your other comment. It's completely absent from models that are seriously trying to beat SOTA on benchmarks.

Like, it really just seems like LLMs are a really good way of doing statistics rather than the closest model we have of the brain/mind, even if there are some connections we can draw post-hoc between transformers and the human brain.
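
For anyone who hasn't run into it, the plain Hebbian rule mentioned above is a purely local update with no loss and no target: a weight grows in proportion to the product of its input and the unit's output. A minimal sketch, assuming a single linear unit and a made-up input pattern (the function name and numbers are illustrative only):

```python
def hebbian_step(weights, x, lr=0.1):
    # Output of a single linear unit.
    y = sum(w * xi for w, xi in zip(weights, x))
    # Hebbian update: delta_w_i = lr * x_i * y. No error signal is involved.
    return [w + lr * y * xi for w, xi in zip(weights, x)]

weights = [0.1, 0.1, 0.1]
pattern = [1.0, 1.0, 0.0]  # inputs 0 and 1 fire together, input 2 stays silent

for _ in range(20):
    weights = hebbian_step(weights, pattern)

print(weights)
```

The weights on the two co-active inputs keep growing while the silent input's weight never moves. The unnormalized rule grows without bound, which is why variants such as Oja's rule add a normalizing term.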



