I love the idea of cognitive AI. I've dabbled with OpenCog (https://en.wikipedia.org/wiki/OpenCog), and in my youth I read "Artificial General Intelligence" (2005) by Ben Goertzel, and I was really convinced.
But it seems like... cognitive AI hasn't paid off? Big & deep Neural Nets are the type of ML/AI that are achieving milestones in learning, gameplay and tasks.
If someone has a strong case to make for cognitive programming, I'd love to hear it. But right now it seems like it's a heuristics-based system that's destined to lose to NNs.
(And before the nitpickers arrive, I totally grant that using heuristics for toy robots makes sense, and is a good way to expose people to programming.)
FWIW, classical algorithms (with a small neural net for eval) are still the strongest approach for chess, consistently beating out more heavily NN-based approaches in TCEC. And I'm not an expert, but I'm pretty sure the strongest AIs for various complex games like Starcraft 2 have a strong cognitive component while using neural nets for particular subtasks.
AlphaGo/MuZero are completely NN-based and were so far ahead of the competition when they were developed that they led to the whole wave of NN-for-eval that we see now. And AlphaGo/MuZero doesn't compete in TCEC.
The chess community (especially the Stockfish programming group) is very focused on improving their own system. I don't think the fact that it is the strongest system really means much - it's pretty clear they are leaving performance on the table. For example, it wasn't until last year (!) that they moved to a GPU-based training system.
AlphaGo/MuZero/LeelaZero are not completely NN-based. They're still Monte Carlo Tree Search. Completely NN would have to be some sort of bitboards-in, moves-out thing, and that's not a thing afaik. My understanding is that they do rely a lot more on the NN, though, sacrificing search depth for the compute.
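Concretely, the AlphaZero-family loop looks roughly like this (a Python sketch; the `game` and `net` objects are illustrative stand-ins, not any real engine's API). The net supplies priors and leaf values, but the outer loop is still explicit tree search over hand-coded legal moves:

    import math

    C_PUCT = 1.5  # exploration constant

    class Node:
        def __init__(self, prior):
            self.prior = prior        # P(s, a) from the policy head
            self.visits = 0           # N(s, a)
            self.value_sum = 0.0      # W(s, a)
            self.children = {}        # move -> Node

        def q(self):
            return self.value_sum / self.visits if self.visits else 0.0

    def select_child(node):
        """Pick the child maximizing Q + U -- the tree search part."""
        total = math.sqrt(sum(c.visits for c in node.children.values()) + 1)
        def puct(item):
            move, child = item
            u = C_PUCT * child.prior * total / (1 + child.visits)
            return child.q() + u
        return max(node.children.items(), key=puct)

    def expand(node, state, game, net):
        """The NN part: priors for the children, value for the leaf."""
        priors, value = net.evaluate(state)      # policy + value heads
        for move in game.legal_moves(state):     # rules are hand-coded
            node.children[move] = Node(priors.get(move, 0.0))
        return value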
LeelaZero does compete, and Stockfish tends to outcompete it. It's my feeling that MCTS, while necessary in go, where the branching factor makes minimax approaches untenable, is not actually a win in chess compared to alpha-beta paired with the decades of heuristics that have been developed. I have some complex reasons for believing this. One of them is that Stockfish's move ordering relies heavily on statistical "learning" heuristics (histories, continuations, the killer move heuristic, etc.) that are probably more powerful if your search is wider. This probably applies to transposition/pawn tables as well.
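To give a flavor of what I mean by statistical "learning" here (toy Python, not Stockfish's actual implementation; `move.piece` and `move.to_square` are illustrative attributes): cheap statistics gathered during the search get reused to sort moves so alpha-beta cuts off sooner.

    history = {}   # (piece, to_square) -> accumulated cutoff score
    killers = {}   # ply -> quiet moves that recently caused cutoffs

    def on_beta_cutoff(move, depth):
        """Update stats whenever a move refutes a line (fails high)."""
        key = (move.piece, move.to_square)
        history[key] = history.get(key, 0) + depth * depth
        killers.setdefault(depth, [])
        if move not in killers[depth]:
            killers[depth] = ([move] + killers[depth])[:2]  # keep two

    def order_moves(moves, depth, tt_move=None):
        """Sort: hash move first, then killers, then by history score."""
        def score(move):
            if move == tt_move:
                return 1_000_000   # transposition-table move
            if move in killers.get(depth, []):
                return 100_000     # killer heuristic
            return history.get((move.piece, move.to_square), 0)
        return sorted(moves, key=score, reverse=True)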
More importantly, I'm sceptical of the amount of useful inference that can be made from a static position in chess vs. go. I barely even know the rules of go, but I am an expert-level chess player. I think go has a lot more static structure, which likely makes larger NNs more useful there than they are in chess. Chess is far more illuminated by specific tactical themes in the position, which are very effectively picked out through quiescence search (qsearch) paired with move-ordering heuristics and caching. With chess, I think there's some point where inferring anything more is just gonna involve search anyway. And you don't want ad hoc search appearing in the eval function, because that won't benefit from tables and move-ordering heuristics.
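For reference, qsearch itself is tiny - something like this sketch, where evaluate/captures/make/unmake are stand-ins for a real engine's functions. Rather than trying to infer tactics statically, it just keeps searching captures until the position goes quiet:

    def qsearch(pos, alpha, beta):
        stand_pat = pos.evaluate()        # static eval as a lower bound
        if stand_pat >= beta:
            return beta                   # already too good: fail high
        alpha = max(alpha, stand_pat)
        for move in pos.captures():       # only "noisy" moves, ordered
            pos.make(move)
            score = -qsearch(pos, -beta, -alpha)
            pos.unmake(move)
            if score >= beta:
                return beta               # beta cutoff
            alpha = max(alpha, score)
        return alpha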
And yes, certainly there's lots more to gain for Stockfish, and their incessant improvement is impressive. And obviously the size of their community is a huge advantage. But I don't actually think MCTS is the way to go for chess, ultimately. Minimax based techniques are just too damn effective, with a sprinkle of NN.
Shogi is an interesting in-between here. Conceptually like chess, but with a much higher branching factor. I'm not at all aware of the state of the art of shogi minimax search, though.
Yes, you are right that MCTS (or some search at least) is an important factor in addition to NN based eval.
If you mean this for "classical algorithms" then sure.
But in the context of AI "classical algorithms" is more like logic programming (eg Prolog) or similar in the "cognitive programming" vein that led off this discussion. I don't think these types of classical algorithms are competitive.
If you look behind logic programming at what powers deduction engines, you will find that the core is very much tree search algorithms like depth-first backtracking search or iterative deepening, which, IIUC, are also central in chess engines.
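Sketched in Python, that shared core looks something like this (generic successors/is_goal callables, nothing domain-specific): a Prolog resolution engine and a chess engine's main loop both have roughly this shape, differing mainly in what the successor and goal functions do.

    def depth_limited(state, depth, successors, is_goal):
        if is_goal(state):
            return [state]
        if depth == 0:
            return None                    # cutoff: backtrack
        for nxt in successors(state):      # try alternatives in order
            path = depth_limited(nxt, depth - 1, successors, is_goal)
            if path is not None:
                return [state] + path      # success propagates up
        return None                        # all alternatives failed

    def iterative_deepening(start, successors, is_goal, max_depth=64):
        for depth in range(max_depth + 1): # re-search deeper each round
            path = depth_limited(start, depth, successors, is_goal)
            if path is not None:
                return path
        return None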
The rest is basically regular programming, in that it's largely bound by how well the programmer models the problem. For a large number of things, the scope of what a human can consider lags behind an optimization process. In chess, it at first seemed like this might be the case too, but the latest performance suggests maybe not this time.
I guess I've never thought of search as being either connectionist or non-connectionist. It's self-evident that people search (eg look up in books) things that aren't in local memory.
I think the distinction is how the looked-up knowledge is integrated.
The non-connectionist, classical, symbolical aspect is not so much "search" per se, but the fact that MCTS performs a form of look-ahead, which requires manually encoding the rules of the game in some fashion.
Determining what the legal moves in a state are, and what state results from a given move, is not something the AI would learn by itself, but is programmed into the system by a human expert.
Indeed, in this latest iteration the network learns a model of the game by itself. Note however that it still uses MCTS and performs a look-ahead, i.e. it is still being "told" by the programmers how to do search (planning), only now it is left to the system to determine how it wants to represent states/actions/policies internally.
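Paraphrasing the MuZero paper (this is the rough shape of the split, not DeepMind's actual code), the division of labor looks like this: three learned functions replace the hand-coded rules, while the search procedure itself is still hand-written MCTS.

    class LearnedModel:
        def representation(self, observation):
            """h: raw board -> internal latent state (learned)."""
            ...

        def dynamics(self, latent, action):
            """g: (latent, action) -> (next latent, predicted reward),
            learned. Replaces the hand-coded 'what state results from
            this move'."""
            ...

        def prediction(self, latent):
            """f: latent -> (policy priors, value) (learned)."""
            ...

    # The planner the programmers still "tell" the system to use:
    # run MCTS over latent states by repeatedly calling dynamics and
    # prediction, exactly as a classical engine walks real game states.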
So I just googled "differentiable Monte Carlo" and it looks like that concept has so far only been applied to ray tracing, but I would be shocked if that (or similar) doesn't become a thing in the next year or three, such that AlphaGo et al become end-to-end differentiable.
There's more to Monte Carlo Tree Search than the Monte Carlo part though. It's a tree search algorithm using Monte Carlo methods to direct the search, so it would still not be a complete NN engine.
Also, there's no reason to expect a "naked" neural net to ever hold up against current chess engines. Engines are already far beyond the strongest humans, who can be seen as naked NN players. Except the NN is far beyond anything we could even dream of building today, both in number of parameters and the complexity of a single neuron, not to mention our ability to learn.
NNs are interesting in areas where human performance is not yet attained. Chess is not one of those. Heuristic algorithms are far stronger than humans, and so it seems strange to me to suggest making engines stronger by making them more and more like humans, just with far less compute, and weaker learning methods. The notion seems inherently ill-conceived to me.
> stronger by making them more and more like humans, just with far less compute, and weaker learning methods. The notion seems inherently ill-conceived to me.
The human brain, while it may have more raw compute than a TPU pod, also has many bottlenecks: per current research the brain runs at a much lower frequency than 2 GHz TPUs, and its inputs are very bottlenecked; you can't feed a brain 1 TB of text in one day like you can a neural network.
You're right, of course. By compute I'm mainly referring to the number and complexity of neurons, not frequency.
You put your finger on something, though, because the bottlenecks of brains are probably the reason why traditional minimax searchers are so good at chess: working memory. Humans just don't have the working memory to search to the depths computers can. Frequency is interesting, but chess is inherently "working memory-bound". If you gave Stockfish a normal time control and a grandmaster two days per move, Stockfish would still win; past about 30 minutes of thought on a position, there's not much more progress a human can make. But if you also gave the GM a second board they were allowed to make moves on, and a notebook, I bet the GM would win, because at that point you're only comparing eval functions, and a GM's eval would still be miles ahead. I wonder if someone's done this experiment.
Being differentiable does not make something a neural network, nor does it mean the differentiated function is meaningful (as can happen with highly branching or otherwise complex control flow). It also doesn't always escape combinatorial explosion, and a fixed structure with limited precision strictly bounds how far a neural net could look ahead.
By making something differentiable, you're hoping a relaxation or some smooth proxy doesn't break the fundamental structure, allowing you to solve an easier problem than discrete combinatorial search. Sometimes there is barely any leverageable structure. This has proved very difficult to achieve in general and is a sort of holy grail in some fields.
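A toy illustration of that trade-off (nothing to do with any particular engine): replacing a hard max with the smooth log-sum-exp proxy gives you gradients everywhere, but the answer is only approximate, and tightening the approximation flattens the gradients again.

    import math

    def hard_max(xs):
        return max(xs)   # piecewise; no useful gradient information

    def soft_max_value(xs, temperature=1.0):
        # log-sum-exp: a smooth upper bound on max(xs)
        t = temperature
        return t * math.log(sum(math.exp(x / t) for x in xs))

    scores = [1.0, 0.9, 0.8]
    print(hard_max(scores))             # 1.0
    print(soft_max_value(scores, 1.0))  # ~2.0: smooth but biased upward
    print(soft_max_value(scores, 0.01)) # ~1.0: tight, but nearly flat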
That said, there are powerful game playing AIs that use either no (DeepNash) or minimal (DORA) search. But in general you can't get something for free, you're always paying something with an approximation.
Whether it's differentiable could depend on the downstream task. Ray tracing of convex textures is likely differentiable; the win/loss function for a given chess move, maybe not.
Oh, that feels so retro. That's the kind of stuff I was taught going through Stanford in the 1980s, at the beginning of the "AI Winter".
I've actually done a blocks world on a real robot (an IBM RS-1, of all things) using that approach, programming in LISP. Yes, it worked. No, it can't deal with an unstructured or dynamic world. Predicate calculus is fine for program proving, but terrible for real world robots.