
I just did the same thing and came here to post about it. Some really cool findings:

It doesn't play like stockfish, but it does seem to play pretty well.

If you specify a move or capture that is nonsensical it seems to always recognize that

You can stop and ask it about the tactical reasoning behind its moves, and it gives what look (to me) like good, specific answers

You can ask it whether it's concerned about an attack or defense you might make, and it will give a good answer. It will even answer accurately if the attack/defense isn't possible.

You can ask how the game is going (who is likely to win) and it will give an accurate answer with good reasoning.

You can ask it to try to meet specific alternative goals, like opening itself up to an en passant attack, and it will attempt to accomplish those goals. This one is really astounding, since you're asking it to play in a way that's strange, and which is probably underrepresented in its training set.



Hi, I'm the author of the blog post. Just wanted to say, I would be very interested in reading more about the experiments you ran on getting GPT-4 to describe its plans. GPT-3's explanations were confidently incorrect, as usual.


Hi, since you're commenting here and appear to be interested in exploring things further, I just wanted to point out that once GPT-4 is available through the API you could use a LangChain Agent[1] to maintain the board state externally and feed it back in automatically with every new move, so that the playing field would be more level in terms of memory. You could also inject instructions about explaining its plans either as system messages or as per-prompt instructions.

[1] https://langchain.readthedocs.io/en/latest/modules/agents.ht...
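In lieu of LangChain specifics, here is a minimal Python sketch of the re-injection idea: the move list lives outside the model, and the full history is rebuilt into every prompt. The API call itself is omitted (you'd send the resulting prompt, or a messages list built from it, to GPT-4 each turn), and the exact instruction wording is just an assumption for illustration:

```python
# Sketch: keep the move history outside the model and re-send the full
# game with every request, so the model's "memory" is never the bottleneck.
# The prompt built here would be passed to a (hypothetical) GPT-4 API call.

def build_prompt(moves: list[str]) -> str:
    """Re-inject the whole move history (in algebraic notation) each turn."""
    history = " ".join(
        f"{i // 2 + 1}.{move}" if i % 2 == 0 else move  # number White's moves
        for i, move in enumerate(moves)
    )
    return (
        "We are playing chess. Moves so far: "
        + (history or "(none)")
        + "\nReply with your next move in algebraic notation, "
          "then briefly explain your plan."
    )

prompt = build_prompt(["e4", "e5", "Nf3"])  # contains "1.e4 e5 2.Nf3"
```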


If you instruct it to use an alternate Chess notation, such as drawing an ASCII (or Unicode?) board every turn, can it do so? Does this affect its play?

Instruct GPT-4 to win the game, then crush it using Stockfish. Does it handle defeat gracefully?
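If you want to experiment with the board-drawing idea, here's a small sketch of the harness side: converting a FEN string (the standard compact board encoding) into an ASCII board you could paste into each prompt. Plain Python, no chess library assumed:

```python
# Sketch: render the piece-placement field of a FEN string as an ASCII board.
# In FEN, ranks are separated by "/" and digits encode runs of empty squares.

def fen_to_ascii(fen: str) -> str:
    placement = fen.split()[0]  # first FEN field is the piece placement
    rows = []
    for rank in placement.split("/"):
        row = []
        for ch in rank:
            if ch.isdigit():
                row.extend("." * int(ch))  # expand empty-square runs to dots
            else:
                row.append(ch)
        rows.append(" ".join(row))
    return "\n".join(rows)

START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
board = fen_to_ascii(START)  # 8 lines; top line is "r n b q k b n r"
```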


Strangely, GPT4 refuses to draw a chessboard while earlier versions were up for it.

I do hope the chat interface improves soon. It seems like low-hanging fruit compared to everything else that's been accomplished ... it will be like the jump from telnet/IRC/FTP to web browsers.


(I'm referring to ChatGPT's interface specifically, of course. Third party clients could use GPT4 to build a GUI chessboard.)


Not -4, but I tested the ASCII gameboard idea a bunch on ChatGPT (3.5) and could not make any headway. It would consistently announce a move and then print out the same gameboard it had been given (i.e. the game state before its move).

It sounds like -4 should be much better though; I wonder if having the gameboard would improve its gameplay?


> I wonder if having the gameboard would improve its gameplay?

If it's an addition, then sure, it seems like it wouldn't hurt. If it's to replace the algebraic notation, then I'd say very likely worse. All of its input/experience w.r.t. chess has been in the language of algebraic chess notation. Swapping to providing only a game board would presumably be worse.

It'd be like if all of our interactions had been in English in a Latin script, and then someone replaced it with emojis. We don't think in emojis; it wouldn't help us, even if that's how the other party is used to thinking.

While we play chess by looking at the board state, it's likely that little of ChatGPT's input is in that format.

And the only other concern would be memory: would it "forget" the prior moves? I think not, simply because of how the GPT chat APIs work: all prior tokens are passed back in on each request, so it receives the full game as input on every move.
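For illustration, the accumulation pattern looks roughly like this. The shape follows the chat-completions message format; the model's replies are hand-supplied here rather than fetched from an API, so only the accumulation itself is shown:

```python
# Sketch: a chat client keeps appending to one messages list and re-sends
# all of it with every request, which is why the model "sees" the whole game.

messages = [{"role": "system", "content": "You are playing chess as Black."}]

def play_turn(user_move: str, model_reply: str) -> None:
    # In a real client, model_reply would come back from the API call;
    # it is passed in by hand here so no network access is needed.
    messages.append({"role": "user", "content": user_move})
    messages.append({"role": "assistant", "content": model_reply})

play_turn("1. e4", "1... e5")
play_turn("2. Nf3", "2... Nc6")
# messages now holds the system prompt plus all four moves, in order.
```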


GPT playing chess through a visual (or visual-like) interface would be like us playing chess through notation. Same game, different view.

Of course, some humans can play chess quite well while blindfolded.


GPT4 can take image inputs


However, I believe this functionality has not yet been released to people outside OpenAI.


... and companies with privileged access (Microsoft, Midjourney V5, etc.)

There is a waiting list https://openai.com/waitlist/gpt-4-api


Interesting idea; naively, that seems possible based on other non-chess example prompts I've seen. It would be a great way to understand its tracking of state throughout the game.


My guess is that the OpenAI team saw the (viral) videos ridiculing ChatGPT's chess abilities and added some chess training dataset.


They finished training GPT-4 in the middle of last year, before ChatGPT was released.


There's definitely some "Mechanical Turk"-ing going on behind the scenes.

I enjoy the ChatGPT jailbreaking scene, and almost every time a significant break is found, ChatGPT will dance around it within a day (take the big token-leak jailbreak from a few days ago). The GPT-4 model itself may not be retrained, but it seems fairly evident that the ability to apply supplemental rules/fine-tuning is baked into ChatGPT's architecture.


Even with Mechanical Turking, it's still an impressive claim. I have not seen any experiments demonstrating that autoregressive language models can play reliably at any level, let alone at superhuman levels. Until I see evidence to the contrary, I will assume it's not true.


Have you tried asking GPT-4, after every move, to identify the checks, captures, and attacks in the position, to see how much applying that method helps its Elo rating? You could probably also ask it to render the board as text; that should help GPT-4 "visualize" the board.


The claims are worthless without actual games and chat logs.


Is there a reason why such AIs can't offload to Stockfish? To me, the maximum power of AI would be achieved when it knows when to hand off to a better tool. Is it a matter of ChatGPT not knowing what it doesn't know, or simply that the devs haven't integrated it with other tools (yet)?


We are starting to get there: https://viper.cs.columbia.edu/



