
I just did the same thing and came here to post about it. Some really cool findings:

It doesn't play like stockfish, but it does seem to play pretty well.

If you specify a move or capture that is nonsensical it seems to always recognize that

You can stop and ask it about the tactical reasoning behind its moves, and it gives what look (to me) like good, specific answers

You can ask it whether it's concerned about an attack or defense you might make, and it will give a good answer. It will even answer accurately if the attack/defense isn't possible.

You can ask how the game is going (who is likely to win) and it will give an accurate answer with good reasoning.

You can ask it to try to meet specific alternative goals, like opening itself up to an en passant attack, and it will attempt to accomplish those goals. This one is really astounding, since you're asking it to play in a way that's strange, and which is probably underrepresented in its training set.



Hi, I'm the author of the blog post. Just wanted to say, I would be very interested in reading more about the experiments you ran on getting GPT-4 to describe its plans. GPT-3's explanations were confidently incorrect, as usual.


Hi, since you're commenting here and appear to be interested in exploring things further, I just wanted to point out that once GPT-4 is available through the API you could use a LangChain Agent[1] to maintain the board state externally and feed it back in automatically with every new move, so that the playing field would be more level in terms of memory. You could also inject instructions about explaining its plans either as system messages or as per-prompt instructions.

[1] https://langchain.readthedocs.io/en/latest/modules/agents.ht...
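In lieu of LangChain specifics, here is a minimal Python sketch of the re-injection idea: the move list lives outside the model, and the full history is rebuilt into every prompt. The API call itself is omitted (you'd send the resulting prompt, or a messages list built from it, to GPT-4 each turn), and the exact instruction wording is just an assumption for illustration:

```python
# Sketch: keep the move history outside the model and re-send the full
# game with every request, so the model's "memory" is never the bottleneck.
# The prompt built here would be passed to a (hypothetical) GPT-4 API call.

def build_prompt(moves: list[str]) -> str:
    """Re-inject the whole move history (in algebraic notation) each turn."""
    history = " ".join(
        f"{i // 2 + 1}.{move}" if i % 2 == 0 else move  # number White's moves
        for i, move in enumerate(moves)
    )
    return (
        "We are playing chess. Moves so far: "
        + (history or "(none)")
        + "\nReply with your next move in algebraic notation, "
          "then briefly explain your plan."
    )

prompt = build_prompt(["e4", "e5", "Nf3"])  # contains "1.e4 e5 2.Nf3"
```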


If you instruct it to use an alternate Chess notation, such as drawing an ASCII (or Unicode?) board every turn, can it do so? Does this affect its play?

Instruct GPT-4 to win the game, then crush it using Stockfish. Does it handle defeat gracefully?
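If you want to experiment with the board-drawing idea, here's a small sketch of the harness side: converting a FEN string (the standard compact board encoding) into an ASCII board you could paste into each prompt. Plain Python, no chess library assumed:

```python
# Sketch: render the piece-placement field of a FEN string as an ASCII board.
# In FEN, ranks are separated by "/" and digits encode runs of empty squares.

def fen_to_ascii(fen: str) -> str:
    placement = fen.split()[0]  # first FEN field is the piece placement
    rows = []
    for rank in placement.split("/"):
        row = []
        for ch in rank:
            if ch.isdigit():
                row.extend("." * int(ch))  # expand empty-square runs to dots
            else:
                row.append(ch)
        rows.append(" ".join(row))
    return "\n".join(rows)

START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
board = fen_to_ascii(START)  # 8 lines; top line is "r n b q k b n r"
```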


Strangely, GPT4 refuses to draw a chessboard while earlier versions were up for it.

I do hope the chat interface improves soon. It seems like low-hanging fruit compared to everything else that's been accomplished ... it will be like the jump from telnet/IRC/FTP to web browsers.


(I'm referring to ChatGPT's interface specifically, of course. Third party clients could use GPT4 to build a GUI chessboard.)


Not -4, but I tested the ASCII gameboard idea a bunch on ChatGPT (3.5) and could not make any headway. It would consistently announce a move and then print out the same gameboard it had been given (i.e. the game state before its move).

It sounds like -4 should be much better though; I wonder if having the gameboard would improve its gameplay?


> I wonder if having the gameboard would improve its gameplay?

If it's an addition, then sure, it seems like it wouldn't hurt. If it's to replace the algebraic notation, then I'd say very likely worse. All of its input/experience w.r.t. chess has been in the language of algebraic chess notation. Swapping to providing only a game board would presumably be worse.

It'd be like if all of our interactions had been in English in a Latin script, and then someone replaced it with emojis. We don't think in emojis; it wouldn't help us, even if that's how the other party is used to thinking.

While we play chess by looking at the board state, it's likely that little of ChatGPT's input is in that format.

And the only other concern would be memory: would it "forget" the prior moves? I think not, simply because of how the GPT chat APIs work: all prior tokens are passed back in on each request, so it receives the full game as input on every move.
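For illustration, the accumulation pattern looks roughly like this. The shape follows the chat-completions message format; the model's replies are hand-supplied here rather than fetched from an API, so only the accumulation itself is shown:

```python
# Sketch: a chat client keeps appending to one messages list and re-sends
# all of it with every request, which is why the model "sees" the whole game.

messages = [{"role": "system", "content": "You are playing chess as Black."}]

def play_turn(user_move: str, model_reply: str) -> None:
    # In a real client, model_reply would come back from the API call;
    # it is passed in by hand here so no network access is needed.
    messages.append({"role": "user", "content": user_move})
    messages.append({"role": "assistant", "content": model_reply})

play_turn("1. e4", "1... e5")
play_turn("2. Nf3", "2... Nc6")
# messages now holds the system prompt plus all four moves, in order.
```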


GPT playing chess through a visual (or visual-like) interface would be like us playing chess through notation. Same game, different view.

Of course, some humans can play chess quite well while blindfolded.


GPT4 can take image inputs


However, I believe this functionality has not yet been released to people outside OpenAI.


... and companies with privileged access (Microsoft, Midjourney V5, etc.)

There is a waiting list https://openai.com/waitlist/gpt-4-api


Interesting idea; naively, that seems possible based on other non-chess example prompts I've seen. It would be a great way to understand its tracking of state throughout the game.


My guess is that the OpenAI team saw the (viral) videos ridiculing ChatGPT's chess abilities and added some chess training dataset.


They finished training GPT-4 in the middle of last year, before ChatGPT was released.


There's definitely some "Mechanical Turk"-ing going on behind the scenes.

I enjoy the ChatGPT jailbreaking scene, and almost every time a significant break is found, ChatGPT will dance around it within a day (take the big token-leak jailbreak from a few days ago). The GPT-4 model itself may not be retrained, but it seems fairly evident that the ability to apply supplemental rules/fine-tuning is baked into ChatGPT's architecture.


Even with Mechanical Turking, it's still an impressive claim. I have not seen any experiments demonstrating that autoregressive language models can play reliably at any level, let alone at superhuman levels. Until I see evidence to the contrary, I will assume it's not true.


Have you tried asking GPT-4, after every move, to identify the checks, captures, and attacks in the position, to see how much applying that method helps its Elo rating? You could probably also ask it to render the board as text; that should help GPT-4 "visualize" the board.


The claims are worthless without actual games and chat logs.


Is there a reason why such AIs can't offload to Stockfish? To me, the maximum power of AI would be achieved when it knows when to hand off to a better tool. Is it a matter of ChatGPT not knowing what it doesn't know, or simply that the devs haven't integrated it with other tools (yet)?


We are starting to get there: https://viper.cs.columbia.edu/



