Playing Codenames with Language Graphs and Word Embeddings (semanticscholar.org)
58 points by polm23 on May 16, 2021 | 17 comments



Fun to see what this paper did - and Codenames itself is great!

It's best in person, of course, but it also works really well online with a bunch of friends. There's a free, official online version here: https://codenames.game/

And also, shameless plug, I built a clone in Elixir for fun a couple of weeks ago: https://maciej.gryka.net/building-secretwords


I too built a clone for fun: https://carefulclues.com/

I added a timer and online matchmaking, but I'm not sure the game is suited to matchmaking with strangers: a big part of it is chatting and knowing your teammates and opponents. Otherwise, I find it better to play online even when we're in person, because the physical board doesn't add much. You just need to be able to see it, it's annoying to read from different angles, and it's nicer to glance at on a phone.


The online version is great! Unfortunately, the words version of the game is the inferior one. We have much more fun with the pictures version.


As an ML layperson, I tried the first part a few months ago: using word2vec to find the closest neighbors of word pairs and filtering out clues that were too close to opponent words. It was really effective at finding single-word clues (nearly 100%), but only about 20% effective at finding clues for pairs. Their research finds clues for pairs with 57% effectiveness, but to my layperson reading they did not attempt to find clues for 3 or 4 words, which humans regularly do in this game.
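Roughly, the approach looks like this (a simplified sketch of the idea using gensim's pretrained vectors, not my actual code; names and thresholds are illustrative):

    # Naive word2vec pair-clue search: find candidates near both targets,
    # then drop any clue that sits closer to an opponent word than to the
    # weaker of the two targets.
    import gensim.downloader as api

    kv = api.load("glove-wiki-gigaword-100")  # any pretrained vectors work

    def pair_clue(pair, opponent_words, topn=50):
        for clue, _ in kv.most_similar(positive=list(pair), topn=topn):
            if clue in pair:
                continue
            target_sim = min(kv.similarity(clue, w) for w in pair)
            if all(kv.similarity(clue, o) < target_sim for o in opponent_words):
                return clue
        return None

    print(pair_clue(("river", "money"), ["card", "trade"]))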


Try transformer models, like BERT, for pairs or sentences.

I similarly found that BERT is not as good as word2vec for single-word similarity, but it shines for sentences (where context is important).
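For sentence-level similarity, something like this works (a quick sketch with the sentence-transformers library; the model name is just a common default, not anything from the paper):

    # Contextual sentence similarity with a transformer encoder.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")
    emb = model.encode(["bank of the river", "the bank approved the loan"])
    # Lower than for same-sense sentences: the encoder sees context,
    # so the two senses of "bank" pull the embeddings apart.
    print(util.cos_sim(emb[0], emb[1]))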


I tried this as a project for a master's course a few months ago. I found the performance to be decent but nowhere near what humans can do. I wonder where that 53% figure came from. When I tested the codemaster I wrote against the guesser, they used the same embeddings, so it was able to get a near-perfect score. I ended up creating a few overlapping datasets and adding some variation to better simulate real humans.
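The trap is easy to see in code: if both sides rank with the same vectors, the guesser just inverts the codemaster's search (illustrative sketch, assuming a shared gensim KeyedVectors model kv):

    # Guesser ranks board words by similarity to the clue; with shared
    # embeddings this recovers exactly the words the codemaster targeted.
    def guess(kv, clue, board, n):
        return sorted(board, key=lambda w: -kv.similarity(clue, w))[:n]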



Here's the Python implementation from the paper: https://github.com/divyakoyy/codenames


Did anyone manage to get pip to install the provided requirements.txt? I tried on the debian:buster-slim, ubuntu:latest and python:3 Docker images, but annoy, numpy and/or scipy fail to compile.


NumPy does have wheels available, so in theory you shouldn't need to compile it to install it. You might be running into trouble by being on an unusual architecture or by running an old pip version.

In any case, the compilation is most likely failing due to missing C build dependencies; installing those should resolve the errors.
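For example, something along these lines (an untested sketch; the package list is a guess at the usual suspects):

    # Dockerfile: install the C/C++/Fortran toolchain and BLAS headers so
    # annoy, numpy and scipy can build from source if no wheel matches.
    FROM python:3-slim
    RUN apt-get update && apt-get install -y --no-install-recommends \
            build-essential gfortran libopenblas-dev \
        && rm -rf /var/lib/apt/lists/*
    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt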


This is on amd64 and Docker. Old versions of the packages don't have wheels for Python 3.9, and they don't compile with modern compilers and library versions. The annoy package only has a wheel for macosx_10_14. I was finally able to get the packages to install together by changing the Python version from 3.9 to 3.7 and the pinned scipy version from 0.19.1 to 1.5.4.

The next issue I bumped into is that some of the required data files are not available: https://github.com/divyakoyy/codenames/issues/1


I wonder if it's feasible to identify the optimal (or at least a near-optimal) strategy by treating it as a game of pure logic, where the players agree beforehand which attributes from a limited vocabulary apply to which words.


The combinatorial space of 25-choose-9 is only 2,042,975. With a vocabulary of that size, you could guarantee winning in one turn.

Doing it in two turns of 25-choose-4 and 21-choose-5 is easy: you need vocabulary spaces of only about 12k and 20k for the two steps. However, the second team could win in one turn in a smaller combinatorial space after the first team makes some guesses; 21-choose-8 is only about 203k.

How to encode the combinatorial space onto the game grid is an open question. The rules do disallow using the positions of the words or the letters within them, but it works if you have any legal method of consistently ordering the 25 words on the board.
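As a sanity check on the numbers, here is one standard way to encode a subset as a single rank, the combinatorial number system (a toy sketch; the positions stand in for whatever legal ordering of the 25 words the teams agree on):

    from math import comb

    print(comb(25, 9))               # 2042975
    print(comb(25, 4), comb(21, 5))  # 12650 20349
    print(comb(21, 8))               # 203490

    def rank(subset):
        # Combinatorial number system: maps a k-subset of range(25)
        # to a unique integer in [0, C(25, k)).
        return sum(comb(pos, i + 1) for i, pos in enumerate(sorted(subset)))

    print(rank(range(9)))  # 0, the first 9-subset in this ordering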

(Of course, I'd never advocate actually playing the game like this; it's just mathematical speculation.)


Fascinating. However, with English estimated to have on the order of 1e6 words, probably only the two-turn play could be enacted, which would let the second team win with certainty in that kind of arms race.


To mitigate the second team winning because of the reduced space after guessing, the first team could plan to guess nothing on the first round and then use an 'infinity' hint on the second round.


The game is sufficiently solvable as a 5-by-5 map without any concern for the words in each cell. Generally, it's more interesting to study the actual gameplay around word association than the optimal winning strategy. There is some room to study whether the associations between words could further optimize the naive approach. Another enjoyable side strategy is optimizing against the opponent's strategy.


Seems similar to my own take on word-embedding-related games:

https://github.com/Hellisotherpeople/Language-games




