The second-generation Worms games (2, Armageddon, WWP) use the same approach (here called "Deterministic action propagation method"). It was implemented by Karl Morton, and we also use it for replay files (which is why the game logic is meticulously versioned).
The solution we used for managing secret state, when the state is from a random source, is to delay generating the secret information for as long as possible (so, e.g., if a card is drawn, the drawn card is decided at that moment, as opposed to pre-shuffling the deck at the start of the game). We experimented with some designs in which the secret is revealed only to the player who owns it, and the rest only see a verifiable hash of the secret until the player performs an action which reveals the secret; however, it doesn't solve the general problem that any time the game must immediately (without network latency) make a random choice, the design must compromise between either allowing a hacked client to predict the result (when the RNG output is stable) or manipulate it (when the RNG output changes very often, e.g. every frame).
Sadly, the newer games went for a simpler synchronization algorithm, in which if a discrepancy is detected, the entire state is simply copied over from one player's game to another. This does allow blatant cheating; I'm guessing it was a concession due to the new engine lacking in robustness or portability.
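The delayed-generation idea described above can be sketched in a few lines (names hypothetical; this is just an illustration, not the actual W:A implementation): instead of committing to a fully shuffled deck at game start, keep the set of undrawn cards and sample from it only at the moment of the draw.

```python
import random

class LazyDeck:
    """Deck whose cards are decided only at draw time."""

    def __init__(self, cards, rng=None):
        self.remaining = list(cards)
        self.rng = rng or random.Random()

    def draw(self):
        # The drawn card is decided *now*; nothing earlier in the game
        # fixed it, so there's no pre-shuffled future for a hacked
        # client to peek at.
        i = self.rng.randrange(len(self.remaining))
        return self.remaining.pop(i)

deck = LazyDeck(range(52), random.Random(0))
card = deck.draw()
```

Note this only narrows the window for prediction; as the comment says, it doesn't remove the predict-vs-manipulate trade-off for choices that must resolve instantly.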
This is a really interesting approach for managing secret state. Another life ago I implemented a rule-based version of everyone's favorite collectible card game, and it was P2P (so each client ran the full rules engine and synchronized the initial random seed and each player's actions). I could never figure out a good way of limiting cheating by peeking at either your own deck or the opponent's. I looked into commutative encryption and mental poker, but at that point my interest in the project fizzled and I stopped working on it. This approach would work for keeping randomness in the game, and homomorphic encryption could be used to hide the contents of the opponent's deck.
There's a third way of thinking, best exemplified by Agar.io and the like, which basically asserts that any network latency is the player's problem. There's a deterministic state on the server, and the UI doesn't bother to try to sugarcoat it to make it feel any more instantaneous than it is. IMHO this is the only way most games will be written down the line. I'm an old casino coder, so the "tricks" were things like: slow the card deal animation; keep the roulette wheel animation going longer; etc. But where money's concerned you can't take shortcuts, and in the end a player with a faster connection has a better experience, and (in casino land) those are the players with money anyway. I'm not sure I understand why there'd ever need to be an instant result (sans network latency) in a turn-based game. If the player has a visual indication that the server is waiting for their command, then any jumpy movement on the part of other players is strictly down to their own bad connection as long as the server is running well. If players understand that the alternative would be the ability to cheat, they'll generally accept that and upgrade their network connection.
> I'm not sure I understand why there'd ever need to be an instant result (sans network latency) in a turn-based game. If the player has a visual indication that the server is waiting for their command, then any jumpy movement on the part of other players is strictly down to their own bad connection as long as the server is running well.
Although the Worms games are turn-based, gameplay typically involves performing a lot of actions within one turn. Walking, jumping, using utilities (which don't end your turn), and finally firing a weapon. Many of these actions can trigger an event with a random outcome, which should ideally be unpredictable and un-manipulatable, such as which weapon will be found in a weapon crate, or how long the fuse of a mine with a random fuse will be.
In a peer-to-peer game, the server is just another player, and should not be considered intrinsically more trustworthy. Ignoring latency, it's possible to implement a scheme for securely generating random data, in which everyone produces a verifiable hash (commitment) of a random number, and after everyone has announced their hash, everyone reveals their random number; the numbers are then XORed together to produce the outcome (see e.g. Keybase's "Cryptographic coin flipping"). The problem with this approach is that the latency to obtain the result becomes the roundtrip to the player with the slowest connection and back.
In a case like Worms where there are concurrent actions and random reveal choices inside each turn... If some deterministic state and random number is created for each player at the start of the turn, then their inputs can be reproducibly mapped to the results of all the random choices. In that case you can just upload the inputs and let the server work out what happened based on the hash the player was given (i.e., let the server replay each player's inputs against that hash and figure out the result).
But the server is not another player. It's not a player at all. It's the arbiter of truth. So as you said:
>> The problem with this approach is that the latency to obtain the result becomes the roundtrip to the player with the slowest connection and back.
This is what I'm saying above. At some point games will just accept this and allow the player with too slow of a connection to be disadvantaged. To the extent that a server should tolerate slow connections, it is a handicap given by the game to players with slow connections.
> In a case like Worms where there are concurrent actions and random reveal choices inside each turn... If some deterministic state and random number is created for each player at the start of the turn, then their inputs can be reproducibly mapped to the results of all the random choices. In that case you can just upload the inputs and let the server work out what happened based on the hash the player was given (i.e., let the server replay each player's inputs against that hash and figure out the result).
No, this is the worst of both cases: it allows predicting all outcomes at the start of the turn AND manipulating them. So, in a game like Worms, you can choose whether to pick up that crate now, or wait a turn or two until picking it up will result in a better outcome.
> But the server is not another player. It's not a player at all. It's the arbiter of truth.
No, that's not how peer-to-peer games work. In the current implementation, it is the "arbiter of truth" because it's the "server" in the TCP/IP sense, but it is still run and controlled by a player, and should not have any intrinsic advantage.
> At some point games will just accept this and allow the player with too slow of a connection to be disadvantaged.
Maybe games with a centralized networking model. We deliberately avoided doing anything in this direction for Worms Armageddon because it puts a death clock on the game and the community - see all games which are no longer playable because the developer/publisher decided that keeping the servers online was no longer profitable.
> To the extent that a server should tolerate slow connections, it is a handicap given by the game to players with slow connections.
In the hypothetical secure scheme I described above, ALL players are affected, not just the slowest one.
> [the server] is still run and controlled by a player
Just to drive this point home for anyone who hasn't played Worms: the developer (Team17) doesn't operate game servers. One of the players hosts a game on their PC, and other players connect either through the WormNET listing service, or by punching in their friend's IP address and connecting directly to their PC. Players can even operate their own WormNET-compatible listing servers if they wish.
This means there's no trustworthy source of truth to settle disputes or cheating, since whichever player's PC acted as the arbiter could simply dictate in-game reality to their opponent(s).
So players' PCs need to be able to validate that competitors' randomized events are legal, but without generating the randomness early enough that a player could use a hacked game client to peek into the randomness future before taking actions.
I wonder if speculative calculation on the clients could be used.
For example let's say we're at frame 999 and have 2 players and 10 deterministic NPCs. And each player has 4 possible moves 2 of which have results depending on a random number being <= X or > X.
So each client calculates the next possible frame states knowing its user input (so it only has max 2 options depending on the random generator) * possible inputs from the other player (2 + 2*2 = 6 options), so there's 12 possible game states for frame 1000 from this client POV.
NPCs are deterministic so they can react to the player moves in each branch.
The other client does the same but has different variables missing and different state tree (with small overlap that only depends on random generator).
Then both clients send their player inputs to the server and wait for the missing variables to choose which branch of the state tree they both take. Depending on the game, we may not wait: just show the most likely branch, and if we mispredicted, switch to the right branch after we get the missing data.
This state tree could be several levels deep if there's enough computing power, and it could also be used by AI to choose best moves assuming what player can do (but then it gets complicated).
Advantages:
- server needs much less computing power than clients
- server can still verify the state
- clients don't have information they shouldn't have
- network latency is mostly hidden
Disadvantages:
- more computing power needed on the clients
- can get impractical quickly with more possible moves on each turn
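A toy sketch of the speculative idea (the transition function is a stand-in; a real game would run its full deterministic simulation, NPCs included): precompute every reachable next-frame state, then select a branch once the missing inputs arrive.

```python
from itertools import product

def step(state, my_move, their_move, coin):
    # Stand-in for one frame of deterministic game simulation.
    return (state * 31 + hash((my_move, their_move, coin))) % (10 ** 9)

def speculate(state, my_move, their_moves, coins=(0, 1)):
    # Precompute every next-frame state reachable from our chosen move,
    # across the opponent's possible moves and each random outcome.
    return {(t, c): step(state, my_move, t, c)
            for t, c in product(their_moves, coins)}

branches = speculate(42, "jump", ["left", "right", "shoot"])
# When the opponent's input and the random bit arrive over the network,
# the matching precomputed branch is selected rather than simulated,
# hiding the latency.
next_state = branches[("left", 1)]
```

The combinatorial blow-up mentioned in the disadvantages is visible here: branches scale as (opponent moves) x (random outcomes) per frame of lookahead.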
You don't need to simulate multiple futures; just keep going with the most likely one ("continue doing what you're already doing" is a common heuristic) and correct when you get the missing info. This is the basis of deterministic rollback networking, popular in fighting games.
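The rollback pattern reduces to: snapshot each frame, predict remote inputs, and when the real input arrives late, rewind to that frame and resimulate. A minimal sketch (the `simulate` function is a placeholder for the game's deterministic step):

```python
def simulate(state, inputs):
    # Stand-in for one deterministic frame of game logic.
    return state + sum(inputs)

class Rollback:
    """Snapshot per frame; on a late remote input, rewind and resimulate."""

    def __init__(self, initial):
        self.snapshots = [initial]  # snapshots[f] = state before frame f
        self.inputs = []            # inputs[f] = inputs applied at frame f

    def advance(self, frame_inputs):
        self.inputs.append(list(frame_inputs))
        self.snapshots.append(simulate(self.snapshots[-1], frame_inputs))

    def correct(self, frame, frame_inputs):
        # The real remote input for an old frame arrives: replace our
        # prediction and replay every frame from there forward.
        self.inputs[frame] = list(frame_inputs)
        self.snapshots = self.snapshots[:frame + 1]
        for inp in self.inputs[frame:]:
            self.snapshots.append(simulate(self.snapshots[-1], inp))

rb = Rollback(0)
rb.advance([1, 0])     # frame 0: remote input predicted as 0
rb.advance([2, 0])     # frame 1: remote input predicted as 0
rb.correct(0, [1, 5])  # real remote input for frame 0 arrives late
```

Determinism is what makes this correct: replaying the same inputs from the same snapshot must always yield the same state.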
I have poked and prodded at making my own turn-based game, and one thing I have learned is validation. Don't trust input, and you will make robust serverside code. Assume every bit sent by the client was maliciously sent in order to bring down your service in flames. Keep this mentality, and security becomes that much easier.
AJAX long polling is a good way to send data while avoiding spurious requests and/or delays, albeit with a bit more load on the server.
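The essence of long polling, stripped of HTTP: the server holds the request open until an event exists or a timeout expires, so the client gets data the moment it's available without hammering the server. An in-process sketch using a queue to stand in for the held connection:

```python
import queue
import threading

events = queue.Queue()

def long_poll(timeout=5.0):
    # Mimics a long-polling handler: hold the "request" open until an
    # event is available or the timeout expires, then answer at once.
    try:
        return events.get(timeout=timeout)
    except queue.Empty:
        return None  # empty response; the client simply polls again

# Simulate the server producing an event shortly after the poll begins.
threading.Timer(0.1, events.put, args=({"type": "your_turn"},)).start()
event = long_poll()
```

The "more load on the server" caveat is the held connection itself: each waiting client ties up a request until its event or timeout.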
Seconded. The way I look at this is to always build the allowable action space from nothing, and default to nothing. Most of the time it involves a state machine where each node comes with a white list of currently allowable verbs a client can send.
The alternative is an infinite action soup, which opens opportunities for inspiration by lateral thinking. I'm doing it this way for a single player Lovecraftian text editor I'm working on. It's a terrible idea for multiplayer things.
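The whitelist-per-node idea might look like this (phases and verbs are invented for illustration): default deny, with the allowable action space built up explicitly per state.

```python
# Whitelist of verbs the client may send in each state-machine node;
# anything not listed is rejected outright.
ALLOWED_VERBS = {
    "lobby":   {"join", "ready", "leave"},
    "playing": {"move", "pass", "resign"},
    "ended":   {"rematch", "leave"},
}

def handle(phase, verb):
    # Build the action space from nothing: default deny, allow by list.
    if verb not in ALLOWED_VERBS.get(phase, set()):
        raise ValueError(f"verb {verb!r} not allowed in phase {phase!r}")
    return ("ok", verb)
```

An unknown phase yields an empty set, so a client can never act from a state the server doesn't recognize.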
Thank you whtrbt for linking my Steam page. It's pretty out of date, I regret to say. I'm doing it from the woods and have to go a town over to the library to upload HD video.
For some reason it's easier to upload to Twitter via tether, so that's where I've got the latest.
https://twitter.com/LeapJosh
For turn based games like board games, I make clients that don't have any concept of the game rules at all. The server generates a complete list of all available actions to the player and sends an array of all actions along with the game state. When you take an action the only payload the client sends is the index into this array. In this way the client has no way to express a cheat whatsoever.
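A sketch of that index-only protocol (a toy card game; the action shapes are invented): the server enumerates legal actions, the client's entire payload is an index into that list, so there is no vocabulary in which to express a cheat.

```python
def legal_actions(state):
    # Server-side enumeration of everything the current player may do.
    actions = [{"type": "pass"}]
    actions += [{"type": "play", "card": c} for c in state["hand"]]
    return actions

def apply_index(state, index):
    # The client only ever sends an index into the list it was given.
    actions = legal_actions(state)
    if not 0 <= index < len(actions):
        raise ValueError("invalid action index")
    action = actions[index]
    if action["type"] == "play":
        state["hand"].remove(action["card"])
        state["table"].append(action["card"])
    return action

state = {"hand": [7, 9], "table": []}
chosen = apply_index(state, 1)  # index 1 = play the first card in hand
```

One subtlety: the server must re-enumerate against its own current state when the index arrives, so a stale client can at worst pick a legal-but-unintended action, never an illegal one.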
There can (and must) be compromises between the cost of enumerating moves in messages from the server and the cost of validating client commands; command validation should be straightforward enough to be obviously correct and complete (leaving no room for cheating) and cheap enough to run on every command, but it need not be trivially simple.
For example, if the client in a wargame says it would like to move a unit (by name) from place A to place B on the battlefield the possible moves are implicit (units and locations are known) and validation is nontrivial but easy (the unit is at A and it can move, it can reach B considering fog of war, etc.). Far cheaper than sending 50 different moves for each of 50 different units and validating all of them preemptively every turn.
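The validation alternative for that wargame example might look like this (state layout invented; the reachability test is a stand-in for real pathing and fog-of-war rules):

```python
def validate_move(state, unit_name, src, dst):
    # Validate one concrete client command instead of enumerating
    # every possible move up front.
    unit = state["units"].get(unit_name)
    if unit is None or unit["pos"] != src:
        return False  # unit missing, or not where the client claims
    if not unit["can_move"]:
        return False
    # Stand-in reachability check: Manhattan distance within movement
    # range. A real game would apply terrain, fog of war, etc. here.
    return abs(dst[0] - src[0]) + abs(dst[1] - src[1]) <= unit["range"]

state = {"units": {"tank": {"pos": (0, 0), "can_move": True, "range": 3}}}
```

The work done is proportional to commands actually issued, not to the size of the move space.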
Indeed it isn't a general solution, it's just really good for almost all board games.
Your example is one that it's not the best for and you could use a different approach or amend the approach to take a payload for some actions and have some client smarts for those actions.
Although even in your example of 50 units with 50 moves each, having 2500 moves is certainly not a disaster in a game where the server only does work in response to a player's move and players don't take multiple moves per second and more likely take seconds or even minutes between moves.
You don't need to be so literal. An action like "game action XYZ" with a client smart enough to know how to show a text field and enable submission is sufficient.
The point is just every bit of logic about when an action is enabled must be implemented on the backend for verification, so spelling that out to the client reduces duplication.
In that case you would probably add a payload which you would then have to check, you're correct. It just turns out you can implement most board games without that.
I do still like this system for some games with actions that aren't convenient to enumerate; it's safe and simple by default and you add validation only in exactly the places where you need a payload.
English words are enumerable. There’s just a few million of them. The dictionary is large, but it’s not like it changes frequently, so you can treat it like a static asset that gets downloaded once and only updated periodically out of band.
The specific example may be faulty but the general point absolutely is not. I've built turn based games where enumerating the space of actions grew at a x b x c and would have easily measured in the megabytes.
It's just not a practical general solution, but it sure works fine for something like Monopoly.
About deterministic propagation model: "It relies on the assumption that processing the actions of a player is deterministic."
I am curious in what scenario of online gaming does this assumption fail.
I remember that about a decade ago there was a mobile game called Chaos and Order (like Dota/League of Legends, but played on Apple devices). In one game, I heavily outplayed my opponent. After winning the match we added each other as friends and realized that we had actually experienced two totally different games - he lost it in my game but won it on his side, and my in-game actions made no sense in his. We called it a "ghost".
It is just funny to see how crappy the outcome can be when deterministic propagation fails. Also, in contexts such as systems theory where errors inevitably happen, the design should ensure that the extent of a failure stays bounded.
This is called a "desync" and is a bug in a particular implementation of multiplayer networking. In this model instead of the server sending the state to each player, it sends their inputs, and allows the local device to completely determine the state of the game. If there are any differences at all in how the game's calculations are made, the game will have different states in different devices. Even different models of the iPhone may have tiny differences in how they calculate the square root for example, and over time these minor differences cascade into totally different games.
A common scenario where this approach fails: applying physics in realtime cross platform games. The floating point math involved is pretty much impossible to guarantee same results for. Deterministic physics engines do exist but they generally only guarantee determinism on the same OS/platform.
Attempted deterministic engine, no desync detection or reconciliation, and support for the whole PC ecosystem - I couldn't believe it when I realized what they were doing... pure madness (not in the good way).
I suspect it actually works fairly well for players on console, who aren't playing cross-platform (since they're working with basically the same hardware everywhere).
Yeah, I kept waiting to eventually hear that this was caused by some other random issue, and not them expecting deterministic behavior for floating point ops, but they actually confirmed it a few days ago in their blog: https://www.halowaypoint.com/news/closer-look-halo-infinite-...
The relevant section is:
"The first cause is rooted in the fact that the simulation needs to be deterministic. This means that when the client performs an action, the result of that action is the same when the server performs it."
Let's say you use JDK 17 (which re-specified float and double operations to be strict everywhere) and always use `java.lang.StrictMath` instead of `Math`, wouldn't you get determinism for all the floating point operations "for free"?
The very worst kind of determinism errors are butterfly effect like in that you don’t notice the desync until long after the cause has occurred. It’s pretty imperative if you’re building a deterministic system to add a layer of error checking right from the start so you can ensure each party ends up in the same state and identify desync very early. On the other hand you also want to identify where desync doesn’t matter (for example most FX) so you limit the scope of what has to be deterministic to keep things simple.
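The error-checking layer usually amounts to a periodic state checksum: serialize the authoritative state canonically, hash it, and exchange the digests every N frames so a desync is caught near its cause rather than after the butterfly effect. A sketch (the state shape is invented):

```python
import hashlib
import json

def state_checksum(state):
    # Canonical serialization -> short digest. sort_keys ensures two
    # logically equal states hash identically regardless of dict order.
    blob = json.dumps(state, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()[:16]

a = {"worms": [{"hp": 100, "x": 3}], "wind": -2}
b = {"wind": -2, "worms": [{"x": 3, "hp": 100}]}  # same state, other order
c = {"worms": [{"hp": 99, "x": 3}], "wind": -2}   # diverged by one hit point
```

Per the comment's scoping point, only what's inside this checksummed state needs to be deterministic; purely cosmetic FX can stay outside it.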
Imagine the same game being played on different platforms (different CPUs...) built with different compilers, etc. - it's possible that the physics does not behave exactly the same everywhere.
In my experience, most of the pain boils down to synchronization of state between participants. This is also your principal vector for cheating.
If you are willing and able to make a very extreme sacrifice regarding client-server architecture, it is feasible to consolidate all state to 1 synchronous domain while serving an arbitrary number of participants.
The sacrifice is basically to build your game as a native take on the Google Stadia model. I.e. inputs are raw player events, output is a low latency A/V stream back to player.
Keep in mind you don't have to always serve an x264 stream to all clients. Depending on your objectives and type of game, you can serve an intermediate view model that instructs the client on how to draw its view using local assets and resources (textures, models, gpus, etc). This would be more like how Blazor server-side model operates.
What you said is a kind of complicated way of saying what I do.
Player client sends simple commands to the server. Server validates the command and updates game state.
The server then simply listens for other clients to request the state of the world, at which point it sends a complete blob of data that includes everything that client can see!
The player can then interpret that data and send new commands as required.
My games are basically turn based so I don't need to poll the server constantly, just at the start of your turn and after every action.
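The command-in, visible-state-out loop described above can be sketched like this (a toy two-player card game; names and state layout are invented):

```python
def visible_state(state, player):
    # Send a complete blob of everything *this* player may see -
    # their own hand, but not the opponent's.
    return {"board": state["board"],
            "hand": state["hands"][player],
            "turn": state["turn"]}

def handle_command(state, player, cmd):
    # Validate first; invalid or out-of-turn commands change nothing.
    if state["turn"] != player:
        raise ValueError("not your turn")
    if cmd["type"] == "play" and cmd["card"] in state["hands"][player]:
        state["hands"][player].remove(cmd["card"])
        state["board"].append(cmd["card"])
        state["turn"] = "p2" if player == "p1" else "p1"
    else:
        raise ValueError("invalid command")

state = {"board": [], "hands": {"p1": [3, 5], "p2": [4]}, "turn": "p1"}
handle_command(state, "p1", {"type": "play", "card": 5})
```

Because secrets are filtered server-side in `visible_state`, a hacked client can't see the opponent's hand no matter what it requests.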
There's a problem with this approach. If the client thinks it's in state X, and sends a command that's only valid in state X, but it turns out that the client was lagging and was actually in state Y, the server has to reconcile this discrepancy somehow.
For example, in an FPS you turn a corner, see an enemy and shoot them. But unbeknownst to the client a grenade exploded just beyond the corner killing the player before the shot got off.
Now we have to deal with the mess of the client thinking you just shot somebody while you were actually already dead. While it's true that this is "simply" a client-side issue, you'll hear endless complaints from players who, from their perspective, turned a corner and shot the enemy and then died to a grenade, but for some reason their shot didn't connect and to them it seems unfair.
Another scenario: An enemy client is approaching a corner. The server sends you their position because it estimates they'll continue moving beyond the corner. You see the enemy and shoot them. But then the enemy client's lagging input comes in and they actually stopped short of turning the corner! How do we reconcile the fact that you've seen and shot an enemy you shouldn't have been able to?
Sure, but these concerns do not apply so much to turn based games. In a turn based design all clients move in lockstep by virtue of the core game mechanics. Clients that send invalid commands will have them ignored by the server, and the client view will be updated to the current valid state when the client catches up.
Speak of the devil. I was just talking with someone about Neptune's Pride a few weeks ago and looked it up to see if it was still going. I really enjoyed that when it came out.
Have you ever written about the architecture of that and Blight anywhere?
I haven't because I really have no idea if any of it is any good. I just sort of muddled through it all.
I've been slogging through a rewrite of the server from Python to JavaScript, because the Python code base is all out of support now and I expect Google to pull the plug on the 10-year-old App Engine tools that it's built on.
I also built a similar engine to boardgames.io using the Delta-update approach, which you can try at acos.games.
Deterministic lockstep is like trying to build a house with wet paper. You'll need to stack and compress millions of layers to make it solid.
Delta-update, while not the best for bandwidth, is the cheapest for moving forward. I don't want to waste time fixing desync issues when I could be creating more games and more features.
I'm still writing about my experience and showing exactly how I built my Delta-update networking model, but it's working really well so far.
Nice article.
I built my own online version of a turn based board game because none exists at the moment. I just want to play with my family so I didn't bother coding a server at all, instead all clients communicate their actions to all others so I naturally implemented the deterministic approach. I haven't thought about secret state so it's interesting to see a possible solution for that. Fun!
I'm also writing an online version of a board game I like, to play with family and friends. I am about to start the networking part, and I actually intended to implement the deterministic approach, although I didn't know the concept had a name before reading the article. It's nice to read some validation that it is a good design.
I'm curious to know how you manage not to need a server at all. How do you create the initial connection between clients without a server to negotiate NATs and things like that?
If you target desktop, should be fine, but mobile users tend to use cell towers which are on Symmetric NAT, which would require a TURN server on top of the WebRTC.
Even on desktop you run into people with weird configs. For example my router will not allow mDNS to work but my ISP NAT is pretty good so I can connect two machines as long as they’re not on the same network! In the end you almost always will need a relay to guarantee connectivity for any random pairing of participants in a large enough population.
The good news is that WebRTC is perfectly straightforward if all you want to do is to connect to a server on the open internet.
It is web-based, yes. I was aware of WebRTC but assumed I would at least need to host the server myself. I'm looking at the links you provided and they seem very interesting. Thank you.
Initially I thought I would be able to do it with webRTC but it turns out you still need some kind of specific server to connect clients.
Instead I cheated a little bit and used a messaging service https://ably.com/.
It was easy to setup and use, and their free plan is generous.
Another thing I didn't manage was to find an RNG I could configure with a seed, so amongst my clients I have a special one, the "host", who is drawing the cards, etc.
Overall it's been working really well. The one thing that is particularly difficult without a server is debugging. Especially if using a mobile phone, you're in the dark when a bug happens. And with Ably's free tier, you have to be connected in debug mode beforehand to see the logs, which is very impractical.
Good luck with your game!
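On the seedable-RNG point above: many standard libraries do ship one (JavaScript's Math.random notably doesn't, but small seedable PRNGs like mulberry32 are easy to drop in). The shared-seed idea, sketched in Python: every client constructs its generator from the same seed and independently derives the identical deal, so no single "host" client has to draw on everyone's behalf.

```python
import random

# Two clients construct independent generators from the same shared seed...
rng_a = random.Random("match-1234")
rng_b = random.Random("match-1234")

# ...and therefore derive the identical 5-card deal from a 52-card deck,
# with no privileged host doing the drawing.
deal_a = rng_a.sample(range(52), 5)
deal_b = rng_b.sample(range(52), 5)
```

The caveat from earlier in the thread still applies: a shared seed makes every future draw predictable to a hacked client, so it trades cheat-resistance for simplicity.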
Not sure how far along you are, but I could help you build the game if you are willing to try making it for acos.games. I built a simulator that can help speed up development which is done in React and NodeJS.
Thank you for the suggestion. However I am close to finishing all the game mechanics, just on a single computer (the game has no secret knowledge, so it's playable on a single computer or, at worst, via screen sharing). Also I'm implementing it in Scala.js with Indigo [1], and part of the reason for the project is to have an excuse to learn Indigo, so I wouldn't change the technical stack.
One nice thing about turn-based games is that the logic for the game usually runs so fast that you can run it 'in-between' user events, so using a coroutine approach makes it pretty easy to write the game engine in a direct style with the rules explicitly coded in a straightforward fashion:
    for turn in game:
        for player in players:
            action = await player.get_action()
            engine.dispatch(action)
The nice thing is this approach supports remote proxy players - when you get an action from the local player you can send it out to all connected clients.
Very insightful thread and epitome of HN bringing out people with deep expertise on a subject.
Multiplayer video game networking is a standalone discipline in its own right. It feels like most Computer Science departments are producing candidates to work for BigTech cos focused primarily on large-scale data access/processing. But video games get so specialized that they take on their own form. To be more concrete, A.I./M.L. for finance is going to be different from video game A.I. Occasionally something like A* can be applied to a "real world" problem outside video games, but almost every video game programmer knows A*.
I'm curious, did you acquire your video game experience from working at Big Budget game, indie game, video game development school, self taught from special interest forums (if so, is there an equivalent of HN or Reddit), or a traditional CS education?
Because you want more than just the messages and the handshakes - you want to maintain consistency and if possible even recover from desync and other network weirdness - without allowing cheating.
Aren’t all of those exactly the features of a CRDT?
Regarding cheating and bad network, you still need all the same client side stuff and to define the conflict resolution policies into the CRDT, but the rest of the needed features (e.g. consistency, rewind, replays, etc) are all part of CRDTs.
Most games don't require merging of divergent state like in a CRDT: there's one, single, linear history of events mediated by the server. CRDTs are overkill for this use case, and something like optimistic concurrency control by the server is usually much easier to reason about and implement.
A similar problem exists in modern video codecs. You want to update the decoder with diffs based on some shared state. The problem is you don't really have "actions" you can do. You can say "start with this key frame" and then "this macroblock moved to here" but the fine detail of the change is likely unique and you need to send that diff explicitly.
Has anyone had success using off-the-shelf solutions that are popping up from AWS and Azure? Are there other good OTS options out there from other gaming-focused vendors for multiplayer servers and network layers?