The Byzantine Generals Problem (1982) [pdf]

arberavdullahu · on Feb 13, 2020

Taken from [0]: There is a problem in distributed computing that is sometimes called the Chinese Generals Problem, in which two generals have to come to a common agreement on whether to attack or retreat, but can communicate only by sending messengers who might never arrive. I stole the idea of the generals and posed the problem in terms of a group of generals, some of whom may be traitors, who have to reach a common decision. I wanted to assign the generals a nationality that would not offend any readers. At the time, Albania was a completely closed society, and I felt it unlikely that there would be any Albanians around to object, so the original title of this paper was The Albanian Generals Problem. Jack Goldberg was smart enough to realize that there were Albanians in the world outside Albania, and Albania might not always be a black hole, so he suggested that I find another name. The obviously more appropriate Byzantine generals then occurred to me.

[0] https://www.microsoft.com/en-us/research/publication/byzanti...

mikorym · on Feb 13, 2020

The original is a well known problem in logic which my lecturer touched on. If you have each general on a mountain and a valley between the two, then you have no way to know whether the other received the message. If you send back a confirmation, then you have no confirmation confirmation... etc.

In my (at the time occasional) drunken state I may have thought every now and a again, after 13 confirmations of confirmations surely you can just attack! But alas, it all falls flat: You have no way of knowing the final confirmation, the one before that, etc. until you are back where you started.

By the way, this only applies to lossy communication without feedback. A Whatsapp text syncronises via a server. As long as the server is up and confirms both sides, you can be quite sure the other side received the message. This is manifested these days in the form of the infamous blue ticks.

peheje · on Feb 13, 2020

How does a server fix the problem?

The server still doesn't know whether the sender knows that the server received and acknowledged your message, so you might want an ack from server, but again, the server doesn't know if you received the ack, so you ack the ack, but again ...

Supermancho · on Feb 13, 2020

Using a server means the problem isn't event the same problem.

The General's Problem (as presented in the comments) is trivially solvable by multiple messengers WHO COME BACK to report the message delivered. Wherein, the messenger is an encrypted messenger and the general has the key (knowing which messengers were sent out).

In an ephemeral message scenario, you have to rely on the same idea, using an encrypted log of messages that both generals have access to (since they are both "on the same team", there is already coordination).

This philosophical exercise isn't practical, from the setup provided. I don't see the point in trying to solve for it without adding a ton of additional constraints, by which you can reach an optimal or no-solution.

neffy · on Feb 13, 2020

Fundamentally this is a Fisher consensus issue. To solve any Fisher consensus issue, all you need to do is create a synchronisation point - which is what the server becomes.

zaarn · on Feb 13, 2020

Basically, the server assumes that if the client has ACK'd the message, it can continue sending data. The client assumes that the server has received their ACK if it continues sending data.

If no data is being sent, the connection is closed via timeout eventually.

peheje · on Feb 13, 2020

I am talking about the Two General's problem which I thought original commentator was referring to as the original problem.

In this situation you are describing a server that wants to send a client a message, and both parties need to be sure that they are in agreement when to attack a common enemy they can't deal with alone because the message contains the attack plans and timestamp for attacks.

Wikipedia has a good explanation:

"The first general may start by sending a message "Attack at 0900 on August 4." However, once dispatched, the first general has no idea whether or not the messenger got through. This uncertainty may lead the first general to hesitate to attack due to the risk of being the sole attacker.

To be sure, the second general may send a confirmation back to the first: "I received your message and will attack at 0900 on August 4." However, the messenger carrying the confirmation could face capture and the second general may hesitate, knowing that the first might hold back without the confirmation.

Further confirmations may seem like a solution—let the first general send a second confirmation: "I received your confirmation of the planned attack at 0900 on August 4." However, this new messenger from the first general is liable to be captured, too. Thus it quickly becomes evident that no matter how many rounds of confirmation are made, there is no way to guarantee the second requirement that each general be sure the other has agreed to the attack plan. Both generals will always be left wondering whether their last messenger got through. "

There is no solution, only a decreasingly smaller change of disagreement e.g. by sending many messages or keep acking each other x amount of rounds (there are other strategies).

No timeouts, only death.

zaarn · on Feb 13, 2020

The obvious solution is to bring a ham radio.

belinder · on Feb 13, 2020

What if they send a message that general 1 is going to fire a flare at 0900 and if general 2 fires one too then they both attack. Basically make the confirmation something that you can get without relying on the same method of communication

pfarrell · on Feb 13, 2020

A single line of communication is a constraint of the problem. And anyway, how does general 2 confirm the answer flare was seen?

pstuart · on Feb 13, 2020

With a reciprocal flare :-)

celticninja · on Feb 13, 2020

You are just back to the same problem, you need to trust a server. The server is the valley between the generals, you can't be sure what happens there. Do they get the message, or do they het an altered message, both of which will send blue ticks to indicate it has been read.

zzbzq · on Feb 13, 2020

The server is also a general on a mountain

Retric · on Feb 13, 2020

Using a server just moves the problem to a different channel it does not remove it.

You can design a practical protocol between two parties that actually works based on a large number of messages. Let’s say the generals can exchange 100 messages each of which has an independent 80% chance of message failure. That sounds really bad.

However, if the protocol is I will attack after receiving 20 messages or I will flip a coin once for each message received and attack on the specific date if even a single head shows up.

Now in your example if they receive 13 conformations they can assume there is at minimum a 99.98% chance the other party is going to attack.

There are several ways to improve the protocol based on a host of factors. But, it really demonstrates the difference between theory and practice.

kovek · on Feb 13, 2020

I don’t see why they can’t agree after a second confirmation? Let’s say A first tells “7am” to B and after that they send eachother ACK. Let’s say this happens:

A -1-> B

A <-2- B

A -3-> B

A <-4- B

If the rule is that the general sends a message back when he receives one, then, these 4 back and forth that I showed are only possible if this HAS happened:

A —1-> B

A <-2- B

A -3-> B

So we know that these 3 back and forth have happened. Also B knows that these 3 back and forth have happened, and A knows that 2 back and forths have happened. B knows B got the message because B got the message. A knows B got the message 1 because A got a confirmation (message 2).

This is difficult for me but I convinced myself here. Please anyone prove me wrong!

jayd16 · on Feb 13, 2020

If A does not receive 4, it cannot know B received 3. If B did not receive 3 it can assume A did not receive 2. If A did not receive 2 then B should assume the attack is off If A thinks B is not attacking then A should not attack either.

umvi · on Feb 13, 2020

But in this scenario A did receive 4, so it can assume B received 3. And in this scenario B received 3.

Presumably A will keep sending 3 over and over using a timeout until it receives 4.

jayd16 · on Feb 13, 2020

I think the major issue is in a design that relies on a final message being lost causing a decision change. It ends up unraveling everything. TCP works because the receiver accepts that the sender might not get an ACK.

dwohnitmok · on Feb 13, 2020

And what happens if B attacks before A receives the 4 (B did receive the 3 after all)?

umvi · on Feb 13, 2020

A attacks too, since, being a thinking human knows that he previously received 2.

I think most humans would attack with a reasonable degree of confidence if they had gotten at least one ack in the past

dwohnitmok · on Feb 14, 2020

It's a bit hard to express this with plain English so maybe it makes sense to just write out the pseudo-code.

A: attack if acks from B > 0

B: attack if acks from A > 0

Let's look at the current

A --> B : Original message (not an ack)

A <-- B : Ack from B (A is now committed to attacking)

A -/> B : Ack from A is lost, B does not attack

Okay let's change it up

A: attack if acks from B > 1

B: attack if acks from A > 0

A --> B : Original message (not an ack)

A <-- B : Ack from B (A is not yet committed to attacking)

A --> B : Ack from A (B is committed to attacking)

A </- B : Ack from B is lost, A does not attack

The essential problem (which I failed to explain correctly) is that while you can always go from a fixed set of interactions and work out a protocol that would've worked for that particular interaction, you can't go the other direction (starting with a protocol and throwing arbitrary interactions at it).

burroisolator · on Feb 13, 2020

Mentally freeze time at when A receives 4 and pretend you are B. When A receives 4, you don't know yet whether A received 4. Hence, you can assume the worst and assume that A didn't receive 4. You know that A has sent 3. However, you can assume the worst and assume that A thinks that A has failed to send you 3. After all, you don't know that A has just received 4. You know that A has received 2 because you received 3 from A. But does A know that you know that A has received 2? As we said earlier, you think pessimistically that A thinks A has failed to send 3 to you (even though you did receive 3!). And so you think that A doesn't know that you know that A has received 2. In other words, you think that A thinks that you think you have failed to send 2. And so on.

I found it helpful to draw it out. Draw a human with the letter B on their T shirt holding a letter labelled 3 and create a cartoon thought bubble on top of their head. Inside that thought bubble write the text "I wonder if A got my letter 4 but assuming they didn't this is how I picture them..." and draw a human with the letter A on their T shirt holding a letter labelled 2 and create a cartoon thought bubble on top of their head.... The final thought bubble should say "I wonder if B got my letter 1 but assuming they didn't I will not attack."

praptak · on Feb 13, 2020

The formal proof just takes the shortest exchange that leads to the correct result and drops the last message. Since the sender doesn't know it failed, the result must be incorrect.

That's the gist of it, there are some technicalities of specifying what "correct" means.

ip26 · on Feb 13, 2020

All well and good, but let's say the ACK never comes. Then what? Did the transmission get lost, or was it the ACK that got lost?

LeifCarrotson · on Feb 13, 2020

This applies to lossy communication with or without feedback and with or without a server. Feedback is just another layer of confirmation messages.

If the server is up and sends a confirmation message to one side, but the other side disconnects before receiving the confirmation message (or before sending the acknowledgement that it received the confirmation message), you have another Byzantine Generals problem.

With a server, it's just a connection of 3 generals with the server in the middle, instead of a peer-to-peer connection.

jfkebwjsbx · on Feb 13, 2020

A server solves nothing. You can never be sure if the other person has seen that you have seen...

jayd16 · on Feb 13, 2020

It doesn't solve it. You can send a message and kill the connection before you get the ack or the read receipt and they'll still get the message.

hprotagonist · on Feb 13, 2020

well, now you trust the server. Who might betray you :)

SilasX · on Feb 13, 2020

Scott Adams created Elbobia as a fictional country, with both Eastern European and middle eastern traits, so he could make fun of a backward country without offending anyone.

Interestingly, it feels like a cross of Estonia and Albania.

gitgud · on Feb 13, 2020

I'm reluctant to broach the subject, but is a "Blockchain" based system a solution to the "The Byzantine Generals Problem"?

> A. All loyal generals decide upon the same plan of action.

All actors arrive at an immutable consensus (the current transaction group in the blockchain).

> B. A small number of traitors cannot cause the loyal generals to adopt a bad plan

It has protection against a small number of bad-actors as they require proof of work/stake which prevents against bad-transactions up until 51% attacks.

I don't have much knowledge in the area though, any thoughts?

d10 · on Feb 13, 2020

The authors of this paper didn't have "blockchain" in mind. Imagine rockets, or airplanes, with redundant systems on board. If, say, 3 of 4 sensors report "lower the landing gear" while 1 of 4 reports "don't lower the landing gear", then a hard requirement of the Byzantine Generals Problem is that "lower the landing gear" wins, assuming a majority of sensors function correctly (loyal generals).

Nakamoto consensus addresses "double-spend". Say I have exactly one coin, and I broadcast a transaction that sends the coin to Alice. All miners on the network consider my transaction a candidate for the next block, and get busy hashing proof-of-work.

Moments later, I compose a transaction that sends my one coin to Bob. And, I start mining a block with only that transaction in it. Now, its a race. If I win the race, the coin goes to Bob. If anyone else wins, the coin goes to Alice. Importantly, the coin will not go to both Alice and Bob.

In Nakamoto consensus, exactly one miner wins each race. And that miner, "loyal" or not, determines whether the coin goes to Alice or to Bob. To satisfy Byzantine Generals, in this particular example, the coin should always go to Alice. But in Nakamoto consensus it might end up going to either Alice or Bob.

Solving double-spend in a network of untrusted nodes is a significant feat. (And, according to the market, extremely valuable.) However, as I think your question is designed to point out, Nakamoto consensus does not solve the Byzantine Generals Problem. No offense to Nakamoto, it solves a different problem.

It's worth mentioning the concept of "byzantine faults" - meaning a node on the network might not be simply buggy, it might be designed by a malicious adversary to do any behavior that could exploit vulnerability in other nodes. So, if you're reading the next revolutionary ICO whitepaper and see, "we use Nakamoto Consensus to solve double-spend, even in the presence of byzantine adversaries", maybe keep reading. But if you read, "we use proof-of-work to solve the Byzantine Generals Problem", maybe save your money for another investment.

exdsq · on Feb 13, 2020

Yes. There are several BFT compliant conensus algorithms for both Proof of Work and Proof of Stake. Some of them, such as Bitcoin, are BFT compliant because it's economically infeasible to complete an attack. Ouroboros is BFT compliant up to n/2-1 bad actors https://eprint.iacr.org/2018/1049.pdf

Edit: This goes back a comment I made the other day about 'Blockchain == Down votes' on HN. Why would you down-vote OPs question?

riffraff · on Feb 13, 2020

I am quite sure Satoshi Nakamoto suggested it as a probabilistic solution, yes, but not a full one[0]

[0] https://ethereum.stackexchange.com/questions/40213/how-is-th...

zzzcpan · on Feb 13, 2020

The problem doesn't have a "full solution", like any fault tolerance in general, only probabilistic one.

Kiro · on Feb 13, 2020

I thought that was the point of this thread since Byzantine is more or less a blockchain term at this point. Bitcoin maximalists that want to sound smart always refer to "Byzantine fault tolerance". Case in point: https://news.ycombinator.com/item?id=22296659

"I was just thinking about how you said you didn't like Bitcoin and imagining ways that it could be used to improve Wikipedia. More generally, Bitcoin can help any system to become more Byzantine fault tolerant."

remcob · on Feb 13, 2020

As it says in the abstract: "With unforgeable written messages, the problem is solvable for any number of generals and possible traitors". See also chapter 4.

As soon as you allow digital signatures, you have unforgeable messages and Byzantine Generals is solved. No need for consensus algorithms, proof of work, or blockchains.

SilasX · on Feb 13, 2020

I don’t think that’s correct. You can still have signed-but-contradictory messages, ie the double spend problem where the signatures are valid but promise the same money to two different people (or promise different attack strategies).

I think that statement would be correct if changed to “unforgeable messages with all same-side generals honest (and un-confused).

remcob · on Feb 13, 2020

In the Byzantine Generals Problem as stated in the paper, there is a single commander that decides on the plan. Paricipants known the identity of the commander in advance. In this case signatures are sufficient.

The problem you are describing is a different one, where there is no single commander and a group of generals needs to collectively decide on a plan. There signatures are indeed not sufficient. The way proof-of-work solves it is by randomly electing one of the generals to become commander for this decision round (i.e. 'it mines the block'). The random election is somewhat imperfect, leading to second order complexities such as probabilistic finality.

SilasX · on Feb 13, 2020

Ah, okay -- I was thrown off by the phrasing in your second paragraph, which was assuming the same things the paper did. To make the context explicit, one could phrase it like, "As soon as you have digital signatures and centralization, then you don't need blockchain." (Edit: which, correspondingly, debunks the numerous proposed uses of blockchain where they can assume centralization.)

remcob · on Feb 13, 2020

Good point. I find that carefully phrasing the problem is half the solution. For example, the entire field of cryptography would not have moved beyond the one-time-pad if it wasn't for a rephrasing from 'impossible' to 'computationally infeasible'. Similarly with probabilistic algorithms.

dooglius · on Feb 13, 2020

Yes, you can read it from the horse's mouth: https://www.metzdowd.com/pipermail/cryptography/2008-Novembe...

celticninja · on Feb 13, 2020

That is literally what bitcoin is. A cryptographic electronic solution to the Byzantine generals problem. Given that almost all all other cryptocurrencies are forked from bitcoin, they too have this capability.

fouc · on Feb 13, 2020

What I'm curious about is all the so-called "blockchain" tech that banks and others are implementing.

If it's not a solution to Byzantine Generals Problem, and not de-centralized/distributed, then it can't be called a blockchain.

So I'm wondering if that's the case, or if they really are solving BGP while still being largely distributed?

hippich · on Feb 14, 2020

Bitcoin is blockchain + POW. Git is blockchain. Chain of blocks of information where each consecutive block refers to the previous one (and one block can have multiple descendants, although this is not applicable to bitcoin) - is blockchain.

fouc · on Feb 18, 2020

Git isn't considered to be a blockchain actually.

ilammy · on Feb 13, 2020

By the way, one of the paper authors – Leslie Lamport – is the Leslie Lamport, creator of LaTeX. He is probably one of the reasons behind beautiful typesetting of this and other papers.

exdsq · on Feb 13, 2020

I always associate Lamport with distriubted systems. I forgot he did LaTeX, I think about TLA+ and Lamport clocks!

empath75 · on Feb 13, 2020

and paxos!

exdsq · on Feb 13, 2020

The mans a genius

namdnay · on Feb 13, 2020

The guy who had a bakery?

dang · on Feb 13, 2020

A thread from 2018: https://news.ycombinator.com/item?id=17702640

2014: https://news.ycombinator.com/item?id=8697029

Others?

(These links are just for curious readers; reposts are ok after a year or so—see https://news.ycombinator.com/newsfaq.html)

mlazos · on Feb 13, 2020

This brings back memories, I worked on a system that had to be fault-tolerant during an internship and this paper gave me a lot of insight into why some design choices (ie the number of processors) were made.

professionalguy · on Feb 14, 2020

Can someone ELI5?

vinniejames · on Feb 13, 2020

Bitcoin fixes this

inviromentalist · on Feb 13, 2020

My most frustrating thing about the crypto movement was how uneducated people were.

People had no idea that POW isn't scalable, yet would insist that a Bitcoin copy-paste would solve the problem. Or that a centralized coin was the future. Or that proof of stake was a working solution.

I still hear people wrong today. I only imagine that if people understood, there would be less altcoin mania.

lukifer · on Feb 13, 2020

I view the PoW problem as being a subset of a broader problem with otherwise-efficient markets: there are often game-theoretic incentives to be intentionally inefficient. An example from nature is a peacock's tail, wasting resources to signal genetic fitness [0]; an example from consumer products is the Veblen Good [1] (aka conspicuous consumption), where wealth is wasted to signal status.

At minimum, I think we need a carbon tax (ideally paired with a revenue-neutral dividend), which might alter the incentives for crypto mining. But I don't think it would be unreasonable for industrialized nations to ban useless PoW operations outright. While I've read cogent explanations of PoS having game-theoretic shortcomings relative to PoW, given the growth of bespoke ASICs and the capital requirements of modern mining operations, PoW seems to me to be indistinguishable from PoS with extra steps, while also being an ecological nightmare.

[0] https://en.wikipedia.org/wiki/Signalling_theory#Sexual_selec...

[1] https://en.wikipedia.org/wiki/Veblen_good