Consensus Protocols: Tendermint and pBFT

July 21, 2018 · 7 min read

The consensus problem in distributed systems is an age-old one. A strong form of the consensus problem can be defined as follows:

Given a set of processes, each with an initial value:
All non-faulty processes eventually decide on a value
All processes that decide do so on the same value
The value that has been decided must have been proposed by some process
These three properties are referred to as termination, agreement and validity. Any algorithm that has these three properties can be said to solve the consensus problem.
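
As a rough illustration, the three properties can be checked against the outcome of a single finished run. This is only a sketch: the function name `check_consensus` and its arguments are made up for this example, and "eventually" is approximated by looking at the final state of the run.

```python
from typing import Dict, Hashable, Optional, Set


def check_consensus(
    proposals: Dict[str, Hashable],
    decisions: Dict[str, Optional[Hashable]],
    faulty: Set[str],
) -> Dict[str, bool]:
    """Check termination, agreement and validity for one finished run.

    proposals: initial value of every process.
    decisions: decided value per process, or None if it never decided.
    faulty:    names of the processes that were faulty in this run.
    """
    decided = {p: v for p, v in decisions.items() if v is not None}

    # Termination: every non-faulty process has decided on some value.
    termination = all(p in decided for p in proposals if p not in faulty)

    # Agreement: all processes that decided chose the same value.
    agreement = len(set(decided.values())) <= 1

    # Validity: every decided value was proposed by some process.
    proposed = set(proposals.values())
    validity = all(v in proposed for v in decided.values())

    return {"termination": termination, "agreement": agreement, "validity": validity}


if __name__ == "__main__":
    proposals = {"p1": "a", "p2": "b", "p3": "a"}
    decisions = {"p1": "a", "p2": "a", "p3": None}  # p3 crashed before deciding
    print(check_consensus(proposals, decisions, faulty={"p3"}))
```

A run in which two correct processes decide on different values fails agreement, and a run in which everyone decides on a value nobody proposed fails validity.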

Faulty processes can exhibit two types of faults:

Crash faults - Faulty processes simply stop and take no further action.
Byzantine faults - We assume nothing about the faulty processes. They can behave arbitrarily: send wrong messages, send correct messages to some processes and false messages to others, lie, or deceive. Anything is fair game.

If the processes only have crash faults, then achieving consensus is relatively easy. We can be sure that every message we receive comes from a process that is following the protocol, because a crashed process sends nothing at all. Systems that only tolerate crash faults can operate via simple majority rule, and therefore typically tolerate the simultaneous failure of fewer than half of the processes. If the number of failures the system can tolerate is f, such a system must have at least 2f + 1 processes [1].
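
The 2f + 1 bound is simple arithmetic, sketched below. The helper names are invented for this example. The point is that with n = 2f + 1 processes, the n - f survivors still form a strict majority.

```python
def min_processes_crash(f: int) -> int:
    """Smallest system size that tolerates f crash faults under majority rule."""
    return 2 * f + 1


def max_crash_faults(n: int) -> int:
    """Largest number of crash faults a system of n processes can tolerate."""
    return (n - 1) // 2


def majority(n: int) -> int:
    """Size of a strict majority of n processes."""
    return n // 2 + 1


if __name__ == "__main__":
    for f in range(1, 4):
        n = min_processes_crash(f)
        survivors = n - f
        # The survivors can still reach a majority on their own.
        print(f"f={f}: n={n}, survivors={survivors}, majority={majority(n)}")
```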

If the processes can be Byzantine, however, they can send incorrect messages, or correct messages to some processes and incorrect messages to others. Making no assumptions at all about how faulty nodes behave gives a very powerful fault model, and this is why the problem is so interesting.

To reach consensus in the presence of crash faults, simpler algorithms like Raft and Paxos are enough. These algorithms do not work in the presence of Byzantine faults.

The problem of consensus with Byzantine faults is discussed in Leslie Lamport's paper on the Byzantine Generals Problem (co-authored with Robert Shostak and Marshall Pease). The paper also proposes a solution, a good discussion of which can be seen here.

The problem with this algorithm is its cost. Lamport's solution to the Byzantine Generals Problem requires O(n^(m+1)) message transmissions, where n is the total number of nodes and m is the number of Byzantine nodes, with n > 3m.

Practical Byzantine Fault Tolerance

Practical BFT (pBFT) is a consensus algorithm proposed by Castro and Liskov that solves the Byzantine Generals Problem more efficiently. pBFT requires O(n^2) messages to reach consensus in the presence of Byzantine processes.
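
To get a feel for the gap between the two bounds, the sketch below evaluates n^(m+1) and n^2 for the smallest n that satisfies n > 3m. These are orders of growth with constant factors ignored, not exact message counts, and the function names are made up for this example.

```python
def min_nodes_byzantine(m: int) -> int:
    """Smallest n that satisfies n > 3m."""
    return 3 * m + 1


def lamport_order(n: int, m: int) -> int:
    """Order of growth quoted for Lamport's solution: n^(m+1)."""
    return n ** (m + 1)


def pbft_order(n: int) -> int:
    """Order of growth quoted for pBFT: n^2."""
    return n ** 2


if __name__ == "__main__":
    for m in range(1, 5):
        n = min_nodes_byzantine(m)
        print(f"m={m}, n={n}: n^(m+1)={lamport_order(n, m)}, n^2={pbft_order(n)}")
```

With one Byzantine node (n = 4) both expressions give 16, but with two (n = 7) they are already 343 versus 49, and the gap grows exponentially in m.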

pBFT involves three stages in normal-case operation:

Pre-prepare
Prepare
Commit

Fig 2. Normal case operation of pBFT
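
The three stages can be sketched as the bookkeeping a single replica does for one request. This is a heavily simplified illustration, not the full protocol: views, sequence numbers, message digests, signatures and view changes are all left out, and the class name `RequestState` is invented here. The quorum sizes (2f matching prepares, then 2f + 1 matching commits, in a system of n = 3f + 1 replicas) come from the Castro and Liskov paper rather than from the text above.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class RequestState:
    """Votes one replica has collected for a single request (hypothetical helper)."""

    f: int  # maximum number of Byzantine replicas tolerated; n = 3f + 1
    pre_prepared: bool = False
    prepares: Set[int] = field(default_factory=set)
    commits: Set[int] = field(default_factory=set)

    def on_pre_prepare(self) -> None:
        # Stage 1: the primary has assigned an order to the request.
        self.pre_prepared = True

    def on_prepare(self, replica_id: int) -> None:
        # Stage 2: a backup announces that it accepted the pre-prepare.
        # A set is used so that repeated messages from one replica count once.
        self.prepares.add(replica_id)

    def on_commit(self, replica_id: int) -> None:
        # Stage 3: a replica announces that it considers the request prepared.
        self.commits.add(replica_id)

    @property
    def prepared(self) -> bool:
        # Pre-prepare plus 2f matching prepares from different replicas.
        return self.pre_prepared and len(self.prepares) >= 2 * self.f

    @property
    def committed(self) -> bool:
        # Prepared plus 2f + 1 matching commits (possibly including our own).
        return self.prepared and len(self.commits) >= 2 * self.f + 1


if __name__ == "__main__":
    state = RequestState(f=1)  # four replicas, at most one Byzantine
    state.on_pre_prepare()
    for backup in (1, 2):
        state.on_prepare(backup)
    for replica in (0, 1, 2):
        state.on_commit(replica)
    print(state.prepared, state.committed)
```

Each replica broadcasts its prepare and commit messages to every other replica, which is where the quadratic message count comes from.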

The pBFT algorithm, as described in the original paper, solves consensus for classic distributed systems. It has since been adapted for blockchain-based systems such as Hyperledger Fabric and Tendermint.

One important thing to keep in mind is that pBFT-based consensus works only in permissioned networks, where the identity of every node is known. Not just anyone can join such a network, because the operation of the algorithm depends on the identity of the nodes being known.

A blockchain helps achieve consensus in distributed systems by grouping transactions into blocks, which amortizes the high commit latency (on the order of ten minutes) over many transactions. Linking blocks via cryptographic hashes into an immutable chain also makes it easy to verify the historical record (via Merkle proofs).
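
A minimal sketch of the hash-linking idea is shown below, using only the Python standard library. The block layout and the helper names (`make_block`, `block_hash`, `verify_chain`) are assumptions made for this example and do not match any particular blockchain's format. Merkle trees are left out: each block simply stores its transactions as a list.

```python
import hashlib
import json
from typing import List


def block_hash(block: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the block."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def make_block(prev_hash: str, transactions: List[str]) -> dict:
    """Group a batch of transactions and link it to the previous block."""
    return {"prev_hash": prev_hash, "transactions": transactions}


def verify_chain(chain: List[dict]) -> bool:
    """Check that every block points at the hash of its predecessor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


if __name__ == "__main__":
    genesis = make_block("0" * 64, ["tx1", "tx2"])
    second = make_block(block_hash(genesis), ["tx3"])
    third = make_block(block_hash(second), ["tx4", "tx5"])
    chain = [genesis, second, third]
    print(verify_chain(chain))

    # Rewriting history in an early block breaks the link that follows it.
    genesis["transactions"].append("forged")
    print(verify_chain(chain))
```

Because each block commits to the hash of the one before it, changing any old transaction invalidates every later link, which is what makes the historical record cheap to verify.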