Consensus Part 2: Stellar’s Consensus Protocol

Tara Annison
9 min readSep 18, 2019

--

In part 1 (link) I explored two of the most common consensus mechanisms, Proof-of-Work and Proof-of-Stake. However many of the newer cryptos are introducing a host of new features e.g privacy, turing completeness, bigger block sizes etc, and some have also looked to introduce new consensus mechanisms. Whilst a risky decision owing to the fundamental nature this component plays within a crypto network, they do introduce some novel ideas as to how participants can be encouraged to reach agreement within the network and how they can be incentivised.

As such, within this piece I’ll explore another consensus mechanisms: Stellar’s Consensus Protocol.

However before we dive into this alternative consensus mechanism it is important to understand an underlying concept referred to as Byzantine Fault Tolerance (BFT). This is best understood as how a network of computers can reach consensus about whether a component has failed or not. This is important as in a distributed computing system as it may be possible for a failed component to appear both failed and functioning — whether done maliciously (in an attempt to confuse the system) or through error. This helps form much of the underlying work in creating a new consensus mechanism as any new model must be able to overcome a Byzantine failure.

The term ‘Byzantine’ in BFT is derived from, and best illustrated by, the Byzantine Generals’ Problem in which an army surrounds a city and must decide whether to attack or retreat. If the majority of the generals decide to advance then they will overthrow the city, however if the majority retreat, then any advancing generals will be killed and their attack will be fruitless. In a modern warfare situation, communications devices could be used to alert the generals whether they should attack or retreat, and in in the years before computers a messenger could run between to relay the message. However in both scenarios the messages could be tampered with or nefarious generals could look to relay incorrect information. As such, how then can we ensure consensus is reached and all generals either attack or retreat?

The BFT solution can be found where all loyal generals have a pre-agreed strategy in which default vote values are given to any missing messages. In practice this includes different versions of the ledger being sent around the network and voted on by a group of pre-agreed validators. When all the nodes have come to an agreement on the ledger state it is confirmed in the network and the voting begins again. If any nodes do not return a value then a pre-agreed value is assigned instead. A pure BFT method would be faster and cheaper than PoW but less decentralised as a central authority decided on the list of validators.

In comparison the PoW approach to byzantine failures is the below and achieves greater decentralisation through the addition of ‘work’:

1. The Generals agree the first plan received by all Generals will be accepted as the plan

2. A General solves the PoW problem, creating a block that is broadcast to the network so that all Generals receive it

3. Following receipt of this block, each General verifies and works on solving the next PoW problem, incorporating the prior solution into it, so that their plan adds on to the previous resolution

4. Each time a General solves a PoW problem, a block is generated and the chain begins to grow. In time, any General working on a different solution will switch over to the longest chain. This is the one most Generals are contributing to and therefore has the greatest chance of success

5. As the Generals know roughly how long a PoW solution takes to solve, after a set amount of time they will know if enough of the other Generals are also working on the same chain

(Source: Radix: https://www.radixdlt.com/post/what-is-proof-of-work)

Now onto an exciting new consensus mechanism …

Stellar’s Consensus Protocol (SCP)

The consensus protocol employed within the Stellar Network uses an enhancement of BFT called Federated Byzantine Agreement (FBA). This aims to be a decentralised version of BFT in which there is no centrally controlled list of validators and any new validator can join the network to help reach consensus. It is therefore referred to as an open membership network.

Consensus is reached through each validator choosing a list of other validators which it trusts. This list is referred to as a quorum slice. These quorum slices will overlap to create quorum and a network wide consensus will be reached when the nodes within the quorum reach a decision.

Let’s imagine we have 15 validators within the network, including; Adam, Brody, Cindy, Darren, and Ester. The below coloured outlines show the quorum slice for each of the named validators and the grey circles represent the other 10 validators within the network, however for simplicity their quorum slices are not shown.

Taking into account just the quorum slices for our named validators we can see that consensus will be reached by the nodes within the grey area (referred to as the quorum intersection).

The diagram above is heavily simplified to show how a quorum is created for just 5 validators but it is important to note that validators can have multiple quorum slices. However they must ensure their choices do not violate the quorum intersection as then it could be possible to have disjointed quorums which could come to consensus about different results. It is also important to note that in the above network, only the quorum slices for the named validators are being shown, when the additional 10 nodes have their quorum slices added, this may alter the quorum.

However let’s dig a little deeper into the voting. In SCP this is a technique called federated voting…

It’s first important to understand why a validator will add other nodes into their quorum slice. This is because they trust their vote. As such the nodes within a quorum will influence the vote of the validator:

Adam’s vote will be influenced by the 4 grey nodes in his quorum slice

Brody’s vote will be influenced by the 6 nodes in his quorum slice, including Ester

Cindy’s vote will be influenced by the 5 nodes in her quorum slice

Darren’s vote will be influences by the 8 nodes in his quorum slice, including Adam and Cindy

Ester’s vote will be influenced by the 6 nodes in her quorum slice, including Brody

When a decision is required to be made by the network, e.g whether to include a new transaction, there are 3 steps. It is key to understand that this is in fact as two stage voting process, with the first an initial vote and the second an acceptance vote.

Initial voting

Each node will put forward their initial vote. This is not a definitive accept or reject vote but instead a notice of their openness to support this as the result.

Darren may therefore vote that he is open to the result that a transaction is valid.

In doing so he must ensure that this is consistent with all statements that he has accepted (the second stage of voting) and will never accept a statement which contradicts this e.g the transaction is invalid.

Acceptance

As quorum slices intersect each other this provides an opportunity to influence other node decisions.

Imagine that Darren instead thought that the transaction was invalid and so initially voted as such. If the nodes in his quorum slices disagreed they can band together to form what is referred to as a v-blocking set and block action in each of the quorums which contain Darren — essentially telling him that they think he’s wrong.

As such, Darren can now place his second round vote to accept that the transaction is valid. However if Darren has already placed his initial and acceptance vote that the transaction is invalid, he will be unable to change it.

It is therefore possible for a node to place an initial vote and then accept a contradictory one.

Ratification

Once every member of the quorum reaches agreement e.g the transaction is valid then we say that the quorum ratifies the transaction.

However as mentioned previously, due to nodes being able to have multiple quorum slices, it is possible to have distinct quorums and therefore it is necessary to reach not only quorum consensus but also system wide consensus. This last step is called confirmation…

Confirmation

This is the final step in reaching consensus and is where each validator will broadcast their confirmation message of what result they have accepted. As such our named validators and all nodes in their quorum slice will broadcast accept(transaction valid). This can also have the effect that any nodes still in the initial voting or about to accept stage may decide to change their vote in order to agree. However as outlined above, any nodes which have already accepted a contradictory message e.g accept(transaction invalid) will be unable to change but any nodes who have initially voted accept(transaction invalid) and who have a v-blocking set accepting it, may change.

Once a sufficient number of confirmation messages have been delivered and processed, consensus is reached.

It is then important to consider some of the properties this consensus mechanism provides:..

• SCP tolerates Byzantine failures as it doesn’t require unanimous agreement from all nodes, instead it requires across a sufficient number. Furthermore and as out BFA contains the ability for non-responding or bad nodes to be ignored from the ratification stage — these nodes are referred to as befouled.

• As the network is open the number of validators will continue to grow as new nodes and new quorum slices are added to the network. As such the network has in-built scalability and will become more decentralised as more validators join.

• FBA is considered to be asymptotically secure as an increase in computing power would not improve a nefarious node’s ability to over-run the network. In comparison, PoW is susceptible to a 51% attack in which a node which can render 51% of the network’s computing power could (in theory) be able to enter and confirm fraudulent transactions. This was discussed in further detail within part 1. However in BFT and FBA models there is no cryptographic puzzle to solve and instead validators sign their vote with their private key. Therefore for a nefarious node to attempt to take over the network they would be required to guess the private key or the majority of nodes on the network. However for just one private key this could take 3 x 10⁵¹ years to crack — an impossibly long time! One susceptibility which FBA could suffer from is collusion between nodes, however in order to do so the colluded nodes would be required to infiltrate the majority of the quorum slices and as the network size grows the difficulty of this also grows. It is also notable that befouled nodes will be ignored and as such after a failed collusion attempt any known nefarious nodes will not be able to retry. BFT does not suffer from this susceptibility as there is a centrally controlled list of validators so nefarious actors can be removed from the list by the central authority.

In order to understand an additional property we must first explore a fundamental proof in distributed computing, FLP Impossibility. This states that any distribute system cannot obey all 3 of the following properties:

Fault Tolerance (The ability of the network to survive and reach consensus if a node goes down)

Safety (The guarantee that if a network agrees on a ledger it will not fork)

Liveness (The guarantee that the ledger will always eventually close and therefore the network will always be able to accept future transactions).

This is because the property of liveness includes the possibility for accidental forks and thus double spends. As such, and since SCP obeys the property of Fault Tolerance, it must therefore choose between Safety and Liveness.

PoW favours liveness over safety and as such we saw the creation of Bitcoin Cash on August 1st 2017. However FBA favours safety as an accidental fork could halt progress and therefore also disrupt liveness.

We can now explore the additional property of SCP’s FBA consensus mechanism:

• SCP has a lower latency than bitcoin’s PoW mechanism. In PoW miners must race to solve a cryptographic puzzle and thus add a block of transactions to the chain. This is incredibly energy intensive and takes c.10 minutes and as PoW favours liveness over safety you must wait 6 confirmations and thus c.60 minutes to ensure your transaction is within the winning chain. However in SCP, transactions are confirmed in c.3–5 seconds since there is no mining process just voting and confirmation message passing. Therefore by favouring safety over liveness there can be no accidental forks and thus double spends.

Stellar is of course more than a consensus mechanism, it is a new protocol offering an alternative to the existing centralised financial system and offers crypto-companies who are looking to ICO, an alternative to Ethereum. The network has a native token, Lumens, and looks to provide a cross-border solution for the under-banked as well as those institutions who want to innovate away from their existing legacy solutions.

To read more about Stellar head to their website. (https://www.stellar.org/)

--

--