Randomness as a concept

Consensus Layer: RANDAO

By: Ulysse Pavloff

With the Merge behind us, Ethereum is now a proof-of-stake protocol. Ethereum is composed of a consensus layer and an execution layer. Ethereum's new consensus is involved, so we will only explain the parts of the protocol relevant to randomness in the consensus layer.

Consensus Layer

In Ethereum proof-of-stake, the time is divided into epochs, themselves composed of slots. 1 epoch corresponds to 32 slots and one slot to 12 seconds.

Epoch 0 is the epoch containing slots 0 to 31; epoch 1 contains slots 32 to 63, etc. The slots are essential because they're the timeframe at which the consensus participants, the validators, are supposed to perform their duties. One of the duties possible is block proposer. For each slot, one validator is pseudo-randomly selected to be a block proposer. This makes 32 block proposers by epoch. Being a block proposer is a lucrative job. Thus, the protocol must ensure fairness when drawing the next block proposer. To do so, the protocol pseudo-randomly selects validators using a seed created with RANDAO.

RANDAO

The solution found by Ethereum for randomness in the consensus is RANDAO. RANDAO is a mechanism that creates ‘random' numbers in a decentralized fashion. It works by aggregating different random sources and mixing them. The idea is that by combining various sources, if at least one source is ‘‘truly random'', then the mix of them all is also random.

The value generated with RANDAO is used as seed to determine the block proposer of each slot at the beginning of the epoch. This seed is also used to determine other duties, but we will leave that aside. What we are interested in now is: how is the seed created and how is it used.

Seed creation

Each epoch produces a seed called RANDAO mix. This seed is created with the help of the block proposers of the said epoch. Each valid block contains a field called randao_reveal. It is mandatory for each block proposer producing a block to fill this field. The value of randao_reveal is then mixed with all the others, creating the seed. That is why the seed is also called RANDAO mix. If a block proposer were not to publish a block, the RANDAO mix would be made without him.

The mix in question is an exclusive or, also called XOR and noted ⊕. It takes bits as input and outputs a bit. Here are the four outputs for a XOR on inputs X and Y:

X	Y	X ⊕ Y
0	0	0
0	1	1
1	0	1
1	1	0

RANDAO mix is the XOR of all the randao_reveal published by the block proposers. Block proposers do not choose what randao_reveal they put in a block. They must sign specific data with their private key. Anyone can then verify that they signed the expected data with their private key. If it is not the case, the block is invalid. In practice, the cryptography used for this is the following: (Every operation is done modulo $n n$ , in a group $G \mathcal{G}$ of prime order $n n$ with a generator $g . g.$ )

Alice is owner of private-public key pair $(K_{p r i v}, K_{p u b}) (K_{priv},K_{pub})$ where $K_{p r i v} * g = K_{p u b} K_{priv} ∗ g = K_{pub}$ .
Her public key is selected to be the proposer of slot $i i$ .
During slot $i i$ , she creates a block and includes her RANDAO reveal by signing the current epoch: randao_reveal=BLS.sign(epoch of slot i) which mathematically corresponds to compute $y = K_{p r i v} * h y = K_{priv} ∗ h$ with $h = hash (e) h=\texttt{hash}(e)$ the hash of the current epoch.

With this, anyone can check the correctness of Alice's RANDAO reveal.

First step is to compute $h = hash (e) h=\texttt{hash}(e)$ .
Then verify that the following equality is true $h * K_{p u b} = y * g h ∗ K_{pub} = y ∗ g$ . If it is, then the signature is valid.

If you check in the specifications how it is done, you will see that they use BLS signatures. These BLS signature are similar to ECDSA signatures, but they are more efficient for aggregation. When many validators need to sign the same message, you can aggregate all of their signatures in one and check it efficiently.

It is now possible to verify n aggregated signatures on the same message m with just 2 pairings instead of n+1.

This property allows Ethereum consensus to possess many validators and make them all vote in the short span of an epoch.

Seed use

It is crucial to investigate how a seed is used because we want to use it to mimic randomness. We don't want to generate predictable behavior each time.

At the end of an epoch, you collect all the randao_revealto create the RANDAO mix. This RANDAO mix is then used as a seed to determine the role(s) of validators in a future epoch. The seed produced during epoch $e e$ will impact the duties of validators in epoch $e + 2. e+2.$

To determine the roles of validators, the protocol takes the list of all validators and shuffles them. After this, roles are given according to the new positions of validators in the list. For instance, the proposer of slot ss is the first proposer of the list shuffled with a seed equal to a combination of the RANDAO mix and $s s$ .¹

There is one core algorithm behind the shuffling; it's called compute_shuffled_index. Interestingly, it does not need the entire list of indexes because we do not compute the shuffling of the whole list. We apply the algorithm to a single index to determine which index will be swapped with.

Before showing the main algorithm, here is a simpler version that will help understand its functioning:

def swapOrNotSimplified(index, numberIndex, seed):
    pivot = hash(seed) % numberIndex
    newIndex = pivot + numberIndex - index % numberIndex
    return newIndex

This algorithm takes as input an index, the total number of indexes, and a seed; it then outputs the new index in the shuffled list. The first step of this algorithm is to take a pivot found with the seed's hash. The pivot is then used to compute the new index. Using this algorithm on all indexes of the list will give the shuffled list. One seed is used for one shuffle, therefore you get one pivot for one shuffle. It must be noted that indexes will be swapped two by two: for index = i, we will get newIndex = pivot + numberIndex - i % numberIndex. Now if we take index = pivot + numberIndex - i % numberIndex, then newIndex = pivot + numberIndex - (pivot + numberIndex - i) % numberIndex resulting in newIndex = i % numberIndex. This algorithm is thus swapping index and newIndex.

The algorithm used by Ethereum is almost the same. The difference is that it repeats the shuffling 90 times (SHUFFLE_ROUND_COUNT), and the swap is not systematic.

Here is the algorithm given by the specs:

def compute_shuffled_index(index: uint64, index_count: uint64, seed: Bytes32) -> uint64:
   """
   Return the shuffled index corresponding to ``seed`` (and ``index_count``).
   """
   assert index < index_count

   # Swap or not (<https://link.springer.com/content/pdf/10.1007%2F978-3-642-32009-5_1.pdf>)
   # See the 'generalized domain' algorithm on page 3
   for current_round in range(SHUFFLE_ROUND_COUNT):
       pivot = bytes_to_uint64(hash(seed + uint_to_bytes(uint8(current_round)))[0:8]) % index_count
       flip = (pivot + index_count - index) % index_count
       position = max(index, flip)
       source = hash(
           seed
           + uint_to_bytes(uint8(current_round))
           + uint_to_bytes(uint32(position // 256))
       )
       byte = uint8(source[(position % 256) // 8])
       bit = (byte >> (position % 8)) % 2
       index = flip if bit else index

   return index

Given an index, the total number of indexes, and a seed, the algorithm outputs the new index in the shuffled list. As in the simplified version, pivot is pseudo-randomly chosen. We've seen why it is called a “pivot”; flip and index are equally distant from pivot/2. The last step is to change the value of index according to a pseudo-random bit. After repeating this SHUFFLE_ROUND_COUNT times, the result is the new index in the shuffled list.

Randao Attack

Although RANDAO seems to ensure that the RANDAO mix produced is random, it remains manipulable. It is susceptible to what is called the “last revealer attack”. In the way RANDAO works, the last participant has more information than the previous participants. The last participant knows what the RANDAO mix will be, if she participates or not. A rational participant could choose not to publish a block because the RANDAO mix favors her more as it is. For instance, if Alice is the last block proposer, she observes that with the current RANDAO mix she will be the block proposer 4 times in a future epoch, whereas if she adds a block and changes the RANDAO mix, she will not be the block proposer at all. This is one such situation where the last block proposer has incentives not to follow honestly the protocol.

Conclusion

RANDAO has shown us a new way to produce randomness in a distributed manner, revealing some downsides of its use. If we look beyond the ‘last revealer attack', there remains the problem that once the seed is chosen, everyone knows the future role, and validators could be prone to DoS attacks. Polkadot and Algorand use VRF to select block proposers pseudo-randomly in an unpredictable manner. This alternative also has its downsides, such as the possibility of two block proposers being selected at the same time, which can lead to temporal forks. Some advocates for the use of VRF in Ethereum Proof-of-Stake; we'll see what the future holds.

Notes:

The actual selection also takes into account the stake of each validator.

I don't know if I should, just gonna read one page...