Randomness as a concept

Execution Layer - VRF

By: Ulysse Pavloff

Producing true randomness with deterministic machines such as computers requires data from the outside world (cf. Introduction to Randomness). Therefore, the question of randomness is all the more complex in blockchains, systems with seemingly no connection with the outside world. We will explore this issue in the Ethereum blockchain. The Ethereum protocol comprises of 2 layers: the execution layer, which resembles Ethereum before the Merge without proof of work, and the consensus layer, which previously was the beacon chain. We will focus our search of randomness in the execution layer.

On-chain randomness

How to find randomness in the Execution Layer? Each block header is filled with information one could consider as a possible seed for pseudo-randomness. Before the merge, you could not precisely predict the future timestamp, even with the small timeframe to publish blocks. Since the merge, blocks are virtually all 12 seconds apart each time, making the timestamp useless for randomness. A more viable option is the difficulty field. This refers to the difficulty of mining the block. The difficulty field was used for randomness by developers, but the merge has rendered it deprecated. It is now set to zero since there is no more proof of work. Another possible source of randomness could be the block's hash, but none of the block header fields 'truly' works. This is because every data stemming from a block is manipulable by the miner producing it. A miner can also generate a block and not propagate it if the randomness that is produced from its data is not in its favor.

For instance: let's say that a smart contract plays heads or tails to determine who will gain a huge amount of money (largely superior to the block reward). The money is to be given to one of two persons: user A or user B. The smart contract uses the randomness provided by the block's hash to pick the winner. If user A is chosen to propose the block that the smart contract will use, then he will not propose a block with a hash that benefits user B.

Even if on-chain data can be manipulated, EIP-4399 has made a source of randomness accessible. In the header, the filed mixHash, associated with proof-of-work computation, has been replaced by prevRandaoto include randomness originating from the consensus layer¹. The opcode (0x44) will be renamed PREVRANDAO and be a source for on-chain pseudo-randomness. But as the Ethereum Fondation states:

This opcode will thus be a stronger, albeit still biasable, source of randomness for application developers

In short, no on-chain data is impervious to manipulation. Our gaze naturally turns to off-chain randomness.

Off-chain randomness

When we look at off-chain data, we naturally look at Oracles. Oracles have noticed the necessity for an unbiased source of randomness. To solve this problem, Oracles have stepped up and proposed a solution. At first glance, you could think that using Oracles is a way of delaying the problem of randomness to a third party. And indeed it is, but you don't need to trust that third party, and this is a crucial advantage.

Oracles such as Chainlink proposed using VRF or, Verifiable (pseudo-)Random Function. As its name implies, VRF are functions used to produce pseudo-randomness in a way that anyone can verify whether the output has been tampered with or not. This is a solution that provides unbiased randomness.

A Verifiable Random Function works as follows: given an input message m, it outputs a pseudo-random value and a proof that this output value was computed using m.

The main idea behind VRF is that using your private key to sign something produces an unpredictable signature. This is the pseudo-randomness part. If we can prove that we signed what we were supposed to sign and with the right private key, it should be enough to convince anyone that the data produced is pseudo-random and not forged. This is the verifiable part.

The VRF thus allows the owner of a private-public key pair to produce a random value and the proof associated. Anyone knowing the cryptography behind it can then check using the proof if the pseudo-random value was correctly generated. Let's dive into the fun part: how it works!

How VRF actually works

(Every operation is done modulo $n n$ , in a group $G \mathcal{G}$ having a generator $g g$ of prime order $n n$ .)

In order to stay away from biased data from malicious miners, a solution proposed was the Verifiable (pseudo-)Random function. We will go through the steps for Alice to generate a verifiable (pseudo-)Random value.

Alice has a private and public key pair $(K_{p r i v}, K_{p u b}) (K_{priv} ,K_{pub})$ where $K_{p r i v} * g = K_{p u b} K_{priv} ∗ g = K_{pub}$ .

Given a value m, Alice hashes it, $h = hash (m) h=\texttt{hash}(m)$ , to generate the pseudo-random value $K_{p r i v} * h = y K_{priv} ∗ h = y$ .

To prove that this value was indeed generated by her and corresponds to the product of her private key and the hash of the message, Alice proceeds in two steps:

She choses a random value $r r$ and computes $s = hash (g, h, K_{p r i v} * g, K_{p r i v} * h, r * g, r * h) . s = \texttt{hash}(g, h, K_{priv} ∗ g, K_{priv} ∗ h, r ∗ g, r ∗ h).$
Then $t = r - s * K_{p r i v} . t = r - s ∗ K_{priv}.$

With this, she generates the associated proof $π = (s, t, y) . \pi=(s,t,y).$

With the proof $π \pi$ , anyone can check that #lang:latex#y was computed using the private key of $K_{p u b} K_{pub}$ .

To do so, they have to compute:

$s^{'} = hash (g, h, K_{p u b}, y, t * g + s * K_{p u b}, t * h + s * y) s' = \texttt{hash}(g, h, K_{pub}, y, t ∗ g + s ∗ K_{pub}, t ∗ h + s ∗ y)$

If $s^{'} = s s' = s$ then #lang:latex#\pi proves that #lang:latex#y was computed correctly by $K_{p u b} K_{pub}$ .

This proof is convincing because the equality implies that:

$r * g = t * g + s * K_{p u b} r ∗ g = t ∗ g + s ∗ K_{pub}$
$r * h = t * h + s * y r ∗ h = t ∗ h + s ∗ y$

With 1 we are sure that the second step of the proof was executed correctly. With 2 we are sure that $y y$ is the message $h h$ signed with Alice's private key.

Usecase - PoolTogether

A famous use-case of a protocol using randomness with VRF is PoolTogether. PoolTogether was constructed to be a no-loss prize game. It's a decentralized protocol that looks like a lottery with no losers, only winners. To participate you deposit money in a pool of the protocol. All the money deposited goes through various DeFi protocols generating interests. The interests are then randomly distributed to one participant. Everyone gets a chance to win the interest without ever losing the deposited money. One participant reaps all the benefits, but none lose.

The ideal, of course, is finding things that are both exciting and meaningful. @PoolTogether_ looks like a lottery, but at the same time it's a very creative psychological gadget that could improve people's ability to have savings.- vitalik.eth (@VitalikButerin) July 1, 2020

Of course, in this example we are interested in the random selection that occurs to draw a winner. At its beginning, PoolTogether was handling randomness internally. This meant that every participant had to trust the team to remain impartial and not interfere with the randomness for their benefit. The team investigated various ways to get randomness, but all were flawed. We've seen that miners could bias on-chain solutions, and off-chain solutions shift the problem to trusting a third party.

The chosen method appeared thanks to Chainlink. Currently, the random selection in PoolTogether is made using Chainlink's VRF. We've seen how it works in theory; let's see how it works in practice.

To benefit from Chainlink's VRF you need to pay with Link (Chainlink's token) each time you request a random number. You send a request for randomness, providing an initial message or a seed. Then Chainlink uses the VRF with the seed provided to generate randomness and its associated proof. The contract in charge of the proof verification is filled with insightful comments to describe each step. In VRF.sol, you can find all the steps as described in the technical part above. Variables' names are changed following this paper.

You should be able to understand it all now that you know the cryptography behind it. As the code mentions, in addition to the cryptography we discussed, they use Vitalik's idea to save gas. This technique allows making some multiplication off-chain without losing any security.

Conclusion

The importance of the solution brought by Chainlink is illustrated by the numerous case studies that use its VRF. VRF appears to be the safest and soundest way to manipulate pseudo-randomness. As always, a wonderful solution brought by cryptography 👍

Notes:

We will explore the randomness in RANDAO in the following article.
As mentioned in this paper, "for security $r r$ must be (pseudo)random and secret; it can, in particular, be a pseudo-random function of $K_{p r i v} K_{priv}$ and $m m$ .

Resources:

Micali, Silvio, Michael Rabin, and Salil Vadhan. 1999. Verifiable random functions. In Proceedings of the 40th Annual Symposium on the Foundations of Computer Science (FOCS \`99), 120-130. New York: IEEE Computer Society Press.
VRF Chainlink: https://blog.chain.link/chainlink-vrf-on-chain-verifiable-randomness/
Case study Chainlink: https://chain.link/case-studies/pooltogether
Chainlink VRF: https://blog.chain.link/verifiable-random-function-vrf/
Smart billion hack: https://www.reddit.com/r/ethereum/comments/74d3dc/smartbillions_lottery_contract_just_got_hacked/
Randomness inside Ethereum PoW: https://securityboulevard.com/2018/06/breaking-randomness-in-the-ethereum-universe-part-1/

So many books, so little time!