Recently, at one of our yearly hands-on company events, we took on the challenge of implementing our own blockchain in various programming languages. One of the core challenges of this is the mining of new blocks. In this blog post, I want to show two different approaches we tried out for this particular task, using the Rust programming language, and discuss some interesting findings. If you want to read more about our blockchain hands-on event, there is a German article about it.
The task at hand: mining a block
Mining a new block was supposed to be done by means of a proof-of-work algorithm. Let’s first look at the JSON representation of what a block looks like:
As you can see, a block candidate contains an index, which denotes the position in the chain, a timestamp, a list of transactions, and a reference to the hash of the previous block. The list of transactions is basically the payload of our block, but what it exactly looks like is not relevant to our problem.
In addition to all those other fields, a block candidate contains a numerical field called
proof. Typically, the task of mining a new block is to find a hash of the block that starts with a specific number of leading zeros. Specifically, in our case, a block is valid if the SHA256 hash of the JSON representation (using the order of fields shown above) starts with six zeros, i.e. with
How do we find such a block? All the fields of the block are fixed, apart from
proof. So the goal is to find a value for this field that leads to a hash starting with
000000 when the JSON representation of the whole block is hashed.
Representing a block in Rust
In order to solve this problem in Rust, we first need to come up with appropriate representations of a block and a transaction. Since we do not want to do the JSON serialization ourselves, we are going to use serde, serde_json, and serde_derive.
Serde is a popular serialization library for Rust, supporting multiple formats, including JSON, and providing macros to automatically derive the necessary implementations of its
Deserialize traits using Rust’s built-in
That being said, here are the two Rust structs representing a
Transaction and a
The genesis block
A blockchain always needs a specific genesis block, which is the very first block in the chain. This genesis block is predefined and needs to be hardcoded into our blockchain implementation. We define a function returning the genesis block like so:
Validating a block candidate
In order to find a valid block, we need to be able to create a JSON representation of a block candidate. We’ll add a method to
Block that does just that and delegates to the functionality provided by
The JSON serialization may fail, but we want to fail miserably if that happens in our simple application, so we call
unwrap on the
Result returned by
We also need to be able to compute a SHA256 hash of a block candidate’s JSON representation. To do that, we make use of the crypto-hash crate. We define a simple function that takes an immutable reference to a
Block and returns a new
Note that we are able to call
to_json on our
Block reference, since we defined
to_json as a method above, taking
&self as its first and only parameter. The
hex_digest function and the
Algorithm enum are imported from the
Let’s also create a function that checks whether a given hash is valid:
valid function expects a hash and a prefix to check for, both not of type
String, which owns its content, but
&str, whose content is borrowed. This is fine, as we don’t want do take ownership ourselves, or mutate the
prefix in any way.
Before we can use any of those functions, though, we first need to be able to create a candidate from a timestamp, a sequence of transactions, and the previous block. We define a constructor function that returns a new
Unlike the transactions, which need to be owned by the new
Block, the previous block is expected as an immutable reference, because we only want to read that block’s index and compute its hash.
Two ways of mining
Now that we have the basic machinery for mining a new block in place, let’s look at a few different approaches for implementing the actual mining in Rust.
The mutable mining mystery
I am a big fan of functional programming. That being said, one of the things I find so intriguing about Rust is how it helps you to deal with mutable state without going insane – the compiler is going out of its way to help you by preventing you from making certain mistakes that would lead to disastrous errors at runtime in most other languages.
That’s why the very first attempt at mining a new block that we chose to implement at our hands-on event was based on mutable state. Here is what it looks like:
block: &mut Block in our parameter list, we say that we want to mutably borrow a reference to a
Block, which means that we are allowed to mutate it in our function. The implementation of our mining function is then based on a cornerstone of imperative programming that has fallen into oblivion among functional programmers a long time ago: the while loop. As long as the hash of the block is not valid, we keep incrementing the block’s proof. This is the simplest possible way of finding a proof leading to a valid hash. Other approaches are conceivable as well. For instance, it would be perfectly possible to choose a random value for the proof instead of incrementing it. However, randomly chosen proofs is an approach we didn’t pursue at all at our hands-on event.
When we call this function, we need to pass in a mutable reference to a
Block candidate. The function will not return anything when it is done. Instead, we can inspect the mutated
Block that we passed in, for example to see the proof that led to a valid hash:
This will print the value
5518788 to stdout after a while.
The iterative identification inscrutability
It was pretty straightforward to implement the imperative solution to the mining problem using mutable state. At heart, though, I am still a functional programmer, so I always prefer a more functional and declarative solution that is more about the what, and less about the how. The great thing about Rust that very often, you have a choice. Using Rust’s powerful iterator library, we can write a purely functional solution like this:
The signature of this function is a bit different from the previous one. This time, we want to borrow a
Block reference immutably, and we return a completely new
The implementation is quite interesting: Using the
0.. notation, we create a half-open range starting at
Range implements the
Iterator trait, so that we can use all the available iterator adapters, including the
map adapter. Specifically, we map from a number, which is our proof, to a new
Block. Note that we clone the
transactions and the
previous_block_hash from the passed in block, because our new
Block needs to own them.
So far, we have an infinite iterator, because it’s based on a half-open range. We consume this iterator by means of the
find method, which expects a predicate, a function from
bool. Our predicate checks whether the block is valid, using the functions we defined above. The
find method returns an
Option<Block> because in general, the predicate may never be fulfilled for any item. However, our assumption in this example application is that at some point, there will always be a proof value leading to a valid hash, so we simply unwrap the option. If the option is actually a
None, our application will crash.
Beautiful benchmarking business
The functional and immutable solution looks pretty good. However, we actually had to write a lot more lines of code than in the imperative approach, mainly because in order to stay immutable, we had to create a new
Block instead of mutating the existing one. This means we need to create a new
Block on the heap for every proof we try, and because a
Block owns the transactions and the hash of the previous block, we need to clone those from the passed in
It’s certainly interesting to see how the two approaches compare when it comes to performance – will the mutable version be a lot faster than the functional one?
I wrote a small function that helps to do some crude benchmarking. Rust nightly has built-in support for benchmarking, but we want to run on stable (1.26.2), and there are some limitations to the built-in benchmarking that make it unsuitable for our use case.
For our own benchmarking functionality, we build on the histogram crate. Here is our benchmarking function:
It expects a label and a closure that can safely be executed repeatedly. It will create a new
Histogram and execute the passed in closure multiple times. Right now, it’s hard-coded to 10 times, which is not ideal to get reliable statistics, but since our mining algorithm takes quite a long time, I decided not to execute it more often. Each execution of the closure is timed, and the resulting value added to the histogram. At the end, a bunch of interesting values are printed to stdout, like the mean, median, max, and min execution time, as well as the standard deviation.
Now we can write some benchmarking code to compare the two implementations:
cargo run --release in order to run a release build, which makes the compiler use a lot of optimizations that are not used in a development build. Here is the result on my MacBook Pro (Retina, 13-inch, Late 2012, 2,9 GHz Intel Core i7, four cores):
I already knew that the Rust compiler is supposed to do all kinds of optimizations when using iterators. Nevertheless, I was still surprised to see that the two implementations are more or less on the same level – in fact, the functional, immutable implementation is even a tiny bit faster than the imperative one. Based on the stats provided by the OS X Activity Monitor, the memory usage is about 25% higher, which is likely due to the additional heap allocations we did in order to stay immutable.
Implementing the block mining functionality in Rust is pretty straightforward. In this blog post, we saw that, for some CPU-bound tasks, functional implementations can be just as fast as imperative ones that are based on mutable state, because the Rust compiler is remarkably good at optimizing that stuff for you.
Our mining algorithm is pretty naive. Some obvious optimizations were deliberately left out. For example, it’s possible to only serialize a block to JSON once and only replace the value of the proof field in the resulting
Another limitation of our two implementations is that they are both leveraging only a single core. Mining a new block, however, is a task that can probably be classified as embarrassingly parallel. In the follow-up to this blog post, we will look at three alternative implementations that utilize all the machine’s cores.