What Is Sharding?
Sharding is a way for networks (in this case, blockchains) to scale. It’s one of several well-known database partitioning methods used to increase scalability. With sharding in place, the network as a whole can process more transactions and support more users.
Solving scalability is essential because, as blockchain technology goes mainstream, more people want or need to rely on it. But the more users hop onto a network, the slower that network becomes.
Source: vitalik.ca
That is why one of the main objectives of blockchain development is reducing the workload and the strain it puts on the network. Sharding and other scalability methods matter not only to blockchains themselves but also to users, who trust that the technology they use is secure and fast.
If Ethereum, for example, adopted sharding, the network would be split into multiple pieces, better known as shards, which would be stored in different places in the ecosystem. This data partitioning method pays off when it is applied to a single database with a large dataset, which makes it a good fit for blockchain scalability issues.
If this seems confusing, let’s take a detour and look at it from another perspective.
Imagine you’re running a bookkeeping business and have many local and international clients who don’t ever make room for mistakes. Naturally, you’d hire as many professionals as needed to keep the financial records of your clients in order.
Now, imagine how much time it would take if your clients demanded that all of your employees, including you, check each other’s output at the end of every shift. You would have to manage the extra workload, and it would only grow as new clients arrive looking for a reliable bookkeeper such as yourself.
In reality, if you have a bookkeeping business, you likely have trustworthy employees who don’t need to be micromanaged to do their job right. In other words, each employee knows what needs to be done, making room for efficiency in the workspace. And if you apply this analogy to our blockchain scalability issue, this is a semi-accurate reflection of what sharding is.
Sharding ensures that blockchain nodes, which we will explain in detail in a minute, don’t have to store a complete copy of a blockchain. Instead, shards store smaller and more manageable chunks of data to reduce latency. This ensures that the network can process more transactions per second.
Having every node store a complete copy of the ledger, on the other hand, seems like a good system that keeps everything in check.
That might have been true in the earliest stages of blockchain development. Now that more and more people are riding the Web3 wave and discovering new use cases and applications of blockchain technology, storing everything on every node is no longer a viable solution.
When each node in a blockchain ecosystem keeps track of the entire distributed ledger, the network suffers from high latency and redundant data. As a result, the nodes simply can’t propagate the required number of blocks and transactions within a short period of time, which sometimes results in higher gas fees or other inconveniences.
How Does Sharding Work?
We’ve established that sharding divides the data within the ecosystem so that nodes don’t have to carry the burden of the network’s entire transactional demand and the data that comes with it. Instead, in a sharded blockchain, each node only keeps the data for its own shard or cluster of shards. How many shards a single node can realistically handle is still up for debate.
Now, you can think of shards as smaller networks within one whole network. Each shard stores data that is entirely unique to it, meaning no shard has the same transactional data as other shards in the network.
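To make the idea of unique, non-overlapping shards concrete, here is a minimal Python sketch. It assumes accounts are assigned to shards by hashing their identifier; the shard count, the account names, and the helper functions `shard_for` and `partition` are hypothetical and only serve the illustration. Real protocols may assign data to shards very differently.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count, chosen only for illustration


def shard_for(account: str, num_shards: int = NUM_SHARDS) -> int:
    """Map an account identifier to exactly one shard using a stable hash."""
    digest = hashlib.sha256(account.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(accounts: list[str], num_shards: int = NUM_SHARDS) -> list[set[str]]:
    """Split accounts into per-shard sets that do not overlap."""
    shards: list[set[str]] = [set() for _ in range(num_shards)]
    for account in accounts:
        shards[shard_for(account, num_shards)].add(account)
    return shards


# Made-up account names standing in for real addresses.
accounts = [f"account-{i}" for i in range(1000)]
shards = partition(accounts)

for index, members in enumerate(shards):
    print(f"shard {index}: {len(members)} accounts")
```

Because the hash is deterministic, the same account always lands in the same shard, and no account ever appears in two shards at once.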
This is a beautifully designed system. Now that a blockchain can be divided into smaller networks that, in a way, operate on their own, nodes don’t have to waste resources on storing redundant data. At the same time, shards still depend on each other: each one stores only a fragment of the data, and the full picture of the blockchain emerges only when all the fragments are put together.
So, now that various data sets exist across multiple databases, these databases don’t really have to be stored on the same machine. As a result, the network gets more flexibility and storage capacity, as this data segregation eliminates data repetition.
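A quick back-of-the-envelope calculation shows where the storage savings come from. The sketch below assumes the ledger is split evenly and that each node stores exactly one shard; the ledger size, node count, and function names are made-up values for illustration, not figures from any real network.

```python
def storage_per_node_gb(ledger_gb: float, num_shards: int) -> float:
    """Data one node stores if the ledger is split evenly and it holds one shard."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    return ledger_gb / num_shards


def network_storage_gb(ledger_gb: float, num_nodes: int, num_shards: int) -> float:
    """Total data stored across all nodes under the same assumption."""
    return num_nodes * storage_per_node_gb(ledger_gb, num_shards)


LEDGER_GB = 1000.0  # hypothetical ledger size
NUM_NODES = 6400    # hypothetical node count

for num_shards in (1, 16, 64):  # 1 shard means every node stores everything
    per_node = storage_per_node_gb(LEDGER_GB, num_shards)
    total = network_storage_gb(LEDGER_GB, NUM_NODES, num_shards)
    print(f"{num_shards:>2} shards: {per_node:8.3f} GB per node, {total:12.1f} GB network-wide")
```

Under these assumptions the data is still replicated, just far less: with 64 shards and 6,400 nodes, each shard is held by 100 nodes instead of all 6,400.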
Another important detail is that sharding is not the same thing as a fork. A fork changes the rules that nodes follow, whereas sharding partitions the data and the workload while every shard keeps running the same protocol. That said, introducing sharding to an existing blockchain is still a network-wide upgrade that has to be coordinated carefully.
Types of Blockchain Sharding
Not all sharding methods are the same. With that in mind, we can single out three sharding types: network sharding, transaction sharding, and state sharding.
Network sharding aims to improve communication between nodes so that shards stay synchronized while transactions are processed. Transaction sharding, on the other hand, has more potential to accelerate transaction speeds within the network. Yet, state sharding would be the most secure option. It refers to the process of splitting the entire state of a network across validator nodes and shards.
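As a rough illustration of state sharding, the toy ledger below keeps each account balance in exactly one shard, so a transfer may touch one shard or two. Everything here is an assumption made for the sketch: the `ShardedLedger` class, the hash-based placement, and the account names are hypothetical, and real protocols rely on far more involved cross-shard messaging than a direct debit and credit.

```python
import hashlib


class ShardedLedger:
    """Toy state-sharded ledger: each shard holds balances for its own accounts only."""

    def __init__(self, num_shards: int) -> None:
        self.shards: list[dict[str, int]] = [{} for _ in range(num_shards)]

    def shard_index(self, account: str) -> int:
        """Pick the single shard responsible for this account."""
        digest = hashlib.sha256(account.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def deposit(self, account: str, amount: int) -> None:
        shard = self.shards[self.shard_index(account)]
        shard[account] = shard.get(account, 0) + amount

    def balance(self, account: str) -> int:
        return self.shards[self.shard_index(account)].get(account, 0)

    def transfer(self, sender: str, receiver: str, amount: int) -> bool:
        """Move funds; return True if sender and receiver live in different shards."""
        if amount <= 0 or self.balance(sender) < amount:
            raise ValueError("invalid amount or insufficient funds")
        self.shards[self.shard_index(sender)][sender] -= amount
        self.deposit(receiver, amount)
        return self.shard_index(sender) != self.shard_index(receiver)


ledger = ShardedLedger(num_shards=4)
ledger.deposit("alice", 100)
crossed = ledger.transfer("alice", "bob", 30)

print("alice:", ledger.balance("alice"))
print("bob:", ledger.balance("bob"))
print("cross-shard transfer:", crossed)
```

The design point to notice is that no shard ever sees the whole state: a node serving one shard can answer questions about its own accounts, but a transfer between shards needs both sides to cooperate.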
Which Blockchains Use Sharding?
Well, none of the biggest networks.
Not yet at least.
However, Ethereum is the leading sharding candidate, and is investing in research with the intention to fully embrace this scalability technique.
Still, these blockchain upgrades don’t happen overnight. I mentioned earlier that sharding isn’t the same thing as a fork. Even so, in terms of development and testing, any significant change to a blockchain takes a considerable amount of time.
Source: Ethereum
So, although Ethereum, for example, aims to switch to sharding sometime this year, it’s likely that its developers will stumble upon new challenges.
Is that because sharding is still a concept, and the pioneers have to learn from their own mistakes?
Yes, and also because changes that could affect an entire system or network must be handled with great care.
It’s also worth noting that sharding poses some security issues. When the workload is distributed across a system of shards, it makes room for a single-shard attack. If an attacker gains control over a shard, they could try to alter that shard’s data or do other damage from within the network. It could be that sharding isn’t a thing yet because developers haven’t found viable solutions to this security problem.
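Some simple arithmetic shows why a single shard is an easier target than the whole network. The sketch below assumes that validators are split evenly across shards and that a strict majority is enough to control a shard; real protocols use different thresholds and defenses, and the validator counts and function names are hypothetical. The second function assumes validators are assigned to shards at random and computes the exact chance that one sampled shard ends up with a malicious majority.

```python
from math import comb


def min_validators_to_control_shard(total_validators: int, num_shards: int) -> int:
    """Validators needed for a strict majority of one evenly sized shard."""
    shard_size = total_validators // num_shards
    return shard_size // 2 + 1


def takeover_probability(total: int, malicious: int, shard_size: int) -> float:
    """Chance that a randomly sampled shard has a strict malicious majority.

    Exact hypergeometric tail: the shard is drawn without replacement.
    """
    needed = shard_size // 2 + 1
    ways = sum(
        comb(malicious, k) * comb(total - malicious, shard_size - k)
        for k in range(needed, min(malicious, shard_size) + 1)
    )
    return ways / comb(total, shard_size)


TOTAL_VALIDATORS = 6400  # hypothetical network size
NUM_SHARDS = 64          # hypothetical shard count

needed = min_validators_to_control_shard(TOTAL_VALIDATORS, NUM_SHARDS)
print(f"validators needed to control one shard: {needed}")
print(f"share of the whole network: {needed / TOTAL_VALIDATORS:.2%}")

# With random assignment, an attacker holding 30% of all validators
# rarely ends up with a majority in any one 100-validator shard.
chance = takeover_probability(TOTAL_VALIDATORS, 1920, 100)
print(f"chance of a malicious majority in one random shard: {chance:.6f}")
```

Under these assumptions, an attacker who can choose where to place validators needs only a sliver of the network to dominate one shard, which is why random assignment of validators to shards is a commonly discussed mitigation.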
But if sharding becomes a thing later this year, it’s possible that other blockchains and blockchain companies will invest more resources in research. That would help determine how to fully utilize this data distribution method.
How would that affect blockchain’s potential as we now know it?
We don’t know that yet, but hopefully we’ll get closer to answering this question soon.