What's the most powerful way to scale a database?
Most of the time, the first answer that comes to mind is sharding.
The idea is to split the data across several nodes, each taking care of a part of the whole.
However, sharding is not a free lunch: it makes the system design more complex.
In general, the benefits of sharding only become clear once the data or traffic outgrows what a single node can handle.
For smaller apps, the extra complexity might not be worth it, and strategies like vertical scaling or read replicas may be better.
Once you decide to go for sharding, there are many things to take into account:
- which kind of sharding to use (e.g., range-based or key-based)
- how to choose the shard key
- how to rebalance partitions if the volume of data or requests changes
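
To make these choices a bit more concrete, here is a minimal Python sketch of the two routing styles and of why rebalancing is hard. Everything in it is an assumption for illustration: the names (`key_based_shard`, `range_based_shard`, `moved_fraction`), the shard count, and the range boundaries are hypothetical and not taken from any specific database.

```python
import bisect
import hashlib

NUM_SHARDS = 4  # hypothetical shard count


def key_based_shard(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Key-based (hash) sharding: hash the shard key, then take it modulo the shard count."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical exclusive upper bounds for range-based sharding on a numeric ID.
# Shard 0 holds IDs below 1_000, shard 1 holds 1_000..9_999, and so on;
# the last shard (3) holds everything from 100_000 upwards.
RANGE_BOUNDS = [1_000, 10_000, 100_000]


def range_based_shard(user_id: int) -> int:
    """Range-based sharding: find the range that contains the ID."""
    return bisect.bisect_right(RANGE_BOUNDS, user_id)


def moved_fraction(keys, old_n: int, new_n: int) -> float:
    """Fraction of keys that land on a different shard after resizing."""
    moved = sum(
        1 for k in keys
        if key_based_shard(k, old_n) != key_based_shard(k, new_n)
    )
    return moved / len(keys)


if __name__ == "__main__":
    print(key_based_shard("user-42"))
    print(range_based_shard(5_000))
    keys = [f"user-{i}" for i in range(1000)]
    print(moved_fraction(keys, 4, 5))
```

With plain modulo hashing, going from 4 to 5 shards moves most keys to a different shard, which is one reason the rebalancing question deserves attention before the first shard is ever created.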
I wrote everything you should know about sharding here: