What is Dynamic Partitioning?

For database that use key range partitioning, a fixed number of partitions with fixed boundaries would be very inconvenient!

  • If you got the boundaries wrong ❌
    • You could end up with all the data in one partition and all the other partitions empty 👎
    • Reconfiguring the partition boundaries manually would be very tedious👎

For that reason… Key range partition databases such as HBase and RethinkDB create partitions dynamically! 👍

  • When a partition grows to exceed a configured size (on HBase the default is 10GB)
    • It is split into two partitions
    • So approximately half the data ends up on each side of the split
    • Conversely, if lots of data is deleted and partition shrinks below some threshold it can be merged with the adjacent partition
  • This process is similar to what happens at the tope level of a B-Tree
    • Each partition is assigned to one node
    • And each node can handle multiple partitions
    • Like in a case of fixed number of partitions
    • After a large partition has been split, one of its two halves can be transferred to another node in order to balance the load
  • In case of HBase

Advantage of Dynamic Partitioning

Is that the number of partitions adapts to the total data volume ✅

  • If there is only a small amount of data, a small number of partitions is sufficient, so overheads are small.
  • If there is a huge amount of data, the size of each individual partition is limited to a configurable maximum.

However a caveat is…

  • That an empty database starts of with a single partition
  • Since there is no information where to draw the boundaries, while the dataset is small
  • Until it hits the point of which it hits the first partition is split
  • All writes have to be processed by a single node, while other nodes sit idle…

👉 To mitigate this issue, HBase and MongoDB allow an initial set of partitions to be configured on an empty database. This is called pre-splitting.

Pre-Splitting

In case of key range partitioning pre-splitting requires that you already know what the key distribution already looks like.

Can equally be used with hash partitioned data. MongoDB (since v2.4) supports both key range and hash partitioning and it split partitions dynamically for either case.

📚 Further Reading & Related Topics

If you’re exploring dynamic partitioning in distributed systems and databases, these related articles will provide deeper insights:

• How Does Partitioning Work When Requests Are Being Routed? – Learn how partitioning strategies affect request distribution, scalability, and performance in distributed environments.

• How to Relieve Hotspots with Skewed Workloads – Discover techniques for balancing uneven data loads, a key challenge that dynamic partitioning helps address.

3 responses to “What is Dynamic Partitioning?”

  1. Understanding Partitioning Proportional to Nodes – Scalable Human Avatar

    […] dynamic partitioning the number of partitions is proportional to the size of the […]

    Like

  2. What is Partitioning and Why is it Important? – Scalable Human Avatar

    […] Dynamic partitioning can be used, and hybrid approaches are also possible! […]

    Like

  3. How to Partition with Secondary Indexes by Document – Scalable Human Blog Avatar

    […] • What is Dynamic Partitioning? – Learn how dynamic partitioning optimizes data distribution and indexing in document-based databases. […]

    Like

Leave a reply to Understanding Partitioning Proportional to Nodes – Scalable Human Cancel reply

I’m Sean

Welcome to the Scalable Human blog. Just a software engineer writing about algo trading, AI, and books. I learn in public, use AI tools extensively, and share what works. Educational purposes only – not financial advice.

Let’s connect