What is Partitioning By Key Range?

One method of partitioning is to assign a continuous range of keys, from some minimum to some maximum, to each partition.
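
As a minimal sketch of the idea, assuming string keys and a small hand-picked list of boundaries (both `BOUNDARIES` and `partition_for` are hypothetical names used only for illustration), finding the partition for a key is a binary search over the sorted boundaries:

```python
from bisect import bisect_right

# Hypothetical split points for an illustrative four-partition layout.
# Partition 0 holds keys < "g", partition 1 holds "g" <= key < "n",
# partition 2 holds "n" <= key < "t", and partition 3 holds keys >= "t".
BOUNDARIES = ["g", "n", "t"]

def partition_for(key: str) -> int:
    """Return the index of the partition whose key range contains `key`."""
    return bisect_right(BOUNDARIES, key)

for word in ["apple", "mango", "pear", "zebra"]:
    print(word, "-> partition", partition_for(word))
```

Because the keys inside each partition stay in sorted order, a scan over a range of keys only has to touch the partitions whose ranges overlap it.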

How are keys arranged?

The ranges of keys are not necessarily evenly spaced…

Because your data may not be evenly distributed.
For example:
- In an encyclopaedia, volume 1 may contain words starting with A and B
- But volume 12 contains words starting with T, U, V, X and Y
- Simply having one volume per two letters of the alphabet would lead to some volumes being much bigger than others 👎

The partition boundaries need to adapt to the data
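
To illustrate boundaries that adapt to the data, here is a rough sketch that derives split points from a sample of keys so that each partition receives about the same number of them. The helper names (`choose_boundaries`, `keys_per_partition`) and the toy word list are assumptions made for this example; it does not describe how any particular database picks its boundaries.

```python
from bisect import bisect_right
from collections import Counter

def choose_boundaries(sample_keys, num_partitions):
    """Pick split points so each partition gets roughly the same number of sampled keys."""
    ordered = sorted(sample_keys)
    step = len(ordered) / num_partitions
    return [ordered[int(i * step)] for i in range(1, num_partitions)]

def keys_per_partition(boundaries, keys):
    """Count how many keys fall into each key-range partition."""
    return Counter(bisect_right(boundaries, key) for key in keys)

# Skewed toy data: many words starting with "a" or "b", very few elsewhere.
words = [f"a{i:02d}" for i in range(40)] + [f"b{i:02d}" for i in range(40)]
words += ["m1", "n1", "t1", "u1", "v1", "x1", "y1", "z1"]

fixed = keys_per_partition(["g", "n", "t"], words)      # evenly spaced letters
adapted_boundaries = choose_boundaries(words, 4)        # derived from the data
adapted = keys_per_partition(adapted_boundaries, words)

print("fixed boundaries:  ", dict(fixed))
print("adapted boundaries:", adapted_boundaries, dict(adapted))
```

With the evenly spaced letters almost every word lands in the first partition, while the data-derived split points give each partition an equal share of this sample.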

OR…

Certain access patterns can lead to hotspots.

If the key is a timestamp, then the partitions correspond to ranges of time.
- For example, 1 partition per day
Unfortunately, as we write data to the database as measurements happen…
- All the writes end up going to the same partition: the one for today!
So that partition can be overburdened with writes while the others sit idle 👎
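
The hotspot can be seen in a small simulation. This is only a sketch under assumed conditions: one reading per minute, one partition per calendar day, and a hypothetical `partition_by_day` function standing in for the partitioning scheme.

```python
from collections import Counter
from datetime import datetime, timedelta

def partition_by_day(ts: datetime) -> str:
    """Timestamp-keyed partitioning: one partition per calendar day."""
    return ts.strftime("%Y-%m-%d")

# Readings arrive in time order, one per minute, for ten hours of the same day.
start = datetime(2024, 1, 15, 0, 0, 0)
writes = [start + timedelta(minutes=i) for i in range(600)]

load = Counter(partition_by_day(ts) for ts in writes)
print(dict(load))
```

Every write in this run is counted against a single partition, which is the overload described above.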

To avoid this problem in a sensor database…

You need something other than the timestamp as the first element of the key
- For example
  - You can prefix each timestamp with the sensor name, so the partitioning is first by sensor name and then by time ✅
  - 👉 Assuming you have many sensors active at the same time
    - The write load will end up more evenly spread across the partitions
Now when you want to fetch the values of multiple sensors within a given time range…
- You need to perform a separate range query for each sensor name ✅
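
A compound key of this kind might look like the sketch below. It assumes integer timestamps and an in-memory sorted list in place of a real storage engine; `readings`, `range_query` and `multi_sensor_query` are hypothetical names for illustration.

```python
from bisect import bisect_left, bisect_right

# Readings keyed by (sensor_name, timestamp). Tuples sort by sensor name
# first and by timestamp second, so each sensor's readings are contiguous.
readings = sorted([
    (("sensor-a", 100), 20.5),
    (("sensor-a", 200), 21.0),
    (("sensor-b", 100), 18.2),
    (("sensor-b", 150), 18.4),
    (("sensor-c", 250), 30.1),
])
keys = [key for key, _ in readings]

def range_query(sensor, start, end):
    """Return (timestamp, value) pairs for one sensor with start <= timestamp <= end."""
    lo = bisect_left(keys, (sensor, start))
    hi = bisect_right(keys, (sensor, end))
    return [(ts, value) for (_, ts), value in readings[lo:hi]]

def multi_sensor_query(sensors, start, end):
    """Issue one range query per sensor name and collect the results."""
    return {sensor: range_query(sensor, start, end) for sensor in sensors}

print(multi_sensor_query(["sensor-a", "sensor-b"], 100, 150))
```

The trade-off is visible in `multi_sensor_query`: a single time range no longer maps to one contiguous slice of keys, so the query is repeated once per sensor.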
