What is Consistent Hashing?

Published by

Sean Mayer

29th May 2022

As defined by the author of Consistent Hashing and Random Trees, Distributed hashing protocols for Relieving Hotspots on the World Wide Web, consistent hashing is a way of evenly distributing load across an internet wide system of caches.

Example of this?

👉 Content delivery network (CDN)

“A content delivery network (CDN) is a globally distributed network of proxy servers, serving content from locations closer to the user. Generally, static files such as HTML/CSS/JS, photos, and videos are served from CDN, although some CDNs such as Amazon’s CloudFront support dynamic content. The site’s DNS resolution will tell clients which server to contact.”
https://github.com/donnemartin/system-design-primer

It uses randomly chosen partition boundaries to avoid central control

Or..

Distributed consensus!

Is this like Replication and ACID consistency?

⚠️ No – consistent hashing has nothing to do with replica consistency or ACID consistency.

What is it then?

✅ This rather describes the approach of rebalancing.

❌ This does not work well with databases so it is rarely used in practice…

🧐 Documentation of some databases still refer to consistent hashing… (but this is often inaccurate)

Better to avoid the term consistent hashing, call it hash partitioning instead.
Martin Kleppmann – Designing-Data Intensive Applications

Unfortunately however by using the hash of key for partitioning purposes, subsequently we lose the useful property of key range partitioning. 👎 (The ability to do efficient range queries)

Keys that were once adjacent are now scattered across all the partitions.

So there sort order is lost…
In MongoDB if you have enabled hash based shardingmode
- Any range query has to be sent to all partitions
Range keys on the primary are not supported by Riak, Couchbase or Voldemort
Cassandra achieves a compromise between two partitioning strategies
- Table in Cassandra can be declared with a compound primary key consisting of several columns
- Only the first part of that key is hashed to determine the partition
  - But other columns are used as a concatenated index for sorting the data in Cassandra‘s SSTables
  - A query cannot search for a range of values within the first column of a compound key
  - But if it specifies a fixed value for the first column it can perform an efficient range scan over the other columns of the key 👍

Why to consider the Concatenated Index Approach?

This enables an elegant data model for one-to-many relationships. ✅

For example
- On a social media site, one user may post may updates
- If a primary key for the updates is chosen to be a:
  - User ID
  - Update timestamp
- Then you can efficiently retrieve all updates made by particular user within some time interval (sorted by timestamp)
- Different users maybe stored on different partitions 🤔
  - But.. within each user updates are stored ordered by timestamp on a single partition 😊

📚 Further Reading & Related Topics

If you’re exploring consistent hashing and its role in distributed systems, these related articles will provide deeper insights:

• Understanding Partitioning Proportional to Nodes – Learn how partitioning strategies work in distributed systems and how consistent hashing improves scalability.

• How Does Partitioning Work When Requests Are Being Routed? – Explore how different partitioning mechanisms impact load distribution and system efficiency.

5 responses to “What is Consistent Hashing?”

How to Relieve Hotspots with Skewed Workloads – Scalable Human

5th June 2022

[…] discussed in the previous post What is Consistent Hashing?, hashing a key to determine its partition can aid in reducing […]

LikeLike

Reply
Understanding Partitioning Proportional to Nodes – Scalable Human

6th August 2022

[…] this approach corresponds most closely with the original definition of consistent hashing (mentioned earlier in the blog). New hash functions can achieve a similar affect with lower […]

LikeLike

Reply
What is Partitioning and Why is it Important? – Scalable Human

4th September 2022

[…] ✅ For example a compound key […]

LikeLike

Reply
What to Consider with Secondary Indexes and Partitioning – Scalable Human Blog

10th March 2025

[…] • What Is Consistent Hashing? – Explore how consistent hashing is used in partitioning and indexing strategies to ensure efficient data distribution and retrieval in large-scale systems. […]

LikeLike

Reply
Load shedding In Software Systems – Scalable Human Blog

25th September 2025

[…] uneven load distribution across systems, a key challenge addressed by load shedding strategies. • What Is Consistent Hashing? – Learn how consistent hashing helps distribute requests more evenly across nodes, minimizing […]

LikeLike

Reply

I’m Sean

Welcome to the Scalable Human blog. Just a software engineer writing about algo trading, AI, and books. I learn in public, use AI tools extensively, and share what works. Educational purposes only – not financial advice.

Let’s connect

Join the mailing list

Stay updated with latest posts.

Scalable Human Blog

What is Consistent Hashing?

Example of this?