As discussed in the previous post What is Consistent Hashing?, hashing a key to determine its partition can aid in reducing hotspots…
However, they cannot be avoided entirely…
In the extreme case where all reads and writes are for the same key
- ⚠️ You still end up with all requests being routed to the same partition
This kind of workload is perhaps unusual, but it is not unheard of. For example:
- Social media site
- Celebrity user with millions of followers
- They may cause a storm of activity when they do something
- This event can result in a large volume of writes to the same key!
- The key could be user ID of the celebrity
- Or..
- The ID of the action that people are commenting on
- Hashing the key does not help!
- 👉 The hash of two identical IDs is still the same, so both are routed to the same partition
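To make this concrete, here is a minimal sketch of hash partitioning. The partition count, the choice of SHA-256, and the key `user:12345` are all assumptions for illustration, not details of any particular database.

```python
import hashlib

NUM_PARTITIONS = 8  # assumed partition count, for illustration only


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition by hashing it (hypothetical helper)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


celebrity_key = "user:12345"  # hypothetical hot key

# A thousand requests for the same key: the hash is deterministic,
# so every single one of them resolves to the same partition.
partitions_hit = {partition_for(celebrity_key) for _ in range(1000)}
print(len(partitions_hit))  # 1
```

However good the hash function is, identical input gives identical output, so the hot key stays on one partition.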
The application is responsible for reducing skew
“Today most data systems are not able to automatically compensate for highly skewed workload”
Martin Kleppmann – Designing Data-Intensive Applications (2016)
Because data systems struggle to compensate automatically for skewed workloads, the responsibility for reducing skew falls to the application. 🤔
A technique for reducing skew
If one key is known to be very hot! 🔥
- The simple technique is to add a random number to the beginning or the end of the key
- ✅ Just a two-digit decimal random number would split the writes to the key evenly across 100 different keys
- ✅ Allowing those keys to be distributed to different partitions
- However, splitting the writes across different keys comes at a cost 👇
- Reads would need to do additional work
- As they would now have to read the data from all 100 keys and combine it! 🤯
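The write and read sides of this technique could be sketched roughly as follows. This is an illustration under assumptions: the in-memory `store` stands in for a real key-value store, the data is a simple counter, and the key name and `#NN` suffix format are hypothetical.

```python
import random
from collections import defaultdict

SALT_BUCKETS = 100  # a two-digit decimal suffix: 00..99

# Stand-in for a partitioned key-value store holding counters (assumption).
store = defaultdict(int)


def salted_key(key: str) -> str:
    """Append a random two-digit number to the end of the key."""
    return f"{key}#{random.randrange(SALT_BUCKETS):02d}"


def record_write(key: str, amount: int = 1) -> None:
    # Each write goes to one of 100 sub-keys chosen at random, so the
    # sub-keys can hash to different partitions.
    store[salted_key(key)] += amount


def read_total(key: str) -> int:
    # Reads pay the price: fetch all 100 sub-keys and combine them.
    return sum(store.get(f"{key}#{i:02d}", 0) for i in range(SALT_BUCKETS))


for _ in range(10_000):
    record_write("post:42:likes")  # hypothetical hot key

print(read_total("post:42:likes"))  # 10000
```

Combining is a plain sum here because the value is a counter. For other kinds of data, the merge step depends on what the application stores.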
This technique also requires a lot of bookkeeping.
- 👉 It only makes sense to append the random number for the small number of hot keys; for the vast majority of keys with low write throughput, it would be unnecessary overhead.
- Thus, you also need some way of keeping track of which keys are being split
- Perhaps in the future data systems will be able to automatically detect and compensate for skewed workloads
- But for now you need to think about the tradeoffs for your own application
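The bookkeeping for split keys might look something like this. It is only a sketch: the `hot_keys` set is a hypothetical registry that the application would have to maintain itself (for example in configuration or a small lookup table), and the key names are made up.

```python
import random
from typing import List

SALT_BUCKETS = 100

# Hypothetical registry of keys known to be hot and therefore split.
hot_keys = {"user:celebrity"}


def key_for_write(key: str) -> str:
    """Only hot keys get a random suffix; all other keys are left alone."""
    if key in hot_keys:
        return f"{key}#{random.randrange(SALT_BUCKETS):02d}"
    return key


def keys_for_read(key: str) -> List[str]:
    """Hot keys fan out to every sub-key; ordinary keys need one lookup."""
    if key in hot_keys:
        return [f"{key}#{i:02d}" for i in range(SALT_BUCKETS)]
    return [key]


print(keys_for_read("user:ordinary"))        # ['user:ordinary']
print(len(keys_for_read("user:celebrity")))  # 100
```

Readers and writers must agree on the registry's contents. If a key is added to `hot_keys` after data has already been written under the plain key, reads also need to handle that older value, which is part of the tradeoff to weigh.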
📚 Further Reading & Related Topics
If you’re exploring hotspot mitigation and performance tuning in distributed systems, these related articles will provide deeper insights:
• How Does Partitioning Work When Requests Are Being Routed? – Learn how partitioning strategies can help distribute workloads more evenly and prevent hotspots.
• Latency Optimization Techniques: Unlocking Performance with Lock-Free Programming, Memory Barriers, and Efficient Data Structures – Discover advanced techniques for optimizing system performance and reducing contention in high-load environments.








