What to Consider with Secondary Indexes and Partitioning

The partitioning schemes covered in the previous blog posts rely on a key-value data model.

If records are only ever accessed via a primary key…

  • We can determine the partition from that key ✅
  • And use it to route read and write requests to the partition responsible for that key 👍
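
To make the routing idea concrete, here is a minimal sketch. It assumes a fixed number of partitions and a hash-of-key scheme; the names partition_for, put and get are hypothetical and only there for illustration, not taken from any particular database.

```python
import hashlib

NUM_PARTITIONS = 4  # assumption: a small, fixed partition count

def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a primary key to a partition number using a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

# One dict per partition stands in for a storage node.
partitions = [dict() for _ in range(NUM_PARTITIONS)]

def put(key: str, value) -> None:
    # Writes go only to the partition that owns the key.
    partitions[partition_for(key)][key] = value

def get(key: str):
    # Reads go to that same partition; no other partition is contacted.
    return partitions[partition_for(key)].get(key)

put("user1", {"name": "Ada"})
```

Because the partition is derived from the key alone, a read for "user1" only ever touches one partition.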

What are secondary indexes?

The situation becomes more complicated when secondary indexes are involved… 🤯

  • 👉 A secondary index usually does not identify a record uniquely
  • It is a way of searching for occurrences of a particular value ✅
    • Examples
      • Find all events for user1
      • Find all articles containing the word “Warning”
      • Find all cars that are red
      • Etc.
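
As a rough sketch of the idea, a secondary index can be thought of as a map from a field value to the set of primary keys that have that value. The cars data and the build_index helper below are made up for illustration.

```python
from collections import defaultdict

# Hypothetical records, keyed by primary key.
cars = {
    "car:1": {"make": "Honda", "color": "red"},
    "car:2": {"make": "Ford", "color": "silver"},
    "car:3": {"make": "Audi", "color": "red"},
}

def build_index(records: dict, field: str) -> dict:
    """Build a secondary index: field value -> set of primary keys."""
    index = defaultdict(set)
    for primary_key, record in records.items():
        if field in record:
            index[record[field]].add(primary_key)
    return index

color_index = build_index(cars, "color")

# "Find all cars that are red": one value, several records.
red_cars = sorted(color_index["red"])  # ['car:1', 'car:3']
```

Note that "red" points at two records, which is exactly why a secondary index does not identify a record uniquely.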

“Secondary indexes are the bread and butter of relational databases”

Designing Data-Intensive Applications – Martin Kleppmann

They are common in document databases too.

❌ Many key-value stores, such as HBase and Voldemort, have avoided secondary indexes because of the added implementation complexity.

✅ But some, such as Riak, do implement them because they are useful for data modelling.

The challenge with secondary indexes and partitioning…

The problem with secondary indexes is that they do not map neatly to partitions. 👎
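
A small sketch shows why. It assumes records are partitioned by primary key using a simple modulo scheme (an assumption chosen only to keep the example easy to follow), and the data is made up. The records that share a secondary value end up scattered across partitions.

```python
NUM_PARTITIONS = 3  # assumption: three partitions

def partition_for(car_id: int) -> int:
    # Partition by primary key only; the colour plays no part in placement.
    return car_id % NUM_PARTITIONS

# Hypothetical data: primary key -> colour.
cars = {1: "red", 2: "blue", 3: "silver", 4: "blue", 5: "red", 6: "red"}

partitions = [dict() for _ in range(NUM_PARTITIONS)]
for car_id, color in cars.items():
    partitions[partition_for(car_id)][car_id] = color

def partitions_holding(color: str) -> list:
    """Return the partitions that contain at least one record with this colour."""
    return sorted(
        number
        for number, records in enumerate(partitions)
        if color in records.values()
    )

red_partitions = partitions_holding("red")  # [0, 1, 2]
```

Here the red cars live on every partition, so the key alone cannot tell us where to send a "find all red cars" query.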

There are two main approaches to partitioning a database with secondary indexes:

  • Document-based partitioning
  • Term-based partitioning

These approaches will be covered individually in the upcoming blog posts.

📚 Further Reading & Related Topics

If you’re exploring secondary indexes and partitioning in databases, these related articles will provide deeper insights:

• Understanding Partitioning Proportional to Nodes – Learn how partitioning strategies work in distributed systems and how they impact data access when using secondary indexes.

• What Is Consistent Hashing? – Explore how consistent hashing is used in partitioning and indexing strategies to ensure efficient data distribution and retrieval in large-scale systems.


I’m Sean

Welcome to the Scalable Human blog. Just a software engineer writing about algo trading, AI, and books. I learn in public, use AI tools extensively, and share what works. Educational purposes only – not financial advice.
