Partitioning Secondary Indexes by Term – What is it?

Rather than each partition having its own secondary index (a local index), we can construct a global index that covers data in all partitions.

  • 👉 However, we cannot store that index on a single node!
  • 👉 Since it would likely become a bottleneck and defeat the purpose of partitioning!

Global Index and term partitioning?

A global index must also be partitioned, but it can be partitioned differently from the primary key index.

Here is how this might work…

  • Red cars from all partitions appear under the colour red in the index 🚗 🚗 🚗
  • But the index is partitioned so that colours starting with
    • The letters A to R appear in partition 0
    • The letters S to Z appear in partition 1
  • The index on the make of car is partitioned similarly…
    • With the partition boundary being between F and H
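
The colour example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `COLOUR_BOUNDARIES`, `partition_for_term` and `index_document`, and the document IDs, are made up for this post and do not come from any particular database.

```python
from bisect import bisect_right

# Assumed boundary: terms sorting before "s" go to partition 0, the rest to partition 1.
COLOUR_BOUNDARIES = ["s"]

def partition_for_term(term, boundaries):
    """Return the number of the index partition that owns this term."""
    return bisect_right(boundaries, term.lower())

# Each index partition maps a term to the IDs of all matching documents,
# no matter which data partition those documents live on.
index_partitions = [{}, {}]

def index_document(doc_id, colour):
    p = partition_for_term(colour, COLOUR_BOUNDARIES)
    index_partitions[p].setdefault(colour, []).append(doc_id)

# Hypothetical cars spread across different data partitions.
index_document(191, "red")
index_document(306, "red")
index_document(515, "silver")
index_document(768, "red")
```

All the red cars end up under one entry in index partition 0, while silver lands in index partition 1.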

We call this kind of index term-partitioned.

  • Because the term we are looking for determines the partition of the index
  • Here the term would be the colour red for example
  • The name term comes from full text indexes
    • A particular kind of secondary index
    • Where the terms are all the words that occur in a document

How to apply term partitioning?

As before, we can partition the index by the term itself or by a hash of the term.

  • Partitioning by the term itself can be useful for range scans
    • For example, on a numeric property such as the asking price of the car
  • Whereas partitioning on the hash of the term gives a more even distribution of load (as explained in the earlier blog on hashing)
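
Here is a minimal sketch of the hash option, assuming four index partitions and using MD5 purely as a stable hash (Python's built-in `hash()` for strings is randomised per process, so it would not give a consistent assignment). The name `partition_by_hash` and the value of `NUM_PARTITIONS` are assumptions for illustration.

```python
import hashlib

NUM_PARTITIONS = 4  # assumed number of index partitions

def partition_by_hash(term, num_partitions=NUM_PARTITIONS):
    """Assign a term to an index partition using a stable hash of the term."""
    digest = hashlib.md5(term.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and reduce it to a partition number.
    return int.from_bytes(digest[:8], "big") % num_partitions

for term in ["red", "silver", "black"]:
    print(term, "->", partition_by_hash(term))
```

Terms that sort next to each other can land on unrelated partitions here, which is what spreads the load but also what makes range scans expensive.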

Term partitioning and range scans

Partitioning by the term itself can be useful for range scans…

  • For example
    • On a numeric property such as the asking price of the car

Whereas partitioning on the hash of the term gives a more even distribution of load…
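
To make the range scan point concrete, here is a rough sketch with an index on asking price that is partitioned by the term itself. The boundaries in `PRICE_BOUNDARIES` and the function names are assumed values for illustration only.

```python
from bisect import bisect_right

# Assumed boundaries giving 4 index partitions:
# below 5000, 5000 to 9999, 10000 to 19999, and 20000 or more.
PRICE_BOUNDARIES = [5000, 10000, 20000]

def partition_for_price(price):
    """Return the index partition that owns this asking price."""
    return bisect_right(PRICE_BOUNDARIES, price)

def partitions_for_range(low, high):
    """Return the index partitions to query for low <= price <= high."""
    return list(range(partition_for_price(low), partition_for_price(high) + 1))

print(partitions_for_range(6000, 12000))  # a contiguous run of partitions
```

Because prices are kept in sorted order, a range query only touches the neighbouring partitions that overlap the range. With hash partitioning, matching prices could sit on any partition, so the same query would have to ask all of them.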

Global term partition index vs Document partition index

The advantage of a global term-partitioned index over a document-partitioned index…

  • ✅ Is that it can make reads more efficient!
  • ✅ Rather than doing scatter/gather over all partitions

The client only needs to make a request to the partition containing the term that it wants…
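
The difference on the read path can be sketched by counting how many partitions each lookup has to contact. The data, the `find_local` and `find_global` helpers and the A to R / S to Z split below are hypothetical and only mirror the colour example from earlier in this post.

```python
# Document-partitioned (local) indexes: one index per data partition.
local_indexes = [
    {"red": [191, 306]},               # data partition 0
    {"red": [768], "silver": [515]},   # data partition 1
    {"black": [893]},                  # data partition 2
]

def find_local(colour):
    """Scatter/gather: every partition must be asked."""
    results, contacted = [], 0
    for index in local_indexes:
        contacted += 1
        results.extend(index.get(colour, []))
    return results, contacted

# Term-partitioned (global) index: colours A to R in partition 0, S to Z in partition 1.
global_index = [
    {"black": [893], "red": [191, 306, 768]},
    {"silver": [515]},
]

def find_global(colour):
    """Only the partition that owns the term is asked."""
    p = 0 if colour[0].lower() < "s" else 1
    return global_index[p].get(colour, []), 1

print(find_local("red"))
print(find_global("red"))
```

Both lookups return the same document IDs, but the local version contacts every partition while the global version contacts one.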

However, the downside of a global index is that…

  • ❌ Writes are slower and more complicated
  • ❌ Because a write to a single document may now affect multiple partitions of the index
  • ❌ Every term in the document might be on a different partition, on a different node! 🤯
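
A small sketch of that write fan-out, assuming each node holds one partition of the colour index and one partition of the make index, with the same boundaries as the earlier example. The helper names and the sample cars are made up for illustration.

```python
def colour_partition(colour):
    # Assumed split: A to R in partition 0, S to Z in partition 1.
    return 0 if colour[0].lower() < "s" else 1

def make_partition(make):
    # Assumed split: boundary between F and H.
    return 0 if make[0].lower() < "g" else 1

def index_partitions_touched(car):
    """Return the set of index partitions one document write must update."""
    return {colour_partition(car["colour"]), make_partition(car["make"])}

silver_audi = {"id": 893, "colour": "silver", "make": "Audi"}
red_ford = {"id": 191, "colour": "red", "make": "Ford"}

print(index_partitions_touched(silver_audi))  # two partitions
print(index_partitions_touched(red_ford))     # one partition
```

Writing the silver Audi touches both index partitions even though the document itself lives in just one place, and the more indexed fields a document has, the wider that fan-out can get.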

In an ideal world the index would always be up to date…

  • And every document written to the database would immediately be reflected in the index.
  • 🤔 However, in a term-partitioned index, that would require a distributed transaction across all partitions affected by a write
    • ❌ Which is not supported by all databases!

Asynchronous global term partitioning

In practice, updates to global secondary indexes are often asynchronous.

⚠️ This means if you read the index shortly after a write, the change you just made may not be reflected in the index…

For example:

  • Amazon DynamoDB
    • States that its global secondary indexes are updated within a fraction of a second in normal circumstances…
    • But may experience longer propagation delays in case of faults in the infrastructure
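
The stale read described above can be shown with a toy model. This is not how DynamoDB or any real database implements it: `AsyncGlobalIndex` is a hypothetical class where a queue stands in for the background propagation of index updates.

```python
from collections import deque

class AsyncGlobalIndex:
    """Toy global index whose updates are applied some time after the write."""

    def __init__(self):
        self.index = {}          # term -> list of document IDs
        self.pending = deque()   # index updates that have not been applied yet

    def write(self, doc_id, term):
        # The write returns straight away; the index update is only queued.
        self.pending.append((term, doc_id))

    def apply_pending(self):
        # Stands in for the asynchronous propagation step.
        while self.pending:
            term, doc_id = self.pending.popleft()
            self.index.setdefault(term, []).append(doc_id)

    def lookup(self, term):
        return list(self.index.get(term, []))

idx = AsyncGlobalIndex()
idx.write(893, "red")
stale = idx.lookup("red")   # read before propagation
idx.apply_pending()
fresh = idx.lookup("red")   # read after propagation
print(stale, fresh)
```

The first lookup misses the car that was just written, and the second one sees it once the queued update has been applied.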

Other uses of global term-partitioned indexes:

  • Riak's search feature
  • The Oracle data warehouse
    • Which lets you choose between local and global indexing

📚 Further Reading & Related Topics

If you’re exploring partitioning and secondary indexes by term, these related articles will provide deeper insights:

• Understanding Partitioning Proportional to Nodes – Learn how partitioning strategies based on terms differ from proportional partitioning, and how they impact data distribution and access.

• How Does Partitioning Work When Requests Are Being Routed? – Explore how term-based partitioning and request routing influence system performance and data retrieval efficiency in distributed systems.
