Rather than each partition having its own secondary index (a local index), we can construct a global index that covers data in all partitions.
- 👉 However, we cannot store that index on just one node!
- 👉 It would likely become a bottleneck and defeat the purpose of partitioning!
Global Index and term partitioning?
A global index must also be partitioned, but it can be partitioned differently from the primary key index.
Here is how this might work…
- Red cars from all partitions appear under the colour red in the index 🚗 🚗 🚗
- But the index is partitioned so that colours starting with…
- The letters A to R appear in partition 0
- The letters S to Z appear in partition 1
- The index on the make of car is partitioned similarly…
- With the partition boundary falling between F and H
We call this kind of index term-partitioned.
- Because the term we are looking for determines the partition of the index
- Here the term would be the colour red for example
- The name term comes from full text indexes
- A particular kind of secondary index
- Where the terms are all the words that occur in a document
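To make the colour example concrete, here is a minimal sketch of a term-partitioned global index. The car records, the document IDs and the helper names (`colour_index_partition`, `build_global_colour_index`) are all made up for illustration; only the A to R / S to Z split comes from the example above.

```python
from collections import defaultdict

# Hypothetical car records, spread over two data partitions by primary key.
DATA_PARTITIONS = {
    0: {191: {"colour": "red", "make": "Honda"},
        214: {"colour": "black", "make": "Dodge"},
        306: {"colour": "red", "make": "Ford"}},
    1: {515: {"colour": "silver", "make": "Ford"},
        768: {"colour": "red", "make": "Volvo"},
        893: {"colour": "silver", "make": "Audi"}},
}

def colour_index_partition(colour):
    """Colours starting with a to r live in index partition 0, s to z in partition 1."""
    return 0 if colour[0].lower() <= "r" else 1

def build_global_colour_index(data_partitions):
    """Build one index over ALL data partitions, split by the term (the colour)."""
    index = {0: defaultdict(list), 1: defaultdict(list)}
    for docs in data_partitions.values():
        for doc_id, doc in docs.items():
            colour = doc["colour"]
            index[colour_index_partition(colour)][colour].append(doc_id)
    return index

GLOBAL_INDEX = build_global_colour_index(DATA_PARTITIONS)

# Red cars from both data partitions end up under one entry in index partition 0.
print(GLOBAL_INDEX[colour_index_partition("red")]["red"])  # [191, 306, 768]
```

Note that the index partition is chosen by the term alone, so it has nothing to do with which data partition a car is stored in.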
How to apply term partitioning?
As before, we can partition the index by the term itself or by a hash of the term.
- Partitioning by the term itself can be useful for range scans
- For example, on a numeric property such as the asking price of the car
- Whereas partitioning on the hash of the term gives a more even distribution of load (as explained in the earlier blog on hashing)
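The two options can be sketched as two small routing functions. This is only an illustration: the function names and the boundary list are assumptions, and MD5 is used simply because it gives a hash that is stable across processes, not because any particular database uses it.

```python
import bisect
import hashlib

def range_partition(term, boundaries):
    """Partition by the term itself.

    `boundaries` is a sorted list of split points; a term equal to a split
    point goes to the partition on its right.
    """
    return bisect.bisect_right(boundaries, term)

def hash_partition(term, num_partitions):
    """Partition by a stable hash of the term (MD5 chosen only for illustration)."""
    digest = hashlib.md5(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

# Range partitioning keeps neighbouring terms together...
print(range_partition("red", ["s"]), range_partition("silver", ["s"]))  # 0 1
# ...while hash partitioning deliberately scatters them.
print(hash_partition("red", 4), hash_partition("silver", 4))
```

Python's built-in `hash()` is avoided here because string hashes are randomised per process, which would send the same term to different partitions on different nodes.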
Term partitioning and range scans
Partitioning by the term itself can be useful for range scans…
- For example
- On a numeric property such as the asking price of the car
Whereas partitioning on the hash of the term gives a more even distribution of load, but a range scan then has to query every index partition…
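As a rough sketch of why range partitioning helps here, the function below works out which index partitions a price range scan has to visit. The split points in `PRICE_BOUNDARIES` and the function name are hypothetical.

```python
import bisect

# Hypothetical split points for the asking-price index, giving 4 index partitions:
# [..10000), [10000..20000), [20000..30000), [30000..)
PRICE_BOUNDARIES = [10_000, 20_000, 30_000]

def partitions_for_price_range(low, high, boundaries=PRICE_BOUNDARIES):
    """Index partitions a scan over low <= price <= high must visit
    when the index is partitioned by the price itself."""
    first = bisect.bisect_right(boundaries, low)
    last = bisect.bisect_right(boundaries, high)
    return list(range(first, last + 1))

print(partitions_for_price_range(12_000, 18_000))  # [1]
print(partitions_for_price_range(5_000, 25_000))   # [0, 1, 2]
```

With hash partitioning there is no such shortcut: adjacent prices hash to unrelated partitions, so the same scan would have to ask all of them.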
Global term partition index vs Document partition index
The advantage of a global term-partitioned index over a document-partitioned index…
- ✅ Is that it can make reads more efficient!
- ✅ Rather than doing scatter/gather over all partitions
The client only needs to make a request to the partition containing the term that it wants…
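The difference in read cost can be sketched by counting requests. The index contents and function names below are made up for illustration; each loop iteration stands in for one network request to a partition.

```python
# Document-partitioned: one local index per data partition (hypothetical contents).
LOCAL_INDEXES = [
    {"red": [191, 306], "black": [214]},
    {"red": [768], "silver": [515, 893]},
]

# Term-partitioned: one global index, split by the first letter of the colour.
GLOBAL_INDEX = {
    0: {"red": [191, 306, 768], "black": [214]},
    1: {"silver": [515, 893]},
}

def partition_for_term(colour):
    return 0 if colour[0].lower() <= "r" else 1

def read_document_partitioned(local_indexes, colour):
    """Scatter/gather: ask every partition's local index, then merge the results."""
    requests, result = 0, []
    for local_index in local_indexes:
        requests += 1
        result.extend(local_index.get(colour, []))
    return sorted(result), requests

def read_term_partitioned(global_index, colour):
    """Ask only the single index partition that owns the term."""
    partition = partition_for_term(colour)
    return sorted(global_index[partition].get(colour, [])), 1

print(read_document_partitioned(LOCAL_INDEXES, "red"))  # ([191, 306, 768], 2)
print(read_term_partitioned(GLOBAL_INDEX, "red"))       # ([191, 306, 768], 1)
```

Both reads return the same cars, but the scatter/gather request count grows with the number of partitions while the term-partitioned read stays at one.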
However, the downside of a global index is that…
- ❌ Writes are slower and more complicated
- ❌ Because a write to a single document may now affect multiple partitions of the index.
- ❌ Every term in the document might be on a different partition, on a different node! 🤯
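To see how one write fans out, the sketch below lists the index partitions that a single car document touches. It assumes the colour split from earlier (A to R, S to Z) and treats the make boundary as "A to F in partition 0, everything after in partition 1"; the function names are hypothetical.

```python
from collections import defaultdict

def colour_partition(colour):
    return 0 if colour[0].lower() <= "r" else 1

def make_partition(make):
    # Assumption: makes starting with a to f go to partition 0, the rest to partition 1.
    return 0 if make[0].lower() <= "f" else 1

def index_updates_for_write(doc):
    """Group the index entries that one document write must update, by index partition."""
    updates = defaultdict(list)
    updates[colour_partition(doc["colour"])].append(f"colour:{doc['colour']}")
    updates[make_partition(doc["make"])].append(f"make:{doc['make']}")
    return dict(updates)

# A silver Audi touches both index partitions: colour:silver in 1, make:Audi in 0.
print(index_updates_for_write({"colour": "silver", "make": "Audi"}))
# A red Ford happens to land entirely in partition 0.
print(index_updates_for_write({"colour": "red", "make": "Ford"}))
```

With a local (document-partitioned) index, both entries would instead be written on the same partition as the document itself.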
In an ideal world the index would always be up to date…
- And every document written to the database would immediately be reflected in the index.
- 🤔 However, in a term-partitioned index, that would require a distributed transaction across all partitions affected by a write
- ❌ Which is not supported by all databases!
Asynchronous global term partitioning
In practice, updates to global secondary indexes are often asynchronous.
⚠️ This means that if you read the index shortly after a write, the change you just made may not yet be reflected in the index…
For example:
- Amazon DynamoDB
- States that its global secondary indexes are updated within a fraction of a second in normal circumstances…
- But may experience longer propagation delays when there are faults in the infrastructure
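The read-after-write gap can be illustrated with a toy in-memory store. This is not DynamoDB's API or implementation, just a sketch in which the class name, the method names and the explicit `apply_pending_index_updates` step are all invented to stand in for the background propagation.

```python
from collections import defaultdict, deque

class AsyncIndexedStore:
    """Toy store: writes reach the primary data at once, index updates are queued.

    Only inserts are modelled; changing or deleting a document is out of scope.
    """

    def __init__(self):
        self.docs = {}
        self.colour_index = defaultdict(set)
        self.pending = deque()  # index updates waiting to be applied

    def write(self, doc_id, doc):
        self.docs[doc_id] = doc
        self.pending.append((doc_id, doc["colour"]))  # index is updated later

    def apply_pending_index_updates(self):
        """Stands in for the asynchronous propagation to the index partitions."""
        while self.pending:
            doc_id, colour = self.pending.popleft()
            self.colour_index[colour].add(doc_id)

    def find_by_colour(self, colour):
        return sorted(self.colour_index.get(colour, set()))

store = AsyncIndexedStore()
store.write(191, {"colour": "red", "make": "Honda"})
print(store.find_by_colour("red"))   # [] : the index has not caught up yet
store.apply_pending_index_updates()
print(store.find_by_colour("red"))   # [191]
```

The document is readable by primary key straight after the write; only lookups through the secondary index lag behind.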
Other uses of global term partition indexes: