What are the Limitations of Quorum Consistency?

Quorum Variables (before diving into this subject)

  • W = Minimum write nodes
  • R = Minimum read nodes
  • N = Nodes in quorum group

(W + R) > N

If you have N replicas and you choose W and R such as (W+R) > N you will generally expect every read to return the most recent value written.

  • This is the case because the set nodes for once you have written and the set of nodes for which you have read must overlap.
  • That is among the nodes you read, there must be at least one node with the latest value
  • Often R and W are chosen to be the majority of nodes
    • More than N over 2
      • This shows the quorum condition is met
      • Whilst still tolerating N over 2 node failures

Pros

  • Less likely to incur stale reads (more accurate reads)
  • Utilising more nodes (not exclusive to this) will allow you to withstand node outages

Cons

  • Can impact high availability as the system requires more successful responses from W and therefore we can expect increase latency as there are more responses to be confirmed
  • Edge cases may occur still where we find stale reads
(R + W) > N doodle also show node outages

(W + R) < N

But Quorums are not simply majorities, it is only matters if the sets of nodes used by read write operations overlap at least one node.

  • Other quorum assignments are possible…
    • That allow flexibility on the design of distributed algorithms

You may even want W and R to be smaller numbers!

  • (W+R) < N
    • In which case the quorum condition is not satisfied
    • Here read and writes will still be set to N nodes
      • But a smaller number of successful responses is required for the operations to succeed
      • With smaller W and R you are more likely to read stale values
      • Because it is more likely that your read did not include the node with the latest value
      • On the upside this configuration allows lower latency and higher availability
        • If there is a network interruption and many replicas become unreachable, there is a higher chance you can continue to process reads and writes
        • Only after the number of reachable replicas falls below W or R does the database become unavailable for writing and reading respectively

Pros

  • Lower latency
  • Higher availability

Cons

  • Stale reads are more likely
(R + W) < N Doodle

Edge Cases

These depend on the implementations.

  • Possible scenarios
    • If a sloppy quorum is used
    • The W writes may end up on different nodes than the R reads
    • So there is no longer a guarantee the overlap between R nodes and W nodes
    • If there are two writes that happen concurrently
      • It is not clear which one happens first
      • The only safe solution is to merge the concurrent writes
      • If a winner is picked on the timestamp (the last write wins)
      • Writes can be lost due to clock skew
    • If write happens concurrently with a read, a write may be reflected on the only some of the replicas
      • In this case it is undetermined whether the read returns the old or new value
      • If a write succeeded on some replicas but failed on others
        • Example because the discs on some nodes are full and the overall succeed on fewer than W replicas it is not rolled back on the replicas where it succeeded
          • This means if a write is reported as failed
          • Subsequent reads may or may not return that value from the write
        • If a node carrying a new value fails and its data is restored from an old replica carrying a value
          • The number of replicas storing new value may fall below W breaking the quorum condition
          • Even if everything is working correctly, there are edge cases where you can get unlucky with timing

Final note

Thus, although quorums may appear to guarantee that reads return the latest written value…

  • In practice it is not so simple
  • Dynamo styled databases are generally optimised for uses cases to tolerate eventual consistency
  • The parameters W and R allow you to adjust the probability of value being read
    • But do not take them as an absolute guarantee
    • In particular we do not usually get the guarantees problems with replication lag
    • So the previous mentioned anomalies
      • Reading then writes
      • Monatonic reads
      • Consistent prefix reads
    • Can still occur in applications…
    • Stronger guarantees generally require transactions and consensus!

One thought on “What are the Limitations of Quorum Consistency?

Leave a comment